Sources: arXiv:2410.01309 · alphaXiv overview
Core contribution
This paper shows that model descriptions can waste bits by redundantly encoding weight-space choices that are equivalent up to symmetry: many distinct weight settings implement the same function. By applying bits-back coding to the rotational symmetries exposed by SliceGPT-style preprocessing, it recovers a few percent of model size essentially "for free," without changing the model's outputs.
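A minimal sketch of why rotations are redundant description choices (a toy linear map, not the paper's actual pipeline): for any orthogonal matrix Q, rotating one weight matrix by Q and the next by its transpose yields a different weight file that computes exactly the same function.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer linear network: f(x) = x @ W1 @ W2.
d = 8
W1 = rng.standard_normal((d, d))
W2 = rng.standard_normal((d, d))
x = rng.standard_normal((3, d))

# Any orthogonal Q gives an equivalent description of the same function,
# because Q @ Q.T = I cancels between the layers.
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))  # random orthogonal matrix

y_original = x @ W1 @ W2
y_rotated = x @ (W1 @ Q) @ (Q.T @ W2)  # different stored weights, same map

assert np.allclose(y_original, y_rotated)
```

Every choice of Q is a different byte string for the same model; a naive format spends bits pinning down that arbitrary choice.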
Why this matters for Parameter Golf
This is unusually valuable because it changes the question from "how do we distort the model less?" to "which bits are redundant even before distortion starts?" Under a hard artifact cap, a free 3–5% size reduction can be serious margin.
What to import
- Symmetry is a storage opportunity.
- Some bytes are redundant because the model has multiple equivalent descriptions.
- Training-free savings matter when margins are thin.
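The "multiple equivalent descriptions" point is what bits-back exploits: the free choice among equivalent weight files can itself carry information. Here is a toy version using a simpler discrete symmetry (per-unit sign flips in a tanh network, one redundant bit per hidden unit) rather than the paper's continuous rotations; the canonicalization and decoding scheme below are illustrative assumptions, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(1)
d_in, d_hid, d_out = 4, 8, 2
W1 = rng.standard_normal((d_in, d_hid))
W2 = rng.standard_normal((d_hid, d_out))
x = rng.standard_normal((5, d_in))

def f(W1, W2, x):
    return np.tanh(x @ W1) @ W2   # tanh is odd: tanh(-z) = -tanh(z)

# Canonical form: make each hidden unit's first incoming weight positive.
c = np.sign(W1[0])
W1c, W2c = W1 * c, W2 * c[:, None]   # same function, canonical signs

# Bits-back: re-spend the d_hid redundant sign choices on a message.
message = rng.integers(0, 2, size=d_hid)
s = 1 - 2 * message                   # map {0,1} -> {+1,-1}
W1e, W2e = W1c * s, W2c * s[:, None]  # flip unit j's in/out weights

assert np.allclose(f(W1, W2, x), f(W1e, W2e, x))  # function unchanged
decoded = (W1e[0] < 0).astype(int)    # deviation from canonical = bit
assert np.array_equal(decoded, message)
```

The network's behavior is identical either way, so the 8 message bits ride along at zero distortion cost; that is the "redundant description bits" the paper recovers at much larger scale.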
What not to over-import
The demonstrated gains are modest, and the method depends on a specific symmetry-exposing setup. It is not proof that all compact artifacts have huge hidden symmetry reserves. The lasting lesson is the storage mindset: some bytes can be recovered without changing the model’s function at all.
Best synthesis links
- Gives concrete support to Symmetry-transport weights.
- Extends Rate-distortion for artifact caps with a “zero-distortion savings” angle.
- Suggests an interesting complement to quantization: remove redundant description bits before fighting over distortion bits.
Parameter Golf translation
This suggests a practical new question:
- what parts of our artifact are genuinely informative,
- and what parts are only one arbitrary coordinate choice among equivalent ones?
That question may be especially powerful in shared or transformed architectures.