Sources: arXiv:2503.13089 · alphaXiv overview
Core contribution
ClusComp reframes model compression around clustering and shared representatives rather than only scalar bit-width reduction. The paper’s most useful claim for this garden is that modern LLMs are increasingly hard to quantize because outliers dominate error, and a clustering-style representation can preserve more of the salient structure than uniform quantization.
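The contrast the paper draws can be sketched in a few lines. This is not ClusComp's actual method (which clusters weight blocks and couples compression with finetuning); it is a minimal 1-D toy showing why shared representatives from clustering beat equal-width scalar bins when a few outliers stretch the dynamic range. All names and the toy weight vector are illustrative.

```python
import numpy as np

def uniform_quantize(w, n_levels):
    """Scalar uniform quantization: equal-width bins spanning [min, max]."""
    lo, hi = w.min(), w.max()
    step = (hi - lo) / (n_levels - 1)
    return lo + np.round((w - lo) / step) * step

def cluster_quantize(w, n_levels, iters=50):
    """Non-uniform quantization via 1-D Lloyd/k-means: each weight is
    replaced by its cluster's shared representative (the centroid)."""
    # initialize centroids at quantiles so they track the weight density
    centroids = np.quantile(w, np.linspace(0.0, 1.0, n_levels))
    for _ in range(iters):
        assign = np.abs(w[:, None] - centroids[None, :]).argmin(axis=1)
        for k in range(n_levels):
            if (assign == k).any():
                centroids[k] = w[assign == k].mean()
    assign = np.abs(w[:, None] - centroids[None, :]).argmin(axis=1)
    return centroids[assign]

# toy heavy-tailed weight vector: a dense bulk plus a few large outliers
w = np.concatenate([np.linspace(-0.05, 0.05, 1000),
                    np.array([-8.0, -4.0, 4.0, 8.0])])

err_uniform = np.mean((w - uniform_quantize(w, 16)) ** 2)
err_cluster = np.mean((w - cluster_quantize(w, 16)) ** 2)
print(f"uniform 16-level MSE:   {err_uniform:.2e}")
print(f"clustered 16-level MSE: {err_cluster:.2e}")
```

The uniform grid wastes nearly all of its 16 levels covering the empty span between the bulk and the outliers, while the clustered codebook concentrates representatives where the weights actually live.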
Why this matters for Parameter Golf
ClusComp is unusually valuable because it bridges two lanes that are often treated separately: quantization and parameter sharing. The paper is nominally about compression, but the mechanism is also a form of structured reuse, which makes it relevant both to “how do we quantize?” and to “how much uniqueness do we really need?”
What to import
- Outliers are not a corner case. They can dominate compression failure.
- Shared representatives can be better than uniform scalar bins. Clustering buys expressivity by spending bits where the weight distribution has structure instead of spreading them evenly across the range.
- Compression and finetuning interact. The paper treats them as a continuous design problem rather than separate phases.
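The first bullet can be made concrete with the sparse-outlier trick this garden links to below: keep a small fraction of the largest-magnitude weights in full precision and quantize only the remaining bulk. A minimal numpy sketch under toy assumptions (the threshold, fraction, and weight vector are illustrative, not from the paper):

```python
import numpy as np

def uniform_quantize(w, n_levels):
    """Scalar uniform quantization: equal-width bins spanning [min, max]."""
    lo, hi = w.min(), w.max()
    step = (hi - lo) / (n_levels - 1)
    return lo + np.round((w - lo) / step) * step

def quantize_keep_outliers(w, n_levels, outlier_frac=0.005):
    """Keep the largest-magnitude weights exact (a sparse side table)
    and uniformly quantize only the remaining bulk."""
    k = max(1, int(round(len(w) * outlier_frac)))
    keep = np.zeros(len(w), dtype=bool)
    keep[np.argsort(np.abs(w))[-k:]] = True   # indices of the outliers
    out = w.astype(float).copy()
    out[~keep] = uniform_quantize(w[~keep], n_levels)
    return out

w = np.concatenate([np.linspace(-0.05, 0.05, 1000),
                    np.array([-8.0, -4.0, 4.0, 8.0])])

err_naive = np.mean((w - uniform_quantize(w, 16)) ** 2)
err_outlier = np.mean((w - quantize_keep_outliers(w, 16)) ** 2)
print(f"quantize everything: {err_naive:.2e}")
print(f"preserve outliers:   {err_outlier:.2e}")
```

Removing a handful of outliers shrinks the bulk's quantization range by orders of magnitude, which is exactly the sense in which a minority of structure drives most of the damage.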
What not to over-import
ClusComp does not automatically imply that codebooks or cluster assignments are cheap enough for a hard 16 MB artifact budget. Nor does it prove that the cluster structure that helps standard inference will be easy to implement in a highly constrained submission format. The important import is the lens: non-uniformity and reuse often work best as a single abstraction rather than as separate tricks.
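Whether assignments and codebooks fit a tight artifact budget is mostly arithmetic, and worth doing before dismissing the idea. A back-of-envelope check with hypothetical sizes (not the paper's numbers; scales and zero-points are ignored for simplicity): per-weight cluster ids cost about the same as uniform low-bit storage, and the codebook itself is negligible at tensor scale.

```python
import math

def clustered_bytes(n_weights, n_clusters, rep_bits=16):
    """Bytes for a clustered tensor: one cluster id per weight
    plus a shared codebook of fp16 representatives."""
    id_bits = math.ceil(math.log2(n_clusters))
    return (n_weights * id_bits + n_clusters * rep_bits) / 8

def uniform_bytes(n_weights, bits):
    """Bytes for plain uniform quantization at a fixed bit-width."""
    return n_weights * bits / 8

n = 4096 * 4096  # one hypothetical 4096x4096 linear layer
print(clustered_bytes(n, 16))  # 4-bit ids plus a 16-entry codebook
print(uniform_bytes(n, 4))     # plain 4-bit uniform quantization
```

For a 16-entry codebook the overhead versus plain 4-bit storage is only the codebook itself (here 32 bytes for the whole layer), so the real budget question is the id bit-width and any per-group metadata, not the representatives.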
Best synthesis links
- Strengthens sparse outlier preservation by arguing that a minority of structure drives most of the damage.
- Sits naturally beside decoupled precision, since both reject “treat every parameter the same.”
- Offers a compression-side counterpart to Fine-grained Parameter Sharing, where structure and reuse matter more than naive tying.
Parameter Golf translation
This paper motivates experiments that ask:
- should some tensors be clustered or shared instead of merely quantized?
- can clustering logic identify the same sensitive subsets targeted by pQuant?
- when does structured reuse buy more than a slightly wider uniformly quantized model?
Related
- pQuant
- Additive Quantization
- Fine-grained Parameter Sharing
- Quantization and outliers
- Recursive and shared-parameter architectures
- Sparse outlier preservation
- Decoupled precision