Sources: arXiv:2505.03031 · alphaXiv overview
Core contribution
Radio reframes LLM quantization as an explicit rate-distortion optimization problem. Instead of picking one quantizer and then arguing about average bit-width, it asks how to allocate scarce bits so that the marginal reduction in distortion per extra bit is balanced across model components.
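That allocation rule can be sketched as a greedy loop: at each step, give one more bit to whichever component currently buys the largest distortion drop per extra bit, until the budget runs out. This is a minimal illustrative sketch, not Radio's actual algorithm; the quadratic distortion model (error shrinking roughly 4x per added bit, as for a uniform quantizer) and the `sensitivities` weights are assumptions for the example.

```python
def allocate_bits(sensitivities, total_bits, max_bits=8):
    """Greedy bit allocation: sensitivities[i] weights component i's
    quantization error; total_bits is the shared budget."""
    bits = [0] * len(sensitivities)

    def distortion(i, b):
        # Illustrative uniform-quantizer model: error ~ 2^(-2b) = 4^(-b).
        return sensitivities[i] * 4.0 ** (-b)

    for _ in range(total_bits):
        # Marginal gain of spending one more bit on each component.
        gains = [
            distortion(i, bits[i]) - distortion(i, bits[i] + 1)
            if bits[i] < max_bits else 0.0
            for i in range(len(bits))
        ]
        best = max(range(len(gains)), key=gains.__getitem__)
        if gains[best] <= 0:
            break
        bits[best] += 1
    return bits
```

With `allocate_bits([8.0, 1.0, 1.0], total_bits=6)` the sensitive first component ends up with more bits than the other two, which is the whole point: equal bit-widths are only optimal when sensitivities are equal.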
Why this matters for Parameter Golf
This is one of the cleanest papers for a hard-cap setting because Parameter Golf is already a rate-distortion problem in disguise. The artifact cap means the real question is not just “which quantizer is best?” but “which extra stored bytes recover the most post-roundtrip language-model quality?”
What to import
- Bit allocation should be explicit. A byte spent on one tensor is a byte not spent elsewhere.
- Average bit-width is too blunt. Two methods with the same nominal bits can have very different byte ROI once scales, grouping, and metadata are counted.
- Grouping overhead is part of the objective. Small groups may improve distortion but can lose once extra side information is counted.
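The second and third bullets reduce to one bookkeeping identity: the true storage cost per weight is the nominal bit-width plus the amortized side information. A small sketch, with assumed metadata sizes (fp16 scale, int8 zero point per group) purely for illustration:

```python
def effective_bits_per_weight(nominal_bits, group_size,
                              scale_bits=16, zero_point_bits=8):
    """Nominal bit-width plus per-group metadata amortized over the group.
    Metadata sizes here are assumptions, not universal constants."""
    overhead = (scale_bits + zero_point_bits) / group_size
    return nominal_bits + overhead
```

Under these assumptions, "4-bit" with group size 32 actually stores 4.75 bits per weight, while group size 128 stores about 4.19; a method that looks worse per-group can win on byte ROI once the side channel is counted.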
What not to over-import
Radio is still a quantization paper, not a full artifact-cap optimizer for every downstream codec. Its distortion proxy is useful, but it does not prove that the exact same ranking remains optimal after all packing, serialization, and challenge-specific execution constraints are applied.
Best synthesis links
- Deepens "Byte allocation beats average bit-width" by giving that frontier an information-theoretic frame.
- Pairs naturally with OWQ, which gives a concrete way to protect the most sensitive columns instead of distributing bits democratically.
- Connects to ReALLM, where the true object being budgeted is no longer only weight precision but also residual structure and decoder capacity.
Parameter Golf translation
The practical lesson is to rank candidate exception paths by recovered quality per stored byte:
- protected rows or columns
- codebook size
- group size
- residual low-rank terms
- any structured side channel
That framing is more likely to produce leaderboard-relevant decisions than comparing methods only by advertised low-bit precision.
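The ranking discipline above can be made mechanical: score every candidate exception path by recovered quality per stored byte, then fill the artifact cap greedily. A minimal sketch; the candidate names, quality gains, and byte costs below are hypothetical placeholders, not measurements from the paper.

```python
def pick_upgrades(candidates, byte_budget):
    """candidates: list of (name, quality_gain, extra_bytes).
    Greedily select by quality recovered per stored byte under the cap."""
    ranked = sorted(candidates, key=lambda c: c[1] / c[2], reverse=True)
    chosen, spent = [], 0
    for name, gain, cost in ranked:
        if spent + cost <= byte_budget:
            chosen.append(name)
            spent += cost
    return chosen

# Hypothetical exception paths with made-up (gain, bytes) numbers.
candidates = [
    ("protect 16 outlier columns", 0.40, 64_000),
    ("double codebook size",       0.15, 32_000),
    ("halve group size",           0.10, 96_000),
    ("rank-8 residual",            0.25, 48_000),
]
```

Greedy selection by ratio is only a heuristic (the exact problem is a knapsack), but it makes the trade-offs explicit: here "halve group size" loses not because it is useless, but because its bytes buy less than the alternatives.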
Related
- Byte allocation beats average bit-width
- Entropy-friendly model structure
- OWQ
- pQuant
- ReALLM
- Quantization and outliers