Sources: arXiv:2510.11234 · alphaXiv overview
Core contribution
Neural Weight Compression treats model weights as a learned-compression modality in their own right. It uses a neural codec with importance-aware rate-distortion training to compress entire LLM-scale weight sets, aiming to outperform handcrafted scalar and vector quantizers at practical mid-range bitrates.
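The paper's objective is not spelled out here, but an importance-aware rate-distortion training loss can be sketched as follows. This is a minimal illustration, not the paper's actual formulation: the function name, the per-weight `importance` and `bits` arrays, and the trade-off weight `lam` are all assumptions.

```python
import numpy as np

def importance_weighted_rd_loss(w, w_hat, bits, importance, lam=0.01):
    """Hypothetical RD objective: importance-weighted squared error on the
    weights plus a penalty on the total code length (the 'rate')."""
    distortion = np.sum(importance * (w - w_hat) ** 2)
    rate = np.sum(bits)  # bits spent encoding the compressed weights
    return distortion + lam * rate

# Toy example: two weights, the first considered twice as important
# (importance scores might come from Hessian or sensitivity estimates).
w = np.array([1.0, -0.5])
w_hat = np.array([0.9, -0.4])        # codec reconstruction
importance = np.array([2.0, 1.0])
bits = np.array([4.0, 2.0])          # per-weight code lengths
loss = importance_weighted_rd_loss(w, w_hat, bits, importance)
```

Raising `lam` pushes the codec toward fewer bits at the cost of more distortion; the importance vector shifts that distortion away from sensitive weights.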
Why this matters for Parameter Golf
This is exactly the kind of paper that stresses our current prior. It suggests we may be overcommitted to handcrafted quantization formats when a learned codec could model weight distributions more effectively — especially when the target is final storage, not just arithmetic simplicity.
What to import
- Weight compression can itself be a learned representation problem.
- Importance-aware quality allocation inside a codec matters.
- Mid-range bitrates may be where learned codecs first become truly competitive, not only in the extreme tiny-bit regime.
What not to over-import
A neural codec only helps if decode overhead, shared-model cost, and final artifact accounting still make sense in the challenge. Learned codecs are not automatically good artifacts; they have to amortize their own machinery.
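The amortization argument reduces to simple accounting. A sketch, with all names and numbers hypothetical: a codec pays off only when the bytes it saves exceed the decoder it has to ship, divided across however many artifacts share that decoder.

```python
def codec_pays_off(raw_bytes, compressed_bytes, decoder_bytes, shared_models=1):
    """Hypothetical break-even check: savings must exceed the decoder's
    size amortized over the models that share it."""
    saved = raw_bytes - compressed_bytes
    amortized_decoder = decoder_bytes / shared_models
    return saved > amortized_decoder

# Toy numbers in MB: a 750 MB decoder sinks a single model's 700 MB
# of savings, but amortizes fine across a fleet of ten.
solo = codec_pays_off(1000, 300, 750)
fleet = codec_pays_off(1000, 300, 750, shared_models=10)
```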
Best synthesis links
- Strongly reinforces Compilerized model artifacts.
- Pairs with ReALLM as evidence that model artifacts may evolve away from plain quantized tensors.
- Supports Entropy-friendly model structure and Rate-distortion for artifact caps.
Parameter Golf translation
The main takeaway is not necessarily “use a neural codec now.” It is:
- treat the checkpoint as compressible data, not sacred tensor layout
- optimize rate-distortion where the real storage object lives
- evaluate codec machinery by byte ROI after accounting for shared model overhead
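The byte-ROI evaluation in the last bullet can be made concrete. A minimal sketch, assuming a scalar quality metric and a known shared-overhead figure; the function name and parameters are illustrative, not from the paper or the challenge rules.

```python
def byte_roi(quality_delta, artifact_bytes, shared_overhead_bytes, shared_models=1):
    """Hypothetical byte-ROI metric: quality gained per byte actually
    attributable to this artifact, after amortizing shared machinery
    (codec, decoder, dictionaries) across the models that use it."""
    effective_bytes = artifact_bytes + shared_overhead_bytes / shared_models
    return quality_delta / effective_bytes

# Toy numbers: +2.0 quality points, a 100 MB artifact, and 400 MB of
# shared codec machinery split across four models.
roi = byte_roi(2.0, 100.0, 400.0, shared_models=4)
```

Ranking candidate formats by this kind of ratio, rather than by compressed size alone, is what "optimize where the real storage object lives" amounts to.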
Related
- ReALLM
- Radio
- Compilerized model artifacts
- Entropy-friendly model structure
- Rate-distortion for artifact caps