Sources: arXiv:2502.13179 · alphaXiv overview
Core contribution
PTQ1.61 aims to make true sub-2-bit post-training quantization practical by preserving salient structure with less overhead than earlier mixed-precision or exception-heavy methods. The key message is that post-training methods are not exempt from the outlier problem; they simply address it under different constraints.
Why this matters for Parameter Golf
This paper matters because it shows that the same structural story appears even when training is held fixed. That is useful for the knowledge garden: it means the case for selective preservation is not just a quantization-aware-training artifact. Even pure export pipelines live or die by how they isolate what the cheap path cannot carry.
What to import
- Sub-2-bit PTQ is possible only when saliency is handled intelligently.
- Metadata overhead (masks, indices, scales) is a first-class metric, not an afterthought.
- Preprocessing and representation choices can matter as much as the nominal bit-width.
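The metadata point above can be made concrete with back-of-the-envelope arithmetic. The specific numbers below (0.5% salient fraction, 8-bit protection, 16-bit sparse indices, a 1-bit bitmap) are illustrative assumptions, not figures from the paper:

```python
def effective_bits_per_param(base_bits, salient_frac, salient_bits,
                             meta_bits_per_salient=0.0, meta_bits_per_param=0.0):
    """Average storage cost per weight once protection metadata is counted."""
    return ((1 - salient_frac) * base_bits
            + salient_frac * (salient_bits + meta_bits_per_salient)
            + meta_bits_per_param)

# Protect 0.5% of weights at 8 bits on top of a 1-bit base format,
# under two bookkeeping schemes (both schemes are hypothetical):
sparse = effective_bits_per_param(1.0, 0.005, 8.0,
                                  meta_bits_per_salient=16.0)  # 16-bit index per exception
bitmap = effective_bits_per_param(1.0, 0.005, 8.0,
                                  meta_bits_per_param=1.0)     # 1-bit mask over every weight

print(f"sparse-index format: {sparse:.3f} bits/param")
print(f"bitmap format:       {bitmap:.3f} bits/param")
```

With these assumptions the sparse-index scheme lands near 1.1 bits/param, while the bitmap scheme costs about 2.0 bits/param: the same protection policy, but the bookkeeping choice alone decides whether the format stays sub-2-bit.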
What not to over-import
PTQ1.61 does not prove that any specific saliency heuristic will survive this challenge’s exact workload. It also does not eliminate the risk that protecting some structure helps a proxy benchmark more than the true target. The transferable lesson is that salient preservation must be byte-aware from the start.
Best synthesis links
- Supports sparse outlier preservation from the PTQ side rather than the training side.
- Complements MicroScopiQ on the systems implications of preserving structure.
- Sits between pQuant and QuaRot as a middle ground: protect saliency without rotating the whole representation or redesigning training.
Parameter Golf translation
PTQ1.61 suggests evaluating candidate export formats by asking:
- can the salient subset be identified cheaply?
- how much bookkeeping does protection require?
- is the resulting format still simpler and more size-efficient than widening the baseline model slightly?
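For the first question, cheap identification, here is a minimal sketch assuming weight magnitude as the saliency proxy (a common heuristic; the paper's actual criterion may differ):

```python
import numpy as np

def salient_mask(weights, frac=0.005):
    """One-pass magnitude heuristic: mark the top `frac` of weights by |w|.

    Cost is a single partial sort over the tensor, so identification
    stays cheap relative to the quantization pass itself.
    """
    k = max(1, int(frac * weights.size))
    # kth-largest absolute value becomes the protection threshold
    thresh = np.partition(np.abs(weights).ravel(), -k)[-k]
    return np.abs(weights) >= thresh

# Demo on a synthetic layer (hypothetical shape and salient fraction)
rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256))
mask = salient_mask(w, frac=0.01)
print(f"protected {mask.sum()} of {w.size} weights")
```

The mask itself then feeds directly into the second question: storing it as a bitmap costs one bit per weight, while storing explicit indices costs roughly `log2(w.size)` bits per protected weight, so the cheaper bookkeeping depends on how sparse the salient set is.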