Hypothesis

A small, explicitly protected set of high-error or high-sensitivity parameters should buy a disproportionately large improvement in compressed quality relative to its byte cost. (Liao et al., 2025; Zhang et al., 2026)

Why this is plausible

  • pQuant argues that extremely low-bit models fail when all parameters are treated too uniformly.
  • ClusComp shows outliers increasingly dominate quantization difficulty in newer LLMs.
  • Our challenge has hard storage limits but often some residual headroom, making a tiny high-precision side channel attractive.
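To make "tiny side channel" concrete, a back-of-envelope cost estimate (illustrative numbers, not taken from either paper): protecting 0.1% of a 1B-parameter model with fp16 values plus uint32 indices adds about 6 MB on top of a 4-bit base model.

```python
# Hypothetical byte-cost estimate for an fp16 side channel.
# Assumed sizes: 1B parameters, 0.1% protected, 4-bit base model.
n_params = 1_000_000_000
protected = int(0.001 * n_params)    # 1e6 protected weights
side_channel = protected * (2 + 4)   # fp16 value (2 B) + uint32 index (4 B)
base = n_params * 4 // 8             # 4-bit base model, in bytes
overhead = side_channel / base
print(f"side channel: {side_channel / 1e6:.1f} MB, overhead: {overhead:.2%}")
# → side channel: 6.0 MB, overhead: 1.20%
```

At roughly 1% overhead, the hypothesis only pays off if those protected weights recover more quality than spending the same bytes on a uniformly higher bit width.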

Candidate implementations

  • sparse fp16 residuals for the largest quantization errors
  • mixed-precision protection for selected tensors or rows
  • decoupled branch designs that keep most weights cheap but preserve a small sensitive subset
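The first candidate can be sketched in a few lines. This is a minimal illustration, not an implementation from either cited paper: symmetric 4-bit uniform quantization of one tensor, with the k largest-magnitude quantization errors stored as an fp16 residual side channel (values plus uint32 indices).

```python
# Sketch of the "sparse fp16 residuals" candidate (assumed scheme:
# symmetric uniform quantization; top-k errors kept at higher precision).
import numpy as np

def quantize_with_residuals(w, bits=4, k=16):
    # Symmetric uniform quantization to `bits` bits.
    scale = np.abs(w).max() / (2 ** (bits - 1) - 1)
    q = np.clip(np.round(w / scale), -(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    err = (w - q * scale).ravel()
    # Protect the k entries with the largest quantization error.
    idx = np.argsort(-np.abs(err))[:k]
    residuals = err[idx].astype(np.float16)
    return q.astype(np.int8), scale, idx.astype(np.uint32), residuals

def reconstruct(q, scale, idx, residuals):
    # Dequantize, then patch the protected entries with their residuals.
    w = (q.astype(np.float32) * scale).ravel()
    w[idx] += residuals.astype(np.float32)
    return w

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)
q, s, idx, res = quantize_with_residuals(w, bits=4, k=64)
base = np.abs(w.ravel() - (q.astype(np.float32) * s).ravel()).max()
fixed = np.abs(w.ravel() - reconstruct(q, s, idx, res)).max()
assert fixed < base  # patching the worst entries shrinks the max error
```

Note that for round-to-nearest quantization the errors are roughly uniform, so a small k mainly trims the tail; the hypothesis is that real sensitivity is far more concentrated than this toy example suggests.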

Risks

  • index and metadata overhead outweighs the quality gain
  • protected weights improve nominal loss but not the final artifact score
  • selection heuristics overfit to a narrow data or model regime

References

Liao, B., Herold, C., Hashemi, S. H., Vasilev, S., Khadivi, S., & Monz, C. (2025). ClusComp: A Simple Paradigm for Model Compression and Efficient Finetuning. arXiv Preprint arXiv:2503.13089. https://arxiv.org/abs/2503.13089
Zhang, W., Liu, B., Hu, Y., Bai, X., Zhang, W., & Cui, B. (2026). pQuant: Towards Effective Low-Bit Language Models via Decoupled Linear Quantization-Aware Training. arXiv Preprint arXiv:2602.22592. https://arxiv.org/abs/2602.22592