20 items with this tag.
frontiers
Frontier synthesis on the newest shift from handcrafted post-training formats toward training rules and learned representations that directly target compressible model weights.
papers
Paper note on integrating rate-constrained compression pressure directly into LLM training rather than treating compression only as a post-training step.
papers
Paper note on using a single learned neural codec to compress whole LLM-scale weight sets instead of relying only on handcrafted quantization formats.
papers
Paper note on making LLM training explicitly produce more low-rank, compressible weights by constraining Muon updates with a nuclear-norm budget.
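The nuclear-norm budget mentioned above can be illustrated with a minimal numpy sketch: projecting an update matrix onto a nuclear-norm ball by projecting its singular values onto an l1-ball (the standard Duchi-style projection). This is an assumption about the general technique, not a reproduction of the paper's Muon variant; the function names are hypothetical.

```python
import numpy as np

def project_l1_ball(v, tau):
    # Euclidean projection of a nonnegative vector v onto {x >= 0 : sum(x) <= tau}
    if v.sum() <= tau:
        return v
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u - (css - tau) / (np.arange(len(u)) + 1) > 0)[0][-1]
    theta = (css[rho] - tau) / (rho + 1)
    return np.maximum(v - theta, 0.0)

def project_nuclear_ball(update, tau):
    # Clip the update's singular values so its nuclear norm stays within budget tau;
    # repeatedly applying this to optimizer updates biases weights toward low rank.
    U, s, Vt = np.linalg.svd(update, full_matrices=False)
    return U @ np.diag(project_l1_ball(s, tau)) @ Vt
```

Projecting the singular spectrum rather than truncating it keeps the update direction while shrinking its effective rank gradually.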
papers
Paper note on exploiting weight-space symmetries with bits-back coding so some model bytes can be saved without changing predictions.
moonshots
Moonshot hypothesis that the model should be trained directly to become a good compressed artifact, not merely a good floating-point checkpoint.
moonshots
Moonshot hypothesis that the best compact artifact may store a tiny generator plus latent construction tape and sparse corrections instead of mostly storing raw weight tensors.
moonshots
Moonshot hypothesis that the shape of protected exceptions may matter more than the exact saliency ranking, because structured exception maps can compress better than irregular ones.
moonshots
Moonshot hypothesis that most vocabulary rows in the output head should be regenerated from compact descriptors and shared factors rather than stored directly.
moonshots
Moonshot hypothesis that many apparently different tensors could be stored as one canonical prototype plus cheap transport maps instead of as separate weights.
papers
Paper note on shrinking and retargeting the tokenizer and embedding table to a domain so the model uses fewer vocabulary bytes and shorter sequences.
papers
Paper note on applying rate-distortion theory directly to language-model compression instead of treating bit allocation as a heuristic afterthought.
papers
Paper note on compressing language-model matrices into residual low-rank structure plus a shared neural decoder over vector-quantized latent representations.
frontiers
Frontier synthesis on why repeated structure, clustered values, and regular exception patterns may matter more than nominal precision once the final artifact and metadata are counted.
hypotheses
Hypothesis that compressing or restructuring the LM head can beat modest backbone improvements in compact language models.
ideas
Idea that one small learned codebook bank, shared across repeated blocks, can beat per-matrix quantization by amortizing metadata and aligning compression with shared-depth structure.
lanes
The lane focused on reducing the gap between train-time weights and the final compressed artifact.
notes
Synthesis note on why vocabulary and output-projection choices can dominate compact-model tradeoffs earlier than expected.
papers
Paper note on AQLM and why codebook-style additive quantization becomes attractive once scalar low-bit methods start wasting error budget on the wrong directions.
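The additive-quantization idea behind AQLM can be sketched in a few lines: each weight group is reconstructed as a sum of one codeword from each of several codebooks, so the representable points are not a uniform grid. This toy version uses greedy per-codebook encoding (a stand-in for AQLM's beam search) and a reserved zero codeword so each extra codebook can only reduce the residual; all names and sizes here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
M, K, d = 2, 16, 8                     # codebooks, codewords per codebook, group dim
codebooks = rng.normal(size=(M, K, d))
codebooks[:, 0, :] = 0.0               # zero codeword: "skip this codebook"

def encode(w):
    # Greedily pick, per codebook, the codeword that most reduces the residual.
    residual, codes = w.copy(), []
    for m in range(M):
        errs = ((residual[None, :] - codebooks[m]) ** 2).sum(axis=1)
        k = int(errs.argmin())
        codes.append(k)
        residual -= codebooks[m][k]
    return codes

def decode(codes):
    # Reconstruction is the sum of one codeword from each codebook.
    return sum(codebooks[m][k] for m, k in enumerate(codes))
```

Because codewords are full d-dimensional vectors, the error budget is spent along directions the data actually occupies, rather than axis-by-axis as in scalar quantization.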
papers
Paper note on clustering-based compression as a way to exploit weight structure and outlier concentration when uniform quantization gets brittle.
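A minimal sketch of the clustering approach in the last note, in the spirit of Deep Compression-style weight sharing (assumed here, not taken from the paper): scalar weights are k-means-clustered into a small centroid table, and the artifact stores log2(k)-bit indices plus the centroids, so clustered values and concentrated outliers cost less than a uniform grid would.

```python
import numpy as np

def kmeans_quantize(weights, k=16, iters=20, seed=0):
    # Cluster 1-D weights into k shared centroids; the compressed artifact
    # stores log2(k)-bit indices per weight plus the small centroid table.
    rng = np.random.default_rng(seed)
    centroids = rng.choice(weights, size=k, replace=False)
    for _ in range(iters):
        idx = np.abs(weights[:, None] - centroids[None, :]).argmin(axis=1)
        for j in range(k):
            members = weights[idx == j]
            if members.size:                 # keep centroid if cluster empties
                centroids[j] = members.mean()
    return centroids, idx

w = np.random.default_rng(1).normal(size=4096)
centroids, idx = kmeans_quantize(w)          # 16 centroids -> 4-bit indices
w_hat = centroids[idx]                       # reconstructed weights
```

Unlike uniform quantization, the centroids move toward dense value clusters, which is exactly the regime where uniform grids get brittle.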