11 items with this tag.
moonshots
Moonshot hypothesis that most vocabulary rows in the output head should be regenerated from compact descriptors and shared factors rather than stored directly.
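A minimal numpy sketch of the regenerated-head idea, with a plain low-rank factorization standing in for the descriptor-plus-shared-factor scheme; all shapes and sizes here are illustrative assumptions, not taken from the note.

```python
import numpy as np

# Instead of storing a full (vocab x hidden) output matrix, keep a compact
# per-token descriptor and a shared factor bank, and regenerate rows on demand.
# vocab/hidden/rank values are hypothetical.
vocab, hidden, rank = 32_000, 768, 64

rng = np.random.default_rng(0)
descriptors = rng.standard_normal((vocab, rank)).astype(np.float32)  # compact per-row codes
factors = rng.standard_normal((rank, hidden)).astype(np.float32)     # shared factor bank

def head_row(token_id: int) -> np.ndarray:
    """Regenerate one vocabulary row from its descriptor and the shared factors."""
    return descriptors[token_id] @ factors

dense_bytes = vocab * hidden * 4                     # fp32 dense head
factored_bytes = (descriptors.size + factors.size) * 4
print(dense_bytes, factored_bytes)                   # factored storage is far smaller
```

The trade is compute at logit time (one small matmul per lookup, or a fused `descriptors @ factors` for the full head) against stored bytes.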
papers
Paper note on shrinking and retargeting the tokenizer and embedding table to a domain so the model uses fewer vocabulary bytes and shorter sequences.
frontiers
Frontier synthesis on why tokenizer research for compact models should be treated as a joint vocabulary, logits, and artifact-budget problem rather than a token-count problem.
hypotheses
Hypothesis that compressing or restructuring the LM head can beat modest backbone improvements in compact language models.
ideas
Hypothesis that most head-side quantization damage concentrates in a small set of difficult token rows, making row-level protection a better byte trade than uniform head precision.
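A small numpy sketch of the row-level-protection idea: per-row int8 quantization of a toy head, with the worst-reconstructed rows kept in full precision. The shapes, the injected outlier rows, and the 1% protection budget are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, hidden = 1_000, 64
W = rng.standard_normal((vocab, hidden)).astype(np.float32)
W[:10] *= 20.0  # a few outlier-scale rows: the "difficult" tokens

# Per-row symmetric int8 quantize-dequantize.
scales = np.abs(W).max(axis=1, keepdims=True) / 127.0
W_q = np.round(W / scales).clip(-127, 127) * scales

# Measure per-row damage, then spend a small byte budget protecting the worst rows.
row_err = np.abs(W - W_q).max(axis=1)
budget = vocab // 100                    # protect the top 1% of rows
protected = np.argsort(row_err)[-budget:]
W_mixed = W_q.copy()
W_mixed[protected] = W[protected]        # keep difficult rows at full precision

print(np.abs(W - W_q).max(), np.abs(W - W_mixed).max())
```

With the outlier rows protected, the worst-case reconstruction error drops to that of an ordinary row, at a cost of only `budget * hidden` extra fp32 values.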
ideas
Hypothesis that shrinking tokenizer and LM-head burden, then reinvesting the saved bytes into a wider shared backbone, beats spending the same budget on a larger static head.
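Rough budget arithmetic for the reinvestment hypothesis, as a sketch: the 12·L·d² backbone count and the untied V·d head are standard approximations, and every concrete number below is illustrative.

```python
# Compare two ways to spend one parameter budget: a large static head on a
# narrow backbone, versus a shrunken vocabulary with the savings reinvested
# in model width. Formulas are rough transformer parameter-count conventions.
def total_params(vocab: int, d_model: int, n_layers: int) -> int:
    head = vocab * d_model                 # untied output projection
    backbone = 12 * n_layers * d_model ** 2
    return head + backbone

big_head = total_params(vocab=50_000, d_model=512, n_layers=12)

# Shrink the vocabulary, then widen the model until the budget is spent.
d = 512
while total_params(8_000, d + 8, 12) <= big_head:
    d += 8
small_head_wide = total_params(8_000, d, 12)
print(big_head, d, small_head_wide)
```

Under these toy numbers the smaller vocabulary buys roughly a hundred extra hidden dimensions at the same total budget; whether that wider backbone actually wins on quality is exactly what the hypothesis asks.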
lanes
Tokenization is part of the budget story, not just a preprocessing detail.
notes
Synthesis note on why vocabulary and output-projection choices can dominate compact-model tradeoffs earlier than expected.
notes
Concept note on why tokenization changes not just sequence length but the whole byte/compute story of compact language models.
papers
Paper note on replacing a pretrained model's tokenizer while retraining only the embeddings and LM head.
papers
Paper note on tokenizer evaluation across scales and why compression alone is not enough to rank tokenizers.