5 items with this tag.
papers
Paper note on shrinking and retargeting the tokenizer and embedding table to a target domain, so the model spends fewer parameters on vocabulary and produces shorter token sequences.
frontiers
Frontier synthesis arguing that tokenizer research for compact models should be treated as a joint vocabulary, logits, and artifact-budget problem rather than a token-count problem alone.
lanes
Tokenization is part of the budget story, not just a preprocessing detail.
papers
Paper note on replacing a pretrained model's tokenizer while retraining only the embeddings and the LM head.
papers
Paper note on reducing output-layer memory and logits cost by restructuring vocabulary prediction.