Parameter Golf leaves room for evaluation-time ingenuity. That means the right question is not only “what should we store?” but also “what useful computation can a compact model perform after it has been loaded?”

Core question

Can a smaller artifact, allowed a bounded budget of extra refinement, planning, or reranking at evaluation time, beat a larger static model that stores more knowledge directly?

Why this lane matters

Under a hard artifact cap, evaluation-time compute is one of the few remaining levers once compression has gone far enough. It is especially natural for compact recurrent models, where the same core block can be reused for both representation and refinement.

Central papers

  • Grangier et al. (2024), Need a Small Specialized Language Model? Plan Early!
  • Wu et al. (2024), Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models

Main patterns to watch

1. Iterative refinement

Run a compact model for multiple passes instead of storing a larger one.
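A minimal sketch of the idea, using repeated gradient-descent passes on a toy linear system as a stand-in for a compact model's refinement step (the system `A`, `b`, the function `refine`, and the step size `lr` are illustrative assumptions, not taken from any cited paper):

```python
import numpy as np

def refine(A, b, x, n_steps, lr=0.1):
    # one small stored "block": a gradient step on ||Ax - b||^2,
    # reused for every refinement pass instead of storing a direct solver
    for _ in range(n_steps):
        x = x - lr * A.T @ (A @ x - b)
    return x

rng = np.random.default_rng(0)
A = np.eye(8) + rng.normal(size=(8, 8)) / 8.0   # well-conditioned toy system
b = rng.normal(size=8)
x0 = np.zeros(8)

# same stored parameters, different evaluation-time budgets
few  = np.linalg.norm(A @ refine(A, b, x0, n_steps=5)  - b)
many = np.linalg.norm(A @ refine(A, b, x0, n_steps=50) - b)
# more passes drive the residual lower without storing anything extra
```

The artifact (here, just `A` and the update rule) is fixed; only the number of evaluation-time passes changes.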

2. Recurrent reasoning

Use a shared block that can spend more compute on difficult cases without changing stored bytes.
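A sketch of the shape this takes, assuming a toy contraction map as the shared block; `recurrent_block`, `run_until_stable`, and all sizes are hypothetical names chosen for illustration:

```python
import numpy as np

def recurrent_block(h, x, W):
    # one shared block: the same weights W are reused at every step
    return np.tanh(W @ h + x)

def run_until_stable(x, W, tol=1e-8, max_steps=200):
    """Apply the shared block until the hidden state stops changing.
    Inputs farther from the rest state take more passes, but the
    stored parameters (W) never grow."""
    h = np.zeros_like(x)
    for step in range(1, max_steps + 1):
        h_next = recurrent_block(h, x, W)
        if np.linalg.norm(h_next - h) < tol:
            return h_next, step
        h = h_next
    return h, max_steps

rng = np.random.default_rng(1)
W = 0.08 * rng.normal(size=(6, 6))   # small spectral norm → contraction
x_easy = 0.001 * rng.normal(size=6)  # toy "easy" input: near the rest state
x_hard = 100.0 * x_easy              # toy "hard" input: much farther away
_, steps_easy = run_until_stable(x_easy, W)
_, steps_hard = run_until_stable(x_hard, W)
# the hard case consumes more iterations of the same block, at zero byte cost
```

The halting rule here is a simple convergence check; learned halting (ACT-style) is the trainable analogue, but the byte accounting is the same.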

3. Planning or reranking

Use a small model to generate candidates, then spend extra compute choosing among them.
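A sketch of generate-then-rerank, with a random proposer and a variance-based scorer standing in for the small model and the verifier (both `propose` and `score` are invented for illustration):

```python
import random

def propose(rng):
    # cheap candidate generator standing in for a small model's sampler
    return [rng.random() for _ in range(4)]

def score(candidate):
    # verifier/reranker: here, prefer candidates whose entries are balanced
    mean = sum(candidate) / len(candidate)
    return -sum((c - mean) ** 2 for c in candidate)

def best_of_n(n, seed=0):
    # spend extra evaluation-time compute on n proposals, keep the best
    rng = random.Random(seed)
    candidates = [propose(rng) for _ in range(n)]
    return max(candidates, key=score)

single   = score(best_of_n(1))
reranked = score(best_of_n(16))
# best-of-16 can never score worse than the single greedy draw
```

Because the first draw is shared (same seed), best-of-N is guaranteed to match or beat the single-sample baseline under the reranker's own metric; the open question is whether that metric tracks true task quality.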

Practical constraints

  • extra inference steps still have to fit within wall-clock limits
  • gains must survive realistic task distributions rather than only toy prompts
  • the method should remain reproducible and architecturally coherent, not a brittle pile of special cases
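The wall-clock constraint can be handled with an anytime loop that refines a best-so-far answer until a deadline expires; this sketch uses Newton steps on a toy objective as the refinement (the function names and the 50 ms budget are illustrative assumptions):

```python
import time

def refine_under_budget(x0, step, loss, budget_s):
    """Anytime refinement: keep improving until the wall-clock budget
    runs out, always holding a valid best-so-far answer."""
    deadline = time.monotonic() + budget_s
    best, best_loss = x0, loss(x0)
    x = x0
    while time.monotonic() < deadline:
        x = step(x)
        current = loss(x)
        if current < best_loss:
            best, best_loss = x, current
    return best

# toy task: refine an estimate of sqrt(2) with Newton steps
step = lambda x: 0.5 * (x + 2.0 / x)
loss = lambda x: abs(x * x - 2.0)
answer = refine_under_budget(1.0, step, loss, budget_s=0.05)
```

The deadline, not an iteration count, terminates the loop, so the same code degrades gracefully on slower hardware instead of blowing the latency budget.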

Most relevant questions

  • when is extra test-time compute better than extra stored depth?
  • which compact architectures can make best use of repeated refinement?
  • how much of the benefit comes from true reasoning versus simple reranking?
  • can evaluation-time compute compensate for smaller vocabularies or more aggressive compression?

References

Grangier, D., Katharopoulos, A., Ablin, P., & Hannun, A. (2024). Need a Small Specialized Language Model? Plan Early! arXiv preprint arXiv:2402.01093. https://arxiv.org/abs/2402.01093
Wu, Y., Sun, Z., Li, S., Welleck, S., & Yang, Y. (2024). Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models. arXiv preprint arXiv:2408.00724. https://arxiv.org/abs/2408.00724