Titans: Learning to Memorize at Test Time (Behrouz et al., 2025)

Sources: arXiv:2501.00663 · alphaXiv overview

Core contribution

Titans proposes a family of sequence models that combine ordinary short-term attention with a learned long-term memory module that updates at test time. The central claim is not just that more inference compute helps, but that some of that compute can be spent on writing memory rather than only on reranking or extra forward passes.
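
A minimal sketch of the write-at-read mechanism, assuming a linear associative memory updated by one gradient step per token; the class name, learning rate, and decay term are illustrative choices, not the paper's actual module:

```python
import torch

class TestTimeMemory:
    """Toy linear associative memory written while reading a sequence.

    Holds key-to-value associations in one matrix M and takes a single
    SGD step per token on the loss 0.5 * ||M k - v||^2, with a decay term
    standing in for forgetting. No trained parameters change.
    """

    def __init__(self, dim: int, lr: float = 0.1, decay: float = 0.01):
        self.M = torch.zeros(dim, dim)  # temporary state, not stored capacity
        self.lr = lr
        self.decay = decay

    def write(self, k: torch.Tensor, v: torch.Tensor) -> None:
        # Gradient of 0.5 * ||M k - v||^2 with respect to M is (M k - v) k^T.
        err = self.M @ k - v
        self.M = (1.0 - self.decay) * self.M - self.lr * torch.outer(err, k)

    def read(self, q: torch.Tensor) -> torch.Tensor:
        return self.M @ q


# Reading writes memory; a later read recovers an approximate value.
dim = 16
mem = TestTimeMemory(dim)
keys, vals = torch.randn(32, dim), torch.randn(32, dim)
for k, v in zip(keys, vals):
    mem.write(k, v)
approx = mem.read(keys[0])  # roughly vals[0], up to interference
```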

Why this matters for Parameter Golf

This paper sharpens the notion of evaluation-time compute in a way the existing shelf did not cover well. The interesting idea is that a hard artifact cap may be partly offset by behavioral memory formation at evaluation time: instead of storing more static capacity, the model can build temporary task-specific state while reading the sequence.

What to import

  • Evaluation-time compute can update memory, not just search over outputs.
  • A compact core plus a learned memory interface may be a cleaner compute-for-storage trade than only widening the static trunk.
  • Persistent memory and temporary memory should be thought of separately (the split is sketched just below this list).
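
To make the third bullet concrete, here is one assumed design that keeps the two tiers side by side; the slot count, the similarity-weighted read, and the write rule are my stand-ins, not Titans' persistent-memory mechanism:

```python
import torch
import torch.nn as nn

class TwoTierMemory(nn.Module):
    """Illustrative split: persistent slots learned offline and frozen at
    evaluation; temporary state built per sequence and then discarded."""

    def __init__(self, dim: int, slots: int = 8):
        super().__init__()
        # Persistent memory: fixed at evaluation time, part of the artifact.
        self.persistent = nn.Parameter(torch.randn(slots, dim) * 0.02)
        # Temporary memory: rebuilt while reading each sequence.
        self.register_buffer("temporary", torch.zeros(dim, dim))

    def reset_temporary(self) -> None:
        self.temporary.zero_()

    @torch.no_grad()
    def write_temporary(self, k: torch.Tensor, v: torch.Tensor,
                        lr: float = 0.1) -> None:
        err = self.temporary @ k - v
        self.temporary -= lr * torch.outer(err, k)

    def read(self, q: torch.Tensor) -> torch.Tensor:
        # Persistent slots answer via similarity weights; temporary via recall.
        weights = torch.softmax(self.persistent @ q, dim=0)
        return self.persistent.T @ weights + self.temporary @ q
```

The value of the split is accounting: only persistent would count against a hard parameter cap, while temporary costs only compute and scratch space at evaluation time.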

What not to over-import

Titans is a broad long-context architecture paper, not a challenge-ready recipe for a tiny artifact-constrained LM. Its memory module is more ambitious than what a tight runtime and code budget may tolerate. The durable import is the framing: test-time adaptation can be memory formation, not only a decoding strategy.

Parameter Golf translation

Titans suggests asking whether bounded evaluation-time passes should:

  • revise token predictions,
  • update a temporary memory state,
  • or do both (a loop combining these options is sketched below).
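
A toy loop showing how a fixed pass budget could cover both options, reusing the TestTimeMemory sketch from the core-contribution section; bounded_eval, the readout head, and the autoassociative write target are hypothetical scaffolding:

```python
import torch

def bounded_eval(tokens, memory, head, passes: int = 2):
    """Spend a fixed pass budget on both jobs: earlier passes write temporary
    memory, and the final pass predicts from the accumulated state."""
    logits = None
    for p in range(passes):
        reads = torch.stack([memory.read(t) for t in tokens])
        logits = head(torch.cat([tokens, reads], dim=-1))  # revise predictions
        if p < passes - 1:
            for t in tokens:
                memory.write(t, t)  # update temporary memory state
    return logits


# Usage with the sketches above; the vocabulary size is arbitrary.
dim = 16
mem = TestTimeMemory(dim)
head = torch.nn.Linear(2 * dim, 100)
tokens = torch.randn(32, dim)
logits = bounded_eval(tokens, mem, head)  # shape (32, 100)
```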

For this challenge, the valuable question is not whether Titans as written fits, but whether a much smaller memory-writing mechanism could buy more than another round of static parameter storage.

Behrouz, A., Zhong, P., & Mirrokni, V. (2025). Titans: Learning to memorize at test time. arXiv preprint arXiv:2501.00663. https://arxiv.org/abs/2501.00663