Sources: arXiv:2501.00663 · alphaXiv overview
Core contribution
Titans proposes a family of sequence models that combine ordinary short-term attention with a learned long-term memory module that updates at test time. The central claim is not just that more inference compute helps, but that some of that compute can be spent on writing memory, not only on reranking or extra forward passes.
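The memory-writing idea can be made concrete with a toy linear associative memory whose write rule is a gradient step on a "surprise" loss. This is a minimal sketch of the framing, not the paper's architecture; the function names and the exact step size are illustrative assumptions.

```python
import numpy as np

def write(M, k, v):
    """One test-time write: a gradient step on the associative loss
    ||M k - v||^2. The gradient w.r.t. M is 2 (M k - v) k^T; the step
    size 1/(2 k.k) exactly zeroes the residual for this key, so a
    surprising (high-error) pair produces a large memory update."""
    err = M @ k - v          # "surprise": how badly memory predicts v from k
    return M - np.outer(err, k) / (k @ k)

rng = np.random.default_rng(0)
d = 8
M = np.zeros((d, d))                  # temporary memory, fresh per sequence
keys = rng.normal(size=(16, d))
vals = rng.normal(size=(16, d))

for k, v in zip(keys, vals):          # "reading" the sequence writes memory
    M = write(M, k, v)

# the most recent association is recalled exactly; older ones only
# approximately, since later writes interfere with earlier ones
recalled = M @ keys[-1]
```

The point of the sketch is that the forward pass itself performs the adaptation: no output is reranked, yet the state available for later predictions has changed.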
Why this matters for Parameter Golf
This paper sharpens the notion of evaluation-time compute in a way the existing shelf did not cover well. The interesting idea is that a hard artifact cap may be partly offset by behavioral memory formation at evaluation time: instead of storing more capacity in parameters, the model can build temporary task-specific state while reading the sequence.

What to import
- Evaluation-time compute can update memory, not just search over outputs.
- A compact core plus a learned memory interface may be a cleaner compute-for-storage trade than only widening the static trunk.
- Persistent memory and temporary memory should be thought of separately.
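The persistent/temporary split in the last point can be sketched with a toy interface. The class structure and names here are hypothetical, chosen only to make the separation explicit: persistent memory is learned offline and never written at evaluation time, while temporary memory is per-sequence state that is reset between inputs.

```python
import numpy as np

class ToyMemory:
    """Hypothetical sketch of the two memory kinds, not the Titans module."""

    def __init__(self, d, rng):
        self.d = d
        # persistent: fixed after training, shared across all inputs
        self.persistent = rng.normal(size=(4, d))
        self.reset()

    def reset(self):
        """Called between sequences: temporary state starts empty."""
        self.temporary = np.zeros((self.d, self.d))

    def write(self, k, v):
        """Evaluation-time writes touch only the temporary store."""
        self.temporary += np.outer(v, k)

    def read(self, q):
        """A read mixes the static lookup with sequence-specific state."""
        static = self.persistent.T @ (self.persistent @ q)
        return static + self.temporary @ q

mem = ToyMemory(8, np.random.default_rng(1))
snapshot = mem.persistent.copy()
mem.write(np.ones(8), np.ones(8))
mem.reset()                          # new sequence: temporary state is gone
```

Keeping the two stores separate is what makes the artifact cap meaningful: only `persistent` counts against stored capacity, while `temporary` is rebuilt for free at evaluation time.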
What not to over-import
Titans is a broad long-context architecture paper, not a challenge-ready recipe for a tiny artifact-constrained LM. Its memory module is more ambitious than what a tight runtime and code budget may tolerate. The durable import is the framing: test-time adaptation can be memory formation, not only decoding strategy.
Best synthesis links
- Extends Inference Scaling Laws from “better test-time allocation” to “learned test-time memory.”
- Gives sharper motivation to refinement loops as decompression by letting the extra compute update hidden state rather than only select among outputs.
- Pairs naturally with iterative refinement and recurrent wide architecture.
Parameter Golf translation
Titans suggests asking whether bounded evaluation-time passes should:
- revise token predictions,
- update a temporary memory state,
- or do both.
For this challenge, the valuable question is not whether Titans as written fits, but whether a much smaller memory-writing mechanism could buy more than another round of static parameter storage.
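One deliberately small version of a memory-writing mechanism, purely illustrative and not from the paper: instead of spending parameters on stored co-occurrence statistics, build them as temporary state while reading the evaluation sequence.

```python
from collections import defaultdict

def read_time_bigrams(text):
    """Hypothetical compute-for-storage trade: bigram counts are built
    while reading the input, so nothing about them lives in the artifact."""
    counts = defaultdict(lambda: defaultdict(int))
    for a, b in zip(text, text[1:]):
        counts[a][b] += 1

    def predict(prev):
        # memory-based next-token guess; None if the context is unseen
        nxt = counts.get(prev)
        return max(nxt, key=nxt.get) if nxt else None

    return predict

predict = read_time_bigrams("abracadabra")
guess = predict("a")   # most frequent successor of "a" in this sequence
```

The table costs zero stored parameters and a few lines of code; whether a mechanism this cheap beats another round of static parameter storage is exactly the question the note poses.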
Related
- Inference Scaling Laws
- Evaluation-time compute and inference scaling
- Refinement loops as decompression
- Iterative refinement
- Recurrent wide architecture