7 items with this tag.
moonshots
Moonshot hypothesis that repeated depth should specialize through a persistent internal role state rather than through stored layer-specific parameters.
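The idea above can be made concrete with a minimal sketch: one shared weight matrix is applied at every depth step, and per-step behavior comes from a small evolving "role state" that modulates the block, not from stored per-layer parameters. All names and sizes here (`W`, `U`, `M`, the tanh recurrence) are illustrative assumptions, not a committed design.

```python
import numpy as np

rng = np.random.default_rng(0)
D, R, T = 64, 8, 6  # hidden width, role-state width, repeated depth steps

# One shared block's parameters, reused at every depth step.
W = rng.standard_normal((D, D)) / np.sqrt(D)
# Role-state recurrence and its modulation of the shared block (illustrative).
U = rng.standard_normal((R, R)) / np.sqrt(R)
M = rng.standard_normal((R, D)) / np.sqrt(R)

def run(h, r):
    """Apply the shared block T times; per-step specialization comes from
    the evolving role state r, not from per-step weight matrices."""
    for _ in range(T):
        gate = np.tanh(r @ M)        # role state -> per-channel modulation
        h = np.tanh(h @ W) * gate    # same W every step, gated differently
        r = np.tanh(r @ U)           # role state advances each step
    return h, r

h_out, r_out = run(rng.standard_normal(D), np.ones(R))
print(h_out.shape, r_out.shape)
```

The only persistent depth-specific storage here is the role state's update path, which is tiny compared to T distinct layers.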
hypotheses
Hypothesis that a smaller recurrent model with bounded extra evaluation-time refinement can beat a larger static artifact under the same storage cap.
hypotheses
Concrete architecture hypothesis: use aggressive depth sharing to buy much more width, then spend leftover bytes on stability and selective precision.
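The byte arithmetic behind "depth sharing buys width" can be sketched directly. The parameter-count formula below is a rough illustration (embeddings plus attention/MLP weights, biases and norms ignored), and all sizes are assumptions, not measurements from any real model.

```python
def transformer_params(d_model, n_layers_stored, vocab=32000, ff_mult=4):
    """Rough parameter count: embeddings + per-STORED-layer weights.
    Attention ~ 4*d^2, MLP ~ 2*ff_mult*d^2 (biases/norms ignored)."""
    per_layer = 4 * d_model**2 + 2 * ff_mult * d_model**2
    return vocab * d_model + n_layers_stored * per_layer

# Baseline: 12 distinct layers at width 768.
base = transformer_params(768, 12)

# Shared-depth variant: store 2 unique layers and loop them 6x at run time.
shared_narrow = transformer_params(768, 2)

# Spend the freed bytes on width: widest 2-stored-layer model under the cap.
w = 768
while transformer_params(w + 64, 2) <= base:
    w += 64
print(base, shared_narrow, w)
```

Under these illustrative numbers the shared-depth model roughly doubles its width inside the same storage budget, leaving slack for stability machinery and selectively higher-precision weights.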
ideas
Hypothesis that a single small learned codebook bank, shared across repeated blocks, can beat per-matrix quantization by amortizing codebook metadata and aligning compression with the shared-depth structure.

ideas
Hypothesis that shared-depth models can recover most layer-role specialization using only per-step RMSNorm and tiny channel gates, with almost no byte cost.
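A minimal sketch of the byte claim: give each of T depth steps its own RMSNorm gain vector and channel-gate logits over a single shared weight matrix, then compare that storage against T distinct weight matrices. The specific gate form (sigmoid over learned logits) is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
D, T = 512, 8

W = rng.standard_normal((D, D)) / np.sqrt(D)  # one shared weight matrix
g = np.ones((T, D))    # per-step RMSNorm gains (tiny)
c = np.zeros((T, D))   # per-step channel-gate logits (tiny)

def rmsnorm(x, gain):
    return gain * x / np.sqrt(np.mean(x * x) + 1e-6)

def forward(h):
    """Same W at every step; only the norm gains and gates vary by step."""
    for t in range(T):
        h = rmsnorm(h, g[t])
        h = np.tanh(h @ W) * (1.0 / (1.0 + np.exp(-c[t])))  # sigmoid gate
    return h

h_out = forward(rng.standard_normal(D))
extra = g.size + c.size   # what this scheme stores per step
full = T * W.size         # what distinct per-step weights would cost
print(extra, full, extra / full)
```

With these sizes the per-step vectors cost well under 1% of what distinct per-step weight matrices would, which is the "almost no byte cost" being claimed.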
ideas
Hypothesis that a compact shared-depth model should spend extra inference-time passes only on uncertain positions, turning compute into quality more efficiently than storing more static depth.
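One way this could look, as a sketch under assumed details: score each position's output-distribution entropy, and spend extra refinement passes only on positions above a threshold, up to a fixed pass budget. The threshold, budget, and refinement map (`W_ref`) are all hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(3)
N, D, V = 16, 32, 100   # positions, hidden width, vocab size
W_out = rng.standard_normal((D, V)) / np.sqrt(D)
W_ref = rng.standard_normal((D, D)) / np.sqrt(D)  # extra refinement pass

def entropy(h):
    """Per-position entropy of the softmax over output logits."""
    z = h @ W_out
    p = np.exp(z - z.max(1, keepdims=True))
    p /= p.sum(1, keepdims=True)
    return -(p * np.log(p + 1e-12)).sum(1)

h = rng.standard_normal((N, D))
budget, tau = 2, 0.8 * np.log(V)   # max extra passes; entropy threshold
for _ in range(budget):
    hot = entropy(h) > tau          # only uncertain positions get refined
    if not hot.any():
        break
    h[hot] = np.tanh(h[hot] @ W_ref)
print("positions refined on last pass:", int(hot.sum()))
```

Confident positions exit early, so compute scales with uncertainty rather than with a fixed (stored or looped) depth.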
experiments
A local, breadth-profile test of the recurrent-wide architecture idea: aggressive depth sharing plus width expansion under the artifact storage cap.