Goal
Enable non-blocking experiment execution so the agent can continue higher-level research work while a long benchmark is in flight.
Implementation sketch
- add a `background: boolean` mode to the experiment runner
- spawn the benchmark process and return immediately
- move metrics parsing and checks into an asynchronous completion path
- notify the agent with a follow-up turn when the run finishes
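The steps above can be sketched roughly as follows. This is a minimal illustration, not the harness's actual code: `runExperiment`, `RunOptions`, `RunResult`, `RunHandle`, `parseMetrics`, the `notifyAgent` callback, the `name=value` log format, and the pass/fail check are all hypothetical. Only `spawn` from Node's `child_process` module is an existing API.

```typescript
import { spawn } from "node:child_process";

// Hypothetical types; the real runner's interface is not shown on this page.
interface RunOptions {
  command: string;
  args: string[];
  background: boolean; // true: return as soon as the process is spawned
}

interface RunResult {
  exitCode: number | null;
  metrics: Record<string, number>;
  passed: boolean; // assumed check: clean exit and at least one metric
  error?: string;
}

interface RunHandle {
  pid: number | undefined;
  result: RunResult | null; // null while a background run is still in flight
}

// Assumed log format: one "name=value" pair per line, e.g. "val_loss=1.25".
// Lines that do not fit this shape are skipped.
function parseMetrics(output: string): Record<string, number> {
  const metrics: Record<string, number> = {};
  for (const line of output.split("\n")) {
    const eq = line.indexOf("=");
    if (eq < 0) continue;
    const name = line.slice(0, eq).trim();
    const raw = line.slice(eq + 1).trim();
    const value = Number(raw);
    if (name !== "" && raw !== "" && Number.isFinite(value)) {
      metrics[name] = value;
    }
  }
  return metrics;
}

async function runExperiment(
  opts: RunOptions,
  notifyAgent: (result: RunResult) => void,
): Promise<RunHandle> {
  const child = spawn(opts.command, opts.args);
  let output = "";
  child.stdout.setEncoding("utf8");
  child.stdout.on("data", (chunk: string) => {
    output += chunk;
  });
  child.stderr.resume(); // drain stderr so the child never blocks on a full pipe

  // Asynchronous completion path: parsing and checks run only after exit.
  // The promise always resolves, so a background run cannot reject unhandled.
  const completion = new Promise<RunResult>((resolve) => {
    child.on("error", (err) => {
      resolve({ exitCode: null, metrics: {}, passed: false, error: err.message });
    });
    child.on("close", (exitCode) => {
      const metrics = parseMetrics(output);
      const passed = exitCode === 0 && Object.keys(metrics).length > 0;
      resolve({ exitCode, metrics, passed });
    });
  });

  if (opts.background) {
    // Non-blocking: the follow-up notification fires when the run finishes.
    void completion.then(notifyAgent);
    return { pid: child.pid, result: null };
  }
  // Blocking mode keeps the current behaviour: wait, then return the result.
  return { pid: child.pid, result: await completion };
}
```

In this sketch a caller would pass `background: true` and carry on with other work; `notifyAgent` stands in for whatever mechanism the harness uses to open a follow-up turn.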
Why this belongs in meta
This page is about the research harness, not about Parameter Golf model design itself.
Benefits
- less idle time during long runs
- cleaner separation between experimentation and synthesis
- more opportunity to write or refine paper pages while the machine is busy