Moonshot
Train the model under randomized artifact bottlenecks, not just weight noise.
Examples:
- some steps lose protected subsets
- some force harsher codebook collapse
- some remove exception paths entirely
- some enforce extra sharing or shallower reconstruction
The goal is to make the model robust to the kinds of mutilation that real artifact compression causes.
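One way to make these failure families concrete is a small catalogue of sampleable regimes. This is a minimal sketch; every regime name and field below is an assumption invented for illustration, not a fixed export spec.

```python
import random

# Illustrative catalogue of artifact failure regimes. The names and
# knob fields are assumptions for this sketch, not a known format.
FAILURE_REGIMES = [
    {"name": "drop_protected", "protect_frac": 0.0},      # no protected subset this step
    {"name": "codebook_collapse", "codebook_size": 16},   # harsher codebook than export default
    {"name": "no_exceptions", "exception_budget": 0},     # exception paths removed entirely
    {"name": "extra_sharing", "shared_depth": 1},         # stricter sharing / shallower reconstruction
]

def sample_regime(rng: random.Random) -> dict:
    """Pick one failure regime uniformly at random for this training step."""
    return rng.choice(FAILURE_REGIMES)
```

Each step then applies whichever constraint it drew, so no single export path is ever assumed safe.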
Why this is outside the current prior
Standard noise injection and quantization-aware training usually simulate a single degradation family. Artifact dropout instead simulates entire failure regimes of the final submission object.
Mechanism sketch
During training or late-stage adaptation, randomly sample artifact constraints such as:
- no protected rows this step
- reduced codebook capacity this step
- stricter shared-depth regime this step
- harsher clipping or coarser packing this step
The model learns to distribute competence so it does not rely too heavily on one fragile artifact path.
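A sketch of one such constrained step, assuming a plain NumPy weight matrix; the knob names (`codebook_size`, `protect_frac`, `clip_sigma`) are hypothetical stand-ins for the regimes listed above, not a real pipeline's API.

```python
import numpy as np

def apply_artifact_bottleneck(w: np.ndarray, rng: np.random.Generator,
                              codebook_size: int = 16,
                              protect_frac: float = 0.0,
                              clip_sigma: float = 3.0) -> np.ndarray:
    """Simulate one sampled artifact constraint on a weight matrix.

    protect_frac=0.0 models a 'no protected rows this step' regime;
    smaller codebook_size and clip_sigma model harsher codebook and
    clipping regimes. All knobs are illustrative assumptions.
    """
    w = np.asarray(w, dtype=float)
    # Harsher clipping: clamp outliers to +/- clip_sigma * std.
    lim = clip_sigma * w.std()
    w_clipped = np.clip(w, -lim, lim)
    # Coarse uniform codebook: snap values to codebook_size levels.
    lo, hi = w_clipped.min(), w_clipped.max()
    scale = (hi - lo) / max(codebook_size - 1, 1)
    quantized = np.round((w_clipped - lo) / scale) * scale + lo
    # Optionally exempt a random fraction of rows (the "protected" subset).
    n_protect = int(protect_frac * w.shape[0])
    rows = rng.permutation(w.shape[0])[:n_protect]
    quantized[rows] = w[rows]
    return quantized
```

In a training loop the forward pass would run on the bottlenecked weights (with gradients flowing back to the originals, e.g. via a straight-through estimator), so competence spreads across regimes instead of hiding in one fragile path.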
Why it might matter for Parameter Golf
Parameter Golf cares about surviving the exact compressed route, not looking elegant before export. Artifact dropout directly trains against fragility to that route.
Cheapest falsifier
- simulate two or three compression failure modes during a short finishing phase
- compare post-roundtrip robustness versus a standard finishing run
Kill it if robustness broadens but best-case final artifact quality falls too much.
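The falsifier can be phrased as a worst-case comparison across simulated paths. Everything below is a toy stand-in under stated assumptions: `evaluate` is a placeholder quality proxy (closeness to the uncompressed weights) and the paths are simple uniform quantizers, not the real export pipeline.

```python
import numpy as np

def robustness(weights, compression_paths, evaluate):
    """Worst-case post-roundtrip quality across simulated failure modes."""
    return min(evaluate(compress(weights)) for compress in compression_paths)

# Toy stand-ins (assumptions, not the real pipeline): quality is the
# negative mean reconstruction error, and each path is a uniform
# quantizer at a different harshness.
w = np.linspace(-1.0, 1.0, 64)
evaluate = lambda q: -float(np.abs(q - w).mean())
paths = [
    lambda x: np.round(x * 8) / 8,   # milder codebook
    lambda x: np.round(x * 2) / 2,   # harsher codebook
]

worst_case = robustness(w, paths, evaluate)
```

Run this comparison once for an artifact-dropout finishing run and once for the standard run: the idea survives only if the worst case improves without best-case quality falling too much.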
What would make it real
- narrower pre→post degradation gap
- better robustness to export/config changes
- survival under multiple compression paths without massive quality sacrifice