Experiment Policy

Decision function

EV = p(successful accepted record) × magnitude of improvement ÷ time-to-evidence

Not beauty. Not novelty. Not "the model likes it."

Promotion gates

Gate 1: Repro tolerance

Within a tight band of claimed score (±0.001 BPB) or discarded as "not understood."

Gate 2: Runtime

If training step-time increases materially (>5%), candidate must pay for it with proportional BPB gain immediately. At 86ms/step, 1ms overhead ≈ 0.006 BPB penalty.

Gate 3: Artifact

If bytes increase, the agent must explicitly state where they are recovered. No handwaving. Budget ceiling: 15.9 MB with margin.

Gate 4: Promotion to 3-seed

Single-seed must beat parent by ≥0.001 BPB before 3-seed compute is spent. Additionally:

Artifact under budget with margin
No legality issue
No significant eval-time blowup
No measurable training instability

Hard constraints

No multi-variable experiments unless reproductions are done
One hypothesis per branch
No branch survives without exact runtime and artifact accounting
No TTT branch survives unless the log proves score-first legality
No 3-seed run unless single-seed evidence crosses promotion threshold
No "clever rewrite" of the whole stack without explicit CEO approval
Every branch must be PR-packagable at all times

Branch naming

trackA/gepa-legal-ttt
trackA/gepa-legal-ttt-adamw
trackB/std414-repro
trackB/std414-legal-ttt
trackC/microdelta-bigramhash-16k
repro/pr414
repro/pr505
repro/pr508

Experiment report template

Every run must produce a structured report in 09_RESULTS/:

hypothesis: <one line>
parent_branch: <branch name>
exact_diff: <file:lines changed>
train_step_time_ms: <number>
eval_time_s: <number>
artifact_bytes: <number>
pre_quant_bpb: <number>
post_quant_bpb: <number>
legality_risk: none | low | high
recommendation: promote | kill | revise
notes: <free text>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Experiment Policy

Decision function

Promotion gates

Gate 1: Repro tolerance

Gate 2: Runtime

Gate 3: Artifact

Gate 4: Promotion to 3-seed

Hard constraints

Branch naming

Experiment report template

FilesExpand file tree

04_EXPERIMENT_POLICY.md

Latest commit

History

04_EXPERIMENT_POLICY.md

File metadata and controls

Experiment Policy

Decision function

Promotion gates

Gate 1: Repro tolerance

Gate 2: Runtime

Gate 3: Artifact

Gate 4: Promotion to 3-seed

Hard constraints

Branch naming

Experiment report template