name: experiment-design
description: Use before changing a model, loss, optimizer, dataset pipeline, augmentation, or evaluation protocol for deep learning research work
Experiment Design
Turn a rough research idea into a falsifiable experiment plan before writing code.
Hard Gate
Do not change code or configs, or launch runs, until these are clear:
- hypothesis
- baseline
- success metric
- dataset and split
- compute budget
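The gate above can be made mechanical. The following is a minimal sketch, not part of this skill's tooling: `GateInputs` and `missing_gate_items` are hypothetical names, and treating an empty or whitespace-only string as "not clear" is an assumption.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class GateInputs:
    """Hypothetical container for the five hard-gate items."""
    hypothesis: str = ""
    baseline: str = ""
    success_metric: str = ""
    dataset_and_split: str = ""
    compute_budget: str = ""


def missing_gate_items(gate: GateInputs) -> List[str]:
    """Return the names of gate items that are still empty.

    Assumption: a blank or whitespace-only value means "not clear yet".
    """
    return [f.name for f in fields(gate) if not getattr(gate, f.name).strip()]


def gate_is_open(gate: GateInputs) -> bool:
    """True only when every gate item has been filled in."""
    return not missing_gate_items(gate)
```

A caller would refuse to touch code, configs, or runs while `missing_gate_items(...)` returns a non-empty list, and would ask about the first missing item next.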
Workflow
- Inspect the current baseline: code path, configs, last known results, open uncertainties.
- Ask clarifying questions one at a time.
- State the hypothesis in one sentence: "If we change X, metric Y should improve because Z."
- Propose 2-3 approaches and recommend one.
- Define the first experiment as the smallest test that could disprove the idea.
- Write an experiment card to docs/experiments/specs/YYYY-MM-DD-<topic>.md.
- Get user approval before moving to experiment-planning.
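As an illustration of the card location named in the workflow, here is a small sketch that builds the path from a date and a topic. The function name `spec_path` and the slug rule (lowercase, non-alphanumeric runs collapsed to a single hyphen) are assumptions; the workflow only fixes the directory and the `YYYY-MM-DD-<topic>.md` pattern.

```python
import re
from datetime import date
from pathlib import PurePosixPath

# Directory taken from the workflow step above.
SPEC_DIR = PurePosixPath("docs/experiments/specs")


def spec_path(topic: str, day: date) -> PurePosixPath:
    """Build docs/experiments/specs/YYYY-MM-DD-<topic>.md.

    Assumption: the topic is slugified by lowercasing it and replacing
    every run of non-alphanumeric characters with one hyphen.
    """
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    if not slug:
        raise ValueError("topic must contain at least one letter or digit")
    return SPEC_DIR / f"{day.isoformat()}-{slug}.md"
```

Passing the date explicitly, rather than calling `date.today()` inside the function, keeps the helper deterministic and easy to test.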
Experiment Card
Every experiment card should include:
- goal
- baseline to beat
- exact metric and selection rule
- train/val/test or benchmark split assumptions
- minimal code/config changes expected
- sanity checks before full training
- failure modes and what evidence would invalidate the idea
- next decision after the first run
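One way to keep cards consistent is to generate a skeleton with every required section present. The sketch below is illustrative only: `CARD_FIELDS` mirrors the list above, while `render_card`, the markdown heading layout, and the `TODO` placeholder are assumptions rather than a format this skill prescribes.

```python
from typing import Dict

# Section names copied from the experiment card list above.
CARD_FIELDS = [
    "goal",
    "baseline to beat",
    "exact metric and selection rule",
    "train/val/test or benchmark split assumptions",
    "minimal code/config changes expected",
    "sanity checks before full training",
    "failure modes and what evidence would invalidate the idea",
    "next decision after the first run",
]


def render_card(title: str, values: Dict[str, str]) -> str:
    """Render a card skeleton; unfilled sections get a TODO placeholder.

    Assumption: one markdown heading per section. Unknown keys are
    rejected so a typo cannot silently drop a section.
    """
    unknown = set(values) - set(CARD_FIELDS)
    if unknown:
        raise KeyError(f"unknown card fields: {sorted(unknown)}")
    lines = [f"# {title}", ""]
    for field in CARD_FIELDS:
        lines += [f"## {field}", values.get(field, "TODO"), ""]
    return "\n".join(lines)
```

Any section still reading `TODO` is a signal that the card is not ready for user approval.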
Guardrails
- Do not hide evaluation changes inside modeling changes.
- Do not compare against a weak or mismatched baseline.
- Do not treat one lucky run as evidence.
- If the idea touches multiple subsystems, split it into separate experiments.
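The "one lucky run" guardrail can be approximated with a seed-level comparison. This is a rough sketch under stated assumptions, not a prescribed statistical test: the function name `improvement_exceeds_noise` is hypothetical, higher metric values are assumed to be better, and the threshold `k` standard errors is an arbitrary default that should be set per project.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Sequence


def improvement_exceeds_noise(
    baseline: Sequence[float],
    candidate: Sequence[float],
    k: float = 2.0,
) -> bool:
    """Check whether the candidate's mean gain is larger than seed noise.

    Each sequence holds one final metric value per random seed.
    Assumptions: higher is better, and a gain counts only if it exceeds
    k standard errors of the difference between the two means.
    """
    if len(baseline) < 2 or len(candidate) < 2:
        # A single run gives no estimate of variance.
        raise ValueError("need at least two seeds per arm")
    diff = mean(candidate) - mean(baseline)
    # Standard error of the difference of two independent means.
    se = sqrt(stdev(baseline) ** 2 / len(baseline)
              + stdev(candidate) ** 2 / len(candidate))
    return diff > k * se
```

Raising on a single seed is deliberate: it turns the guardrail into an error rather than a warning that is easy to ignore. Both arms should use the same evaluation protocol and split, otherwise the comparison also violates the first two guardrails.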