AGENTS.md — Agent Control Panel
Humans steer, agents execute. Read this fully at session start.
Permissions
You own the full repo. You may create, modify, and delete any file — code, scripts, linters, tests, docs, configs, instructions (including this file). Build deterministic tools for everything checkable mechanically. If it can be a linter rule or test — make it one.
Autonomy
| Risk | Action |
|---|---|
| Low (docs, tests, lint) | Execute full loop. Commit, PR. |
| Medium (scripts, refactors) | ExecPlan + PR. Wait for approval before merge. |
| High (architecture, security) | ExecPlan only. Do not implement. |
Unsure → medium. Tiers defined in policies/risk-policy.json.
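The tier table reads naturally as a lookup with a conservative default. The sketch below is illustrative only: `Tier`, `ACTIONS`, `resolve_tier`, and `may_implement` are hypothetical names, and the flag names are not taken from policies/risk-policy.json, whose schema is not shown here.

```python
from enum import Enum


class Tier(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


# Per-tier permissions mirroring the table above (flag names are assumptions).
ACTIONS = {
    Tier.LOW: {"exec_plan": False, "implement": True},
    Tier.MEDIUM: {"exec_plan": True, "implement": True},
    Tier.HIGH: {"exec_plan": True, "implement": False},
}


def resolve_tier(label):
    """Map a free-form label to a Tier; anything unrecognized becomes MEDIUM."""
    try:
        return Tier(str(label).strip().lower())
    except ValueError:
        # "Unsure -> medium": unknown or missing labels take the middle tier.
        return Tier.MEDIUM


def may_implement(label):
    """Return True if the resolved tier allows implementation work."""
    return ACTIONS[resolve_tier(label)]["implement"]
```

Falling back to `Tier.MEDIUM` means an unclassified task still gets an ExecPlan and a PR, and is never treated as low risk by accident.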
Task Loop
- Boot worktree: `python scripts/harness/worktree_boot.py <task-name>`
- Validate: `make smoke`. Stop if fails.
- Load context: check `progress.txt` if resuming. Load docs from Reference Table.
- Research: for medium/high risk, launch subagent in researcher role (facts only, no opinions). Output → `docs/exec-plans/active/*-research.md`.
- Implement: small steps. `make check` after each change.
- Doc drift: check `policies/risk-policy.json` → `docsDriftRules`. Update matching docs.
- Pre-PR: `make review`. Fix all failures.
- Agent review: for medium/high risk, launch subagent in reviewer role (fresh context, no shared assumptions).
- Review loop: respond to feedback until approved.
- Merge + teardown: merge PR, remove worktree.
- Session end: update `progress.txt`.
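The "stop if fails" behaviour of the gate commands in this loop can be sketched as a small fail-fast runner. This is a minimal sketch, not part of the harness: `run_gate` and `run_gates` are hypothetical helpers, and the `runner` parameter exists only so the logic can be exercised without invoking `make`.

```python
import subprocess


def run_gate(command, runner=subprocess.run):
    """Run one gate command; return True only if it exits with status 0."""
    result = runner(command, check=False)
    return result.returncode == 0


def run_gates(commands, runner=subprocess.run):
    """Run gate commands in order and stop at the first failure.

    Returns (passed, failed): the commands that succeeded, and the command
    that failed, or None if every gate passed.
    """
    passed = []
    for command in commands:
        if not run_gate(command, runner):
            return passed, command
        passed.append(command)
    return passed, None
```

A call such as `run_gates([["make", "smoke"], ["make", "check"]])` would run `make check` only if `make smoke` succeeded.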
Available Tools
| Command | Purpose |
|---|---|
| `make smoke` | Fast sanity check (~5s) |
| `make check` | Static checks (lint + source guard) |
| `make structural` | Architecture boundary tests |
| `make test` | Full test suite |
| `make ci` | CI-equivalent local run |
| `make review` | Pre-PR self-review (5 checks) |
| `make entropy` | Entropy scan |
| `make gardener` | Doc gardening check |
| `make build` | Generate handbook |
| `make sync-skills` / `sync-indexes` / `todo-sync` | Sync generators |
| `make obs-up` / `obs-down` | Observability stack |
DO NOT USE
- `rm -rf` on directories → use `git clean` or targeted removal
- Direct DB queries in prod → use Repo layer
- `curl` to external APIs → use Providers layer `ApiClient`
- `git push --force` to main → regular push or `--force-with-lease`
- `pip install` in global env → use venv
- Bare `print()` in prod code → use logging (Golden Principle #13)
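Several of these bans can be checked mechanically. As one illustration, a bare `print()` call can be detected with the standard `ast` module. This is a sketch under assumptions: `find_bare_prints` is a hypothetical helper, not the actual Golden Principle #13 linter, and it flags only direct calls to the name `print`.

```python
import ast


def find_bare_prints(source, filename="<string>"):
    """Return sorted (line, column) pairs for each direct call to print()."""
    tree = ast.parse(source, filename=filename)
    hits = []
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)  # skips obj.print(...) calls
            and node.func.id == "print"
        ):
            hits.append((node.lineno, node.col_offset))
    return sorted(hits)
```

A check like this could run over production source files and fail the build when the returned list is non-empty.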
Core Rules
- Layer imports flow downward only. Cross-cutting via `Providers` only → `ARCHITECTURE.md`, `policies/architecture.yaml`
- Validate at boundaries. No YOLO-parsing inside layers.
- Reuse existing utilities. No duplicates.
- No secrets in repo. All knowledge lives in-repo.
- If a doc contradicts code — fix the doc, same commit.
- Detailed structured logging everywhere — see Golden Principle #13.
- Write deterministic checks for everything verifiable. Reduce judgment, increase automation.
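The downward-only import rule is a natural candidate for a deterministic check. The sketch below assumes a hypothetical layer order and hypothetical helper names (`is_allowed_import`, `find_violations`); the real layers are defined in policies/architecture.yaml. It also assumes that imports within the same layer are permitted, which the rule above does not state.

```python
def is_allowed_import(importer, imported, order, cross_cutting=("providers",)):
    """Return True if layer `importer` may import from layer `imported`.

    `order` lists layers from top to bottom. Imports may target the same
    layer or any layer below it; cross-cutting layers are importable from
    anywhere. Unknown layer names raise ValueError.
    """
    if imported in cross_cutting:
        return True
    return order.index(imported) >= order.index(importer)


def find_violations(edges, order, cross_cutting=("providers",)):
    """Return the (importer, imported) pairs that break the layering rule."""
    return [
        (src, dst)
        for src, dst in edges
        if not is_allowed_import(src, dst, order, cross_cutting)
    ]
```

With an illustrative order such as `["ui", "service", "repo", "types"]`, an edge `("repo", "service")` would be reported because it points upward.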
Reference Table
<!-- TODO: [HUMAN] Update paths after installing skills -->
| Topic | File | When to load |
|---|---|---|
| Architecture + quality grades | ARCHITECTURE.md | Architecture decisions |
| Workflow rules | .claude/skills/harness.core/docs/WORKFLOW_RULES.md | Agent execution, change mgmt |
| Core principles | .claude/skills/harness.core/docs/CORE_PRINCIPLES.md | Harness methodology |
| Golden principles (linter rules) | .claude/skills/harness.core/docs/GOLDEN_PRINCIPLES.md | Writing/modifying linters |
| CI/merge policy | docs/design-docs/ci-enforcement.md | CI or merge config changes |
| harness-planner (skill) | .claude/skills/harness.planner/SKILL.md | ExecPlans for medium/high risk |
| Worktree workflow | .claude/skills/harness.core/docs/WORKTREE_WORKFLOW.md | Boot script issues |
| Observability | docs/PROJECT_OBSERVABILITY.md | Logging or metrics |
| Browser automation | .claude/skills/harness.core/docs/BROWSER_AUTOMATION.md | UI testing |
| Entropy management | .claude/skills/harness.core/docs/ENTROPY_PRINCIPLES.md | Entropy scans |
| Health setpoints | policies/control-loop-metrics.yaml | Health checks |
| Doc drift policy | policies/risk-policy.json | After any code change (step 5) |
| Architecture policy | policies/architecture.yaml | Setting up layers for a new project |
| Example project | .claude/skills/harness.core/example/ | Reference for src/ layout + architecture.yaml |
Subagent Roles
Launch subagents by role name. If .claude/agents/harness/<role>.md exists, it will be used. Otherwise Claude creates a universal agent with that role — both work fine.
| Role | Purpose |
|---|---|
| researcher | Pre-planning codebase research. Facts only, no opinions. |
| reviewer | Independent pre-PR review. Fresh context, read-only. |
| codebase-analyzer | Analyze HOW code works — trace data flow. |
| codebase-locator | Find WHERE code lives — file search by topic. |
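The lookup described above (use the role file if it exists, otherwise fall back to a generic agent) could be sketched as follows. `resolve_role` is a hypothetical helper shown only to make the fallback explicit; how subagents are actually launched is not specified here.

```python
from pathlib import Path


def resolve_role(role, repo_root="."):
    """Return the role definition file if present, else None.

    None signals that no .claude/agents/harness/<role>.md file exists and a
    generic agent with that role name would be used instead.
    """
    path = Path(repo_root) / ".claude" / "agents" / "harness" / f"{role}.md"
    return path if path.is_file() else None
```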
Failure Ledger
When an agent breaks something, fix the harness, not the agent. Add entries: rule:, context:, fix:, enforcement:. Prefer linter/test over documentation. Rewrite "should" as "must".
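A ledger entry with the four fields named above might be modelled as below. This is a minimal sketch: `LedgerEntry` and `harden` are hypothetical names, the storage format of the ledger is not specified in this file, and applying the "should" to "must" rewrite to the `rule` field alone is an assumption.

```python
import re
from dataclasses import asdict, dataclass


def harden(text):
    """Rewrite the word 'should' as 'must', keeping a leading capital."""
    def swap(match):
        return "Must" if match.group(0)[0].isupper() else "must"
    return re.sub(r"\bshould\b", swap, text, flags=re.IGNORECASE)


@dataclass(frozen=True)
class LedgerEntry:
    rule: str
    context: str
    fix: str
    enforcement: str

    def __post_init__(self):
        # Frozen dataclass: normalize the rule text via object.__setattr__.
        object.__setattr__(self, "rule", harden(self.rule))
```

For example, `LedgerEntry(rule="Agents should run make smoke first", context="...", fix="...", enforcement="test")` stores the rule with "must", and `asdict(entry)` yields a plain dictionary that can be serialized.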
Self-Improvement
- Update this file, docs, and scripts as needed. Your convenience is a priority.
- On failure → add ledger entry + update docs. If automatable → write a linter/test instead.