AGENTS.md — Agent Control Panel

Humans steer, agents execute. Read this fully at session start.

Permissions

You own the full repo. You may create, modify, and delete any file — code, scripts, linters, tests, docs, configs, instructions (including this file). Build deterministic tools for everything checkable mechanically. If it can be a linter rule or test — make it one.

Autonomy

| Risk | Action |
| --- | --- |
| Low (docs, tests, lint) | Execute full loop. Commit, PR. |
| Medium (scripts, refactors) | ExecPlan + PR. Wait for approval before merge. |
| High (architecture, security) | ExecPlan only. Do not implement. |

Unsure → medium. Tiers defined in policies/risk-policy.json.
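
A minimal sketch of how the tier-to-action mapping and the "unsure → medium" fallback could be made mechanical. The names `ACTIONS` and `action_for` are hypothetical, and the authoritative tier definitions remain in policies/risk-policy.json, whose schema is not shown here.

```python
from typing import Optional

# Hypothetical mapping that mirrors the Autonomy table. The authoritative
# tier definitions live in policies/risk-policy.json.
ACTIONS = {
    "low": "Execute full loop. Commit, PR.",
    "medium": "ExecPlan + PR. Wait for approval before merge.",
    "high": "ExecPlan only. Do not implement.",
}


def action_for(risk: Optional[str]) -> str:
    """Return the action for a risk tier, treating unknown tiers as medium."""
    tier = (risk or "").strip().lower()
    return ACTIONS.get(tier, ACTIONS["medium"])


print(action_for("low"))   # Execute full loop. Commit, PR.
print(action_for(None))    # ExecPlan + PR. Wait for approval before merge.
```

Falling back to medium rather than raising keeps an unclassified task on the path that still requires human approval before merge.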

Task Loop

1. Boot worktree: `python scripts/harness/worktree_boot.py <task-name>`
2. Validate: `make smoke`. Stop if it fails.
3. Load context: check progress.txt if resuming. Load docs from the Reference Table.
4. Research: for medium/high risk, launch a subagent in the researcher role (facts only, no opinions). Output → docs/exec-plans/active/*-research.md.
5. Implement: small steps. Run `make check` after each change.
6. Doc drift: check policies/risk-policy.json → docsDriftRules. Update matching docs.
7. Pre-PR: `make review`. Fix all failures.
8. Agent review: for medium/high risk, launch a subagent in the reviewer role (fresh context, no shared assumptions).
9. Review loop: respond to feedback until approved.
10. Merge + teardown: merge the PR, remove the worktree.
11. Session end: update progress.txt.
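
The gate steps in the loop above (`make smoke`, `make check`, `make review`) share a stop-on-failure pattern. The following is a sketch of that sequencing only, not the harness's own implementation. `run_make` and `run_gates` are hypothetical helper names, and the runner is injectable so the ordering logic can be exercised without invoking make.

```python
import subprocess
from typing import Callable, Iterable, List


def run_make(target: str) -> int:
    """Run `make <target>` and return its exit code."""
    return subprocess.run(["make", target], check=False).returncode


def run_gates(
    targets: Iterable[str], runner: Callable[[str], int] = run_make
) -> List[str]:
    """Run targets in order and stop at the first failure.

    Returns the targets that passed before the stop.
    """
    passed: List[str] = []
    for target in targets:
        if runner(target) != 0:
            break
        passed.append(target)
    return passed


# Stand-in runner in which `check` fails, so `review` is never reached.
fake_results = {"smoke": 0, "check": 1, "review": 0}
print(run_gates(["smoke", "check", "review"], runner=fake_results.__getitem__))
# prints ['smoke']
```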

Available Tools

| Command | Purpose |
| --- | --- |
| `make smoke` | Fast sanity check (~5s) |
| `make check` | Static checks (lint + source guard) |
| `make structural` | Architecture boundary tests |
| `make test` | Full test suite |
| `make ci` | CI-equivalent local run |
| `make review` | Pre-PR self-review (5 checks) |
| `make entropy` | Entropy scan |
| `make gardener` | Doc gardening check |
| `make build` | Generate handbook |
| `make sync-skills` / `sync-indexes` / `todo-sync` | Sync generators |
| `make obs-up` / `obs-down` | Observability stack |

DO NOT USE

rm -rf on directories → use git clean or targeted removal
Direct DB queries in prod → use Repo layer
curl to external APIs → use Providers layer ApiClient
git push --force to main → regular push or --force-with-lease
pip install in global env → use venv
Bare print() in prod code → use logging (Golden Principle #13)
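
Rules like the last one lend themselves to a deterministic check. As a sketch, assuming nothing about how the repo's actual linters are built, a bare `print()` call can be detected with the standard `ast` module. `find_bare_prints` is a hypothetical name.

```python
import ast
from typing import List


def find_bare_prints(source: str) -> List[int]:
    """Return the line numbers of calls to the plain name `print`."""
    tree = ast.parse(source)
    lines = []
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "print"
        ):
            lines.append(node.lineno)
    return sorted(lines)


sample = '''import logging
logger = logging.getLogger(__name__)
print("debug")
logger.info("request handled")
'''
violations = find_bare_prints(sample)  # [3]: only the print() on line 3
```

Working on the syntax tree rather than on raw text means `logger.info(...)` and the word "print" inside strings or comments are not flagged.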

Core Rules

Layer imports flow downward only. Cross-cutting via Providers only → ARCHITECTURE.md, policies/architecture.yaml
Validate at boundaries. No YOLO-parsing inside layers.
Reuse existing utilities. No duplicates.
No secrets in repo. All other knowledge lives in-repo.
If a doc contradicts code — fix the doc, same commit.
Detailed structured logging everywhere — see Golden Principle #13.
Write deterministic checks for everything verifiable. Reduce judgment, increase automation.
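
The first rule (imports flow downward, cross-cutting via Providers) can be illustrated with a small check. This is a sketch under stated assumptions: the layer names and their order below are hypothetical, same-layer imports are assumed to be allowed, and the real definitions are in policies/architecture.yaml, which is not reproduced here.

```python
from typing import List, Tuple

# Hypothetical layer order, top to bottom. The real layers are defined in
# policies/architecture.yaml.
LAYERS = ["ui", "service", "repo"]
CROSS_CUTTING = {"providers"}  # assumed importable from any layer


def is_allowed(importer: str, imported: str) -> bool:
    """True if `importer` may import `imported` under a downward-only rule."""
    if imported in CROSS_CUTTING:
        return True
    if importer in CROSS_CUTTING:
        # Assumption: cross-cutting code does not reach back into layers.
        return False
    # Same-layer imports are assumed to be allowed, hence <=.
    return LAYERS.index(importer) <= LAYERS.index(imported)


def violations(edges: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return the (importer, imported) pairs that break the rule."""
    return [(a, b) for a, b in edges if not is_allowed(a, b)]


edges = [("ui", "service"), ("repo", "service"), ("service", "providers")]
print(violations(edges))  # [('repo', 'service')]
```

An unknown layer name raises `ValueError` from `list.index`, so a typo in the layer list fails loudly instead of passing silently.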

Reference Table

| Topic | File | When to load |
| --- | --- | --- |
| Architecture + quality grades | `ARCHITECTURE.md` | Architecture decisions |
| Workflow rules | `.claude/skills/harness.core/docs/WORKFLOW_RULES.md` | Agent execution, change mgmt |
| Core principles | `.claude/skills/harness.core/docs/CORE_PRINCIPLES.md` | Harness methodology |
| Golden principles (linter rules) | `.claude/skills/harness.core/docs/GOLDEN_PRINCIPLES.md` | Writing/modifying linters |
| CI/merge policy | `docs/design-docs/ci-enforcement.md` | CI or merge config changes |
| harness-planner (skill) | `.claude/skills/harness.planner/SKILL.md` | ExecPlans for medium/high risk |
| Worktree workflow | `.claude/skills/harness.core/docs/WORKTREE_WORKFLOW.md` | Boot script issues |
| Observability | `docs/PROJECT_OBSERVABILITY.md` | Logging or metrics |
| Browser automation | `.claude/skills/harness.core/docs/BROWSER_AUTOMATION.md` | UI testing |
| Entropy management | `.claude/skills/harness.core/docs/ENTROPY_PRINCIPLES.md` | Entropy scans |
| Health setpoints | `policies/control-loop-metrics.yaml` | Health checks |
| Doc drift policy | `policies/risk-policy.json` | After any code change (step 5) |
| Architecture policy | `policies/architecture.yaml` | Setting up layers for a new project |
| Example project | `.claude/skills/harness.core/example/` | Reference for `src/` layout + `architecture.yaml` |

Subagent Roles

Launch subagents by role name. If `.claude/agents/harness/<role>.md` exists, it is used. Otherwise Claude creates a universal agent with that role. Both work fine.

| Role | Purpose |
| --- | --- |
| researcher | Pre-planning codebase research. Facts only, no opinions. |
| reviewer | Independent pre-PR review. Fresh context, read-only. |
| codebase-analyzer | Analyze HOW code works: trace data flow. |
| codebase-locator | Find WHERE code lives: file search by topic. |

Failure Ledger

When an agent breaks something, fix the harness, not the agent. Add entries: rule:, context:, fix:, enforcement:. Prefer linter/test over documentation. Rewrite "should" as "must".
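
A ledger entry with the four fields named above could be represented as follows. This is a sketch only: the `LedgerEntry` class, the sample values, and the storage format are assumptions, since the source does not say where or how entries are stored. The `"should"` check mirrors the last instruction as a mechanical rule.

```python
import re
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class LedgerEntry:
    rule: str
    context: str
    fix: str
    enforcement: str

    def __post_init__(self) -> None:
        # Mirror 'Rewrite "should" as "must"' as a mechanical check.
        if re.search(r"\bshould\b", self.rule, flags=re.IGNORECASE):
            raise ValueError('rule uses "should"; rewrite it as "must"')


# Hypothetical entry for illustration; not taken from a real ledger.
entry = LedgerEntry(
    rule="Prod code must log through logging, not print().",
    context="An agent left a debug print() in a request handler.",
    fix="Replaced the call with a logger.",
    enforcement="linter",
)
print(asdict(entry)["enforcement"])  # linter
```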

Self-Improvement

Update this file, docs, and scripts as needed. Your convenience is a priority.
On failure → add ledger entry + update docs. If automatable → write a linter/test instead.
