---
description: Three-criteria evaluation framework for autonomous agent deployment
---
# Autonomous Agent Deployment
## Model Requirement

ALL agents MUST use the Opus model. No exceptions, no fallbacks.
## Before EVERY Task: Three-Criteria Evaluation
Evaluate these three criteria before starting any task. If any criterion triggers, spawn the appropriate agent(s).
### Criterion 1: Research
Question: "Do I need to understand before I change?"
Triggers:
- 3+ files need reading to understand the change
- Working in an unfamiliar area of the codebase
- Change has architectural impact (new patterns, cross-feature, schema changes)
- Existing contracts or docs may be affected but you're not sure which ones
Action: Spawn a research subagent to map the affected area first. The subagent returns a summary of: files involved, contracts affected, patterns in use, risks identified.
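The trigger threshold and the summary the subagent returns can be modeled as a small structured record. A minimal Python sketch, where the field names (`files_involved`, `contracts_affected`, etc.) are illustrative assumptions mirroring the list above, not a fixed schema:

```python
from dataclasses import dataclass, field

@dataclass
class ResearchSummary:
    # Illustrative fields mirroring the summary described above;
    # the exact schema is an assumption, not a contract.
    files_involved: list[str] = field(default_factory=list)
    contracts_affected: list[str] = field(default_factory=list)
    patterns_in_use: list[str] = field(default_factory=list)
    risks_identified: list[str] = field(default_factory=list)

def needs_research(files_to_read: list[str]) -> bool:
    # Criterion 1 trigger: 3+ files need reading to understand the change.
    return len(files_to_read) >= 3

print(needs_research(["api.py", "models.py", "schema.sql"]))  # True
```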
### Criterion 2: Parallel Work
Question: "Are there independent pieces?"
Triggers:
- Multiple files need changes that don't depend on each other
- Multiple features or components can be built independently
- Tests, docs, and implementation can proceed in parallel
Action: Spawn 2-3 agents for parallel work. Each agent gets a clear scope:
- Define exact files each agent owns (no overlapping writes)
- Define the interface/contract between parallel work items
- Designate one agent as the integrator if work needs merging
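One way to enforce the no-overlapping-writes rule is to validate the ownership map before spawning anything. A sketch, assuming a hypothetical `partition_ownership` helper and made-up file names:

```python
def partition_ownership(assignments: dict[str, list[str]]) -> dict[str, list[str]]:
    """Reject any scope where two parallel agents own the same file."""
    owner: dict[str, str] = {}
    for agent, files in assignments.items():
        for path in files:
            if path in owner:
                raise ValueError(f"{path} owned by both {owner[path]} and {agent}")
            owner[path] = agent
    return assignments

# Each agent gets an exclusive file set; any overlap fails fast.
scopes = partition_ownership({
    "agent_a": ["feature/search.py", "tests/test_search.py"],
    "agent_b": ["feature/index.py", "docs/index.md"],
})
print(sorted(scopes))  # ['agent_a', 'agent_b']
```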
### Criterion 3: Verification
Question: "Do I need to check my work separately?"
Triggers:
- Changes span 3+ files
- Cross-feature impact (changes in one feature affect another)
- Database schema or migration changes
- Contract or API boundary changes
Action: After implementation, spawn a reviewer subagent that:
- Runs type checking and build verification
- Checks contract/code/doc sync (see contracts-docs rule)
- Validates that no existing patterns were broken
- Reports issues back for fixing before commit
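The reviewer's checklist amounts to running a series of commands and reporting what failed. A minimal sketch; the placeholder commands stand in for the project's real type-check and build steps, which this document does not specify:

```python
import subprocess
import sys

def run_checks(checks: list[tuple[str, list[str]]]) -> list[str]:
    """Run each verification command; return the names of the checks that failed."""
    failures = []
    for name, cmd in checks:
        result = subprocess.run(cmd, capture_output=True)
        if result.returncode != 0:
            failures.append(name)
    return failures

# Placeholders stand in for real commands such as a type checker or build tool.
issues = run_checks([
    ("type check", [sys.executable, "-c", "pass"]),            # simulated pass
    ("build", [sys.executable, "-c", "raise SystemExit(1)"]),  # simulated failure
])
print(issues)  # ['build']
```

Issues found here go back for fixing before commit, per the rule above.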
## When NOT to Use Agents
Skip the framework entirely when:
- Single-file change with clear, contained scope
- Simple bug fix with an obvious solution
- Task requires fewer than 3 tool calls total
- Context usage is already >60% (spawning agents adds context pressure)
- The user explicitly says "just do it" or similar
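These skip conditions are simple enough to express as a boolean gate. A sketch with illustrative parameter names; `context_used` is assumed to be a 0-1 fraction:

```python
def skip_agents(files_changed: int, tool_calls: int,
                context_used: float, user_said_just_do_it: bool) -> bool:
    """True means: do the work directly in the main thread, spawn nothing."""
    return (
        files_changed <= 1          # single-file, contained scope
        or tool_calls < 3           # trivial task
        or context_used > 0.60      # spawning agents adds context pressure
        or user_said_just_do_it     # explicit user instruction
    )

print(skip_agents(5, 12, 0.65, False))  # True: context already past 60%
```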
## Cost Awareness
Agents multiply cost. Use them deliberately, not reflexively.
| Pattern | Cost Multiplier | Justified When |
|---|---|---|
| Research subagent | ~1.5x | Prevents wrong-direction implementation |
| Agent team (2-3) | ~3-4x | Saves wall-clock time on parallel work |
| Reviewer subagent | ~1.5x | Catches cross-file inconsistencies |
| Research + Team + Review | ~6-8x | Large features, multi-file refactors |
The cheapest agent is the one you don't spawn. If you can hold the full picture in your current context, just do the work.
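As a rough back-of-envelope, the combined multiplier in the table's last row is approximately the sum of the individual patterns' multipliers. The arithmetic below is illustrative, not a precise cost model:

```python
# Multipliers taken from the table above.
research, review = 1.5, 1.5
team_low, team_high = 3.0, 4.0    # 2-3 agent team

# Combining patterns adds their costs rather than multiplying them.
combined_low = research + team_low + review
combined_high = research + team_high + review
print(combined_low, combined_high)  # 6.0 7.0, within the ~6-8x range
```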
## Intent Alignment
Before making design decisions, agents MUST check INTENT.md (if it exists) to verify alignment with documented project reasoning. If a task conflicts with documented intent, stop and ask the user before proceeding.
## Agent Communication
- Agents share context through files, not through prompt chaining.
- Research agents write findings to a structured summary, not raw dumps.
- Parallel agents must not write to the same file; partition ownership clearly.
- The main thread is responsible for final integration and committing.
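The file-based handoff in the first two points can be sketched as a research agent writing a JSON summary that the main thread later reads back. The filename and schema are illustrative:

```python
import json
import os
import tempfile

# A research agent writes structured findings, not a raw dump.
findings = {
    "files_involved": ["api.py", "models.py"],
    "risks": ["schema migration touches two features"],
}

path = os.path.join(tempfile.mkdtemp(), "research-summary.json")
with open(path, "w") as f:
    json.dump(findings, f, indent=2)

# Later, the main thread (or another agent) loads the summary for integration.
with open(path) as f:
    loaded = json.load(f)
print(loaded["risks"][0])  # schema migration touches two features
```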