12 Factor Agents Reference
A quick reference for applying 12 Factor Agents principles to multi-agent system design. Based on the 12 Factor Agents framework.
The 12 Factors
| # | Factor | Principle | Primary Skill |
|---|---|---|---|
| 1 | Natural language to tool calls | LLMs transform sentences into structured JSON | agent-specification |
| 2 | Own your prompts | Don't let frameworks abstract away prompts | agent-specification |
| 3 | Own your context building | Everything good is context engineering | coordination-patterns |
| 4 | Tools are structured outputs | Tool-use = JSON + deterministic code | agent-specification |
| 5/6 | Unified state | Execution + business state together | coordination-patterns |
| 7 | Contact humans with tools | HITL as first-class tool pattern | agent-specification |
| 8 | Own your control flow | Don't let LLM control the DAG | coordination-patterns |
| 9 | Compact errors into context | Feed errors back for self-correction | production-readiness |
| 10 | Small focused agents | Focused prompts beat long autonomous runs | mas-decision-gate |
| 11 | Trigger from anywhere | Channel-agnostic agent triggering | production-readiness |
| 12 | Stateless reducers | (state, event) -> new_state | coordination-patterns |
Factor Quick Checks
Use this checklist when designing agents:
Specification Phase (Factors 1, 2, 4, 7)
- F1: Tool call schemas explicitly defined (JSON input/output)
- F2: Prompts visible and version-controlled (not hidden in framework)
- F4: Tools demystified as JSON + code (call, execution, result schemas)
- F7: Human contact tool included if high-stakes decisions involved
Coordination Phase (Factors 3, 5/6, 8, 12)
- F3: Context building explicit and owned (prompt + RAG + memory + history)
- F5/6: Execution and business state unified (launch/pause/resume)
- F8: Control flow owned by code, not LLM (break/switch/summarize/judge)
- F12: State transitions as pure reducers (replay, debug, test)
Decision Phase (Factor 10)
- F10: Verified that single agent can't solve the problem
- F10: Each agent in MAS is small and focused
- F10: Complexity level appropriate (Level 0-4 progression)
Operations Phase (Factors 9, 11)
- F9: Error context manager implemented (spin-out prevention)
- F9: Error counters per tool (max 3 attempts default)
- F11: Trigger interface supports required channels
- F11: Response routing configured per channel
Factor Details
Factor 1: Natural Language to Tool Calls
Principle: LLMs transform natural language into structured tool calls (JSON).
Key insight: The "magic" is in reliable structured output generation. Specify schemas explicitly.
Implementation: Define tool call schemas with exact JSON format for inputs and outputs.
Factor 2: Own Your Prompts
Principle: LLMs are pure functions (tokens in -> tokens out). Don't let frameworks abstract away prompts.
Key insight: The prompt IS the agent specification. Version them together.
Anti-pattern: "Let the framework handle prompts"
Checklist:
- System prompt visible, not hidden in framework
- Prompt changes require code review
- A/B testing capability for prompt variants
Factor 3: Own Your Context Building
Principle: Everything that makes agents good is context engineering.
Context components:
- System prompt (agent identity)
- RAG results (relevant knowledge)
- Memory (past experiences)
- Agentic history (workflow context)
- Task input (current request)
Key insight: If you don't understand what happens at the token level, you miss optimization opportunities.
Factor 4: Tools Are Structured Outputs
Principle: "Tool-Use" is just JSON output + deterministic code execution.
Tool specification includes:
- Call Schema (what agent outputs)
- Execution (what code does)
- Result Schema (what feeds back)
Key insight: Demystify tools. Nothing magical about them.
Factor 5/6: Unified Execution and Business State
Principle: Enable Launch/Pause/Resume with simple APIs.
Unified state includes:
- Execution state: current step, next step, waiting status, retry config
- Business state: messages, tool calls, tool results, decisions made
Benefits: Pause anywhere, resume exactly, debug easily, replay possible.
Factor 7: Contact Humans with Tools
Principle: Human-in-the-loop is a first-class pattern.
When to use:
- High-stakes decisions
- Ambiguous requirements
- Compliance/approval workflows
- Error recovery beyond agent capability
Implementation: request_human_input tool with urgency levels and timeout actions.
Factor 8: Own Your Control Flow
Principle: Don't let the LLM control the entire DAG.
Control flow operations:
- Break: Stop agent loop early
- Switch: Route to different agent
- Summarize: Compress context
- Judge: Evaluate quality
Key insight: Code-controlled DAG beats LLM-controlled DAG. Smaller focused prompts always win.
Factor 9: Compact Errors into Context Window
Principle: Feed errors back into context so agents can self-correct.
Spin-out prevention:
- Max 3 errors per tool
- Max 5 total errors
- Escalate after repeated failures
Key insight: Error counters prevent infinite retry loops.
Factor 10: Small Focused Agents
Principle: Smaller focused prompts with controlled context always beat long autonomous runs.
The progression:
Level 0: Deterministic workflow (no agent)
Level 1: Single focused agent
Level 2: Single agent with tools
Level 3: Minimal MAS (Planner->Executor->Verifier)
Level 4: Full MAS (when justified)
Key insight: Most tasks belong at Level 0-2. Only advance when evidence supports it.
Factor 11: Trigger from Anywhere
Principle: Meet users where they are. Agent triggering should be channel-agnostic.
Supported channels: Slack, Email, CLI, API, Webhook, Dashboard
Key insight: Same workflow, any channel. Response routes back to origin.
Factor 12: Stateless Reducers
Principle: Agent logic as pure functions: (state, event) -> new_state
Benefits:
- Replay: Feed same events, get same state
- Debugging: Inspect state at any point
- Testing: Pure functions are easy to test
- Time travel: Rollback by replaying subset of events
Cross-References
By Factor
| Factor | Primary File | Reference Files |
|---|---|---|
| F1, F2, F4, F7 | agent-specification/SKILL.md | spec-templates.md, common-mistakes.md |
| F3, F5/6, F8, F12 | coordination-patterns/SKILL.md | state-management.md |
| F9, F11 | production-readiness/SKILL.md | ops-runbook.md |
| F10 | mas-decision-gate/SKILL.md | decision-tree.md |
By Topic
| Topic | Factors | Files |
|---|---|---|
| Tool specification | F1, F4 | agent-specification/SKILL.md, common-mistakes.md |
| Prompt engineering | F2 | agent-specification/SKILL.md, common-mistakes.md |
| Context engineering | F3 | coordination-patterns/SKILL.md |
| State management | F5/6, F12 | coordination-patterns/SKILL.md, state-management.md |
| Human-in-the-loop | F7 | agent-specification/SKILL.md, spec-templates.md |
| Control flow | F8 | coordination-patterns/SKILL.md |
| Error handling | F9 | production-readiness/SKILL.md, ops-runbook.md |
| Simplicity | F10 | mas-decision-gate/SKILL.md |
| Multi-channel | F11 | production-readiness/SKILL.md, ops-runbook.md |
Quick Decision Tree
Starting a new agent project?
│
├─ Is it truly non-deterministic? ──No──> Use scripts/workflows (Level 0)
│
├─ Can single agent handle it? ───Yes──> Single agent (Level 1-2)
│ │
│ └─ Does it need tools? ─────Yes──> Single agent + tools (Level 2)
│
├─ Is verification critical? ─────Yes──> Minimal MAS (Level 3)
│ │ Planner -> Executor -> Verifier
│ └─ Do you own control flow? ─No──> Add code-controlled DAG (F8)
│
└─ Multiple domains required? ────Yes──> Full MAS (Level 4)
│ Only with evidence
└─ Apply ALL 12 factors
Further Reading
The 12 Factor Agents principles come from applying software engineering best practices (inspired by the 12-factor app methodology) to AI agent development.
Core insights:
- Demystify: Agents are code + prompts + tools. No magic.
- Own everything: Context, prompts, control flow, state.
- Design for operations: Errors, channels, debugging.
- Start simple: Single agent first, MAS only when justified.
Source: humanlayer/12-factor-agents