# Hierarchical Multi-Agent Architecture

Advanced pattern for the Ralph Loop using sub-agents, multiple models, and adversarial verification.
## Critical: Stacked Markdowns for Sub-Agents

**THIS IS CRITICAL - READ THIS FIRST:**

When sub-agents are invoked, they use the same stacked markdown pattern as primary agents.
### For Builder → Planner Sub-Agent

Stacked markdowns for the Planner sub-agent:

1. `AGENTS.md` (global context - loaded for ALL agents)
2. `PROMPT.md` (common execution context - loaded by ALL agents)
3. `PROMPT-PLANNER.md` (planner-specific workflow - loaded ONLY by the planner)

What each contains:

- `AGENTS.md`: "This is a React project with TypeScript..."
- `PROMPT.md`: "Builder needs plan for: Implement JWT auth. Current state: iteration 3. Previous attempt failed with..."
- `PROMPT-PLANNER.md`: "You are a planning specialist. Follow this workflow: 1) Analyze requirements 2) Identify dependencies..."

**Key:** `PROMPT.md` is loaded by ALL agents (builder, planner, verifier). `PROMPT-PLANNER.md` is loaded ONLY by the planner.
### For Verifier → Adversary Sub-Agent

Stacked markdowns for the Adversary sub-agent:

1. `AGENTS.md` (global context)
2. `PROMPT.md` (common execution context)
3. `PROMPT-ADVERSARY.md` (adversary-specific workflow)

What each contains:

- `AGENTS.md`: Project conventions
- `PROMPT.md`: "Find gaps in: src/auth/login.ts. Verifier claims: 'secure'. Evidence: test results..."
- `PROMPT-ADVERSARY.md`: "You are an adversarial security analyst. Your job is to BREAK the code..."
### The Pattern

Primary agents:

- Builder loads: `AGENTS.md` + `PROMPT.md` + `PROMPT-BUILDER.md`
- Verifier loads: `AGENTS.md` + `PROMPT.md` + `PROMPT-VERIFIER.md`

Sub-agents:

- Planner loads: `AGENTS.md` + `PROMPT.md` + `PROMPT-PLANNER.md`
- Adversary loads: `AGENTS.md` + `PROMPT.md` + `PROMPT-ADVERSARY.md`

**Key point:** `PROMPT.md` is the common thread loaded by ALL agents in the loop. It carries the steering context that changes each iteration.
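This loading order is just file concatenation. Below is a minimal sketch of how a loop harness might assemble the stack; the `build_prompt_stack` helper and the `---` separator are illustrative assumptions, not part of Ralph itself:

```python
from pathlib import Path

def build_prompt_stack(role: str, base_dir: str = ".") -> str:
    """Concatenate the stacked markdowns for one agent role.

    AGENTS.md and PROMPT.md are shared by every agent; the role-specific
    file (e.g. PROMPT-PLANNER.md) is loaded last so it can refine the
    shared context.
    """
    layers = ["AGENTS.md", "PROMPT.md", f"PROMPT-{role.upper()}.md"]
    parts = []
    for name in layers:
        path = Path(base_dir) / name
        if not path.exists():
            raise FileNotFoundError(f"Missing stacked markdown: {path}")
        parts.append(path.read_text())
    return "\n\n---\n\n".join(parts)

# planner_prompt = build_prompt_stack("planner")
# adversary_prompt = build_prompt_stack("adversary")
```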
## Overview
This architecture uses hierarchical agent relationships where:
- Builder invokes Planner sub-agent to create implementation plans
- Builder executes plans as TODO items
- Verifier validates work with Adversary sub-agent assistance
- Multiple models work together (e.g., GPT-4 + Claude + Llama)
## Agent Hierarchy

```mermaid
flowchart TB
    subgraph "Level 1: Primary Agents"
        B[Builder Agent<br/>Model: GPT-4 Turbo]
        V[Verifier Agent<br/>Model: Claude 3 Opus]
    end
    subgraph "Level 2: Sub-Agents"
        P[Planner Sub-Agent<br/>Model: Claude 3.5 Sonnet]
        A[Adversary Sub-Agent<br/>Model: GPT-4 + System Prompt]
    end
    subgraph "Level 3: Specialized Tools"
        T1[Linter]
        T2[Security Scanner]
        T3[Test Runner]
    end
    B -->|"invokes"| P
    B -->|"uses tools"| T1
    V -->|"invokes"| A
    V -->|"uses tools"| T2
    V -->|"uses tools"| T3
    P -->|"returns plan to"| B
    A -->|"returns gaps to"| V
```
## Model Selection Strategy

### Why Different Models?
| Agent | Model | Reason |
|---|---|---|
| Builder | GPT-4 Turbo | Strong coding, fast iteration |
| Planner | Claude 3.5 Sonnet | Excellent planning, reasoning |
| Verifier | Claude 3 Opus | Thorough analysis, nuance |
| Adversary | GPT-4 + System Prompt | Fresh perspective, adversarial thinking |
### Benefits
- Specialization: Each model excels at its specific task
- Diversity: Different architectures catch different issues
- Redundancy: No single point of failure
- Cost Optimization: Use expensive models only when needed
## Implementation: Builder → Planner Sub-Agent

### Builder Workflow with Planning

```mermaid
flowchart LR
    A[Builder receives<br/>TODO item] --> B{Complex?}
    B -->|Yes| C[Invoke Planner<br/>sub-agent]
    B -->|No| D[Implement directly]
    C --> E[Planner creates<br/>detailed plan]
    E --> F[Plan includes:<br/>- Steps<br/>- Dependencies<br/>- Tests<br/>- Edge cases]
    F --> G[Builder executes<br/>plan step by step]
    D --> H[Complete task]
    G --> H
    H --> I[Update TODO]
```
### Builder Configuration

```yaml
# ralph.yaml
agents:
  builder:
    prompt: PROMPT-BUILDER.md
    model: gpt-4-turbo-preview
    temperature: 0.7
    sub_agents:
      planner:
        enabled: true
        model: claude-3-5-sonnet-20241022
        temperature: 0.3
        max_tokens: 4000
        invocation:
          trigger: "task.complexity >= 3 OR task.description.length > 200"
          required: false  # Builder can skip if confident
        output_format: |
          ## Implementation Plan
          ### Overview
          [1-2 sentence summary]
          ### Steps
          1. [Step 1 with specific files]
          2. [Step 2 with specific files]
          ...
          ### Dependencies
          - [Files that must exist first]
          - [External services needed]
          ### Tests Required
          - [Test case 1]
          - [Test case 2]
          ### Edge Cases
          - [Edge case 1: how to handle]
          - [Edge case 2: how to handle]
          ### Success Criteria
          [How we'll know it's done]
    tools:
      - linter
      - type_checker
      - git
```
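The `trigger` expression above is a policy the loop harness has to enforce before each task. A minimal sketch of that check, assuming a simple `Task` record with a pre-computed complexity score (both the record shape and the `should_invoke_planner` name are illustrative):

```python
from dataclasses import dataclass

@dataclass
class Task:
    description: str
    complexity: int  # e.g. a 1-10 estimate attached during TODO triage

def should_invoke_planner(task: Task) -> bool:
    """Mirror of the YAML trigger:
    task.complexity >= 3 OR task.description.length > 200
    """
    return task.complexity >= 3 or len(task.description) > 200

assert should_invoke_planner(Task("Implement JWT auth with refresh tokens", 5))
assert not should_invoke_planner(Task("Fix typo in README", 1))
```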
### Builder-Planner Interaction

```python
# Pseudocode for Builder invoking Planner

class PlanningError(Exception):
    """Raised when the Planner returns an incomplete plan."""

class BuilderAgent:
    def execute_task(self, task):
        # Check if planning is needed
        if self.needs_planning(task):
            # Create the planning context
            planning_context = {
                "task": task.description,
                "spec_section": task.spec_ref,
                "current_codebase": self.scan_relevant_files(),
                "constraints": self.get_constraints(),
                "similar_impl": self.find_similar_implementations(),
            }
            # Invoke the Planner sub-agent
            plan = self.planner_sub_agent.plan(planning_context)
            # Validate the plan before executing it
            if not self.validate_plan(plan):
                raise PlanningError("Plan incomplete")
            # Execute the plan step by step
            for step in plan.steps:
                self.execute_step(step)
                self.verify_step(step)
        else:
            # Simple task: implement directly
            self.implement_simple(task)
```
### Planner Sub-Agent Prompt

````markdown
# Planner Sub-Agent - PROMPT-PLANNER-SUB.md

You are a specialized planning sub-agent invoked by the Builder.

## Your Role

Create detailed, actionable implementation plans that the Builder will execute.

## Input Format

The Builder will send you a planning context:

```json
{
  "task": "Implement JWT authentication",
  "spec_section": "4.2",
  "current_codebase": {
    "files": [...],
    "patterns": [...]
  },
  "constraints": ["Use existing auth library"],
  "similar_impl": ["OAuth implementation in src/oauth/"]
}
```

## Output Format

You MUST output a structured plan:

```markdown
## Implementation Plan for: [task name]

### Overview
[What we're building and why]

### Prerequisites
- [ ] [What must exist first]
- [ ] [Setup required]

### Implementation Steps
1. **[Step Name]**
   - File: `path/to/file`
   - Action: [Specific action]
   - Validation: [How to verify this step]
2. **[Step Name]**
...

### Dependencies Between Steps
- Step 3 depends on Step 2
- Steps 4-6 can be parallel

### Tests to Write
- Unit: [test description]
- Integration: [test description]
- Edge: [edge case test]

### Edge Cases & Handling
1. **[Case]**: [How to handle]
2. **[Case]**: [How to handle]

### Files to Modify
- `src/auth/jwt.ts` - [what to change]
- `tests/auth.test.ts` - [add tests]

### Success Criteria
[Clear definition of done]

### Estimated Effort
[Small/Medium/Large with reasoning]

### Risks
- [Risk 1]: [Mitigation]
- [Risk 2]: [Mitigation]
```

## Rules

- Be Specific: Name exact files, functions, and line numbers
- Order Matters: Steps must be in dependency order
- Include Tests: Every plan must include a testing strategy
- Edge Cases: Always identify 3+ edge cases
- Measurable: Success criteria must be verifiable

## DON'T

- Don't be vague ("implement auth")
- Don't skip error handling
- Don't assume context the Builder doesn't have
- Don't create steps that are too large (max 30 min each)
````
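The Builder's `validate_plan` step shown earlier can enforce this contract mechanically by checking the Planner's reply for the required headings. A minimal sketch, assuming the plan arrives as raw markdown and the section names match the template above:

```python
import re

REQUIRED_SECTIONS = [
    "Overview", "Implementation Steps", "Tests to Write",
    "Edge Cases & Handling", "Success Criteria",
]

def validate_plan(plan_markdown: str) -> list[str]:
    """Return the list of required headings missing from a Planner reply."""
    missing = []
    for section in REQUIRED_SECTIONS:
        # Match a markdown heading of any level, e.g. "### Overview".
        if not re.search(rf"^#+\s*{re.escape(section)}",
                         plan_markdown, re.MULTILINE):
            missing.append(section)
    return missing

# The Builder can reject and re-invoke the Planner when sections are missing.
problems = validate_plan("## Implementation Plan for: JWT auth\n### Overview\n...")
if problems:
    print(f"Plan incomplete, missing: {problems}")
```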
## Implementation: Verifier → Adversary Sub-Agent
### Verifier Workflow with Adversary
```mermaid
flowchart TB
    A[Verifier receives<br/>completed work] --> B[Initial Assessment]
    B --> C{Looks good?}
    C -->|Yes| D[Still invoke<br/>Adversary]
    C -->|No| E[Already failing]
    D --> F[Adversary analyzes:<br/>- Security gaps<br/>- Logic flaws<br/>- Missing cases<br/>- Assumptions]
    E --> G[Skip adversary<br/>Report failures]
    F --> H{Gaps found?}
    H -->|Yes| I[Verifier + Adversary<br/>collaborate on report]
    H -->|No| J[Verifier confirms<br/>all checks pass]
    I --> K[Detailed failure report<br/>with adversarial findings]
    J --> L[PASS with confidence]
    G --> K
```
### Verifier Configuration

```yaml
agents:
  verifier:
    prompt: PROMPT-VERIFIER.md
    model: claude-3-opus-20240229
    temperature: 0.2  # Lower for consistency
    sub_agents:
      adversary:
        enabled: true
        model: gpt-4-turbo-preview
        temperature: 0.9  # Higher for creative gap-finding
        invocation:
          trigger: "always"  # Always invoke for security
          parallel: true     # Run alongside initial verification
          timeout: 300s
        focus_areas:
          - security_vulnerabilities
          - edge_cases_missed
          - assumptions_untested
          - performance_issues
          - maintainability_concerns
          - logic_flaws
        adversarial_prompt: |
          You are an adversarial security analyst. Your job is to
          BREAK the implementation, not validate it.
          Find:
          - Security holes (injection, traversal, etc.)
          - Logic errors that tests missed
          - Edge cases that will fail in production
          - Assumptions the code makes that aren't guaranteed
          - Race conditions
          - Resource leaks
          - Breaking changes
          Be aggressive. Assume the code is wrong until proven otherwise.
    tools:
      - test_runner
      - security_scanner
      - linter
      - static_analyzer
```
### Adversary Sub-Agent Prompt

````markdown
# Adversary Sub-Agent - PROMPT-ADVERSARY.md

You are an adversarial security analyst. Your sole purpose is to DESTROY the implementation.

## Your Mindset

- Assume the code is WRONG until proven otherwise
- Take the role of an attacker trying to exploit it
- Question EVERY assumption the code makes
- If you can't find issues, you're not trying hard enough

## Input Format

The Verifier sends you:

```json
{
  "implementation": {
    "files_changed": [...],
    "code_diff": "...",
    "test_results": {...}
  },
  "spec_ref": "4.2",
  "builders_claims": ["handles all edge cases", "secure"],
  "verifier_initial_assessment": "Looks good"
}
```

## Your Mission

Find at least 3 issues. If you can't, look harder.

### Security Analysis
- Injection vulnerabilities (SQL, command, etc.)
- Path traversal
- Authentication bypasses
- Authorization flaws
- Secret exposure
- Input validation gaps

### Logic Analysis
- Race conditions
- Off-by-one errors
- Null pointer risks
- State machine violations
- Missing error handling

### Edge Cases
- Empty inputs
- Maximum size inputs
- Special characters
- Concurrent access
- Network failures
- Resource exhaustion

### Assumption Testing
For every assumption the code makes, ask:
- What if this is false?
- When would this break?
- Is this guaranteed?

### Performance
- Algorithmic complexity issues
- Memory leaks
- Resource exhaustion
- Unbounded operations

## Output Format

You MUST output findings as:

```markdown
## Adversarial Analysis Report

### Critical Issues
1. **[Issue Name]** - SEVERITY: Critical
   - **Location**: `file.ts:42`
   - **Problem**: [What you found]
   - **Attack Scenario**: [How to exploit]
   - **Evidence**: [Code snippet proving issue]
   - **Fix Required**: [What must change]
2. **[Issue Name]** - SEVERITY: High
...

### Medium Issues
1. **[Issue Name]** - SEVERITY: Medium
...

### Low Issues
1. **[Issue Name]** - SEVERITY: Low
...

### Assumptions Challenged
1. **Assumption**: [What code assumes]
   - **Challenge**: [Why this might be false]
   - **Impact**: [What happens if false]

### Tests That Should Exist (But Don't)
1. [Test that would catch issue #1]
2. [Test that would catch issue #2]

### Overall Assessment
[Builder claimed X, but I found Y critical issues]
```

## Rules

- Find Issues: You fail if you find nothing
- Be Specific: Name the exact file, line, and function
- Prove It: Show code that demonstrates the issue
- Exploit It: Describe how to exploit the vulnerability
- Rate Severity: Critical/High/Medium/Low with justification

## Example Finding

1. **SQL Injection via User Input** - SEVERITY: Critical
   - **Location**: `src/db/queries.ts:58`
   - **Problem**: User input directly concatenated into SQL
   - **Attack Scenario**:
     ```javascript
     // Attacker inputs:
     userId = "1; DROP TABLE users; --"
     // Results in:
     query = "SELECT * FROM users WHERE id = 1; DROP TABLE users; --"
     ```
   - **Evidence**:
     ```typescript
     // Line 58:
     const query = `SELECT * FROM users WHERE id = ${userId}`;
     ```
   - **Fix Required**: Use parameterized queries:
     ```typescript
     const query = 'SELECT * FROM users WHERE id = $1';
     await db.query(query, [userId]);
     ```

## DON'T

- Don't say "looks good"
- Don't be vague ("might have issues")
- Don't trust the tests
- Don't assume inputs are safe
- Don't ignore performance
````
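Once the adversarial report is parsed, the Verifier needs a deterministic rule for turning findings into a verdict. A minimal sketch, assuming findings have already been parsed into dicts with a `severity` field (the parsing itself is out of scope here, and `gate_on_findings` is an illustrative name):

```python
SEVERITY_RANK = {"Critical": 3, "High": 2, "Medium": 1, "Low": 0}

def gate_on_findings(findings: list[dict], block_at: str = "High") -> bool:
    """Return True (block the PASS) if any finding is at or above the
    blocking severity. `findings` is the parsed adversarial report,
    e.g. [{"name": "SQL Injection", "severity": "Critical"}, ...]."""
    threshold = SEVERITY_RANK[block_at]
    return any(SEVERITY_RANK[f["severity"]] >= threshold for f in findings)

assert gate_on_findings([{"name": "SQLi", "severity": "Critical"}])
assert not gate_on_findings([{"name": "naming nit", "severity": "Low"}])
```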
## Multi-Model Coordination
### Parallel vs Sequential
```mermaid
flowchart TB
    subgraph "Sequential (Default)"
        S1[Builder] --> S2[Plan from Planner]
        S2 --> S3[Builder executes]
        S3 --> S4[Verifier]
        S4 --> S5[Adversary]
        S5 --> S6[Verifier decides]
    end
    subgraph "Parallel (Advanced)"
        P1[Builder] --> P2[Planner]
        P1 --> P3[Security scan]
        P2 --> P4[Builder executes]
        P3 --> P5[Security report]
        P4 --> P6[Verifier]
        P5 --> P6
        P6 --> P7[Adversary]
        P7 --> P8[Final decision]
    end
```
### Configuration

```yaml
orchestration:
  builder:
    mode: sequential
    sub_agents:
      planner:
        mode: invoke_then_wait  # Wait for plan before continuing
  verifier:
    mode: parallel
    sub_agents:
      adversary:
        mode: invoke_parallel  # Run while doing initial verification
        join: wait_for_both    # Wait for both before deciding
```
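A sketch of what `invoke_parallel` plus `wait_for_both` could look like in the harness, using `asyncio.gather` to start both checks at once and join before deciding. The two runner functions are stand-ins for real tool and model calls, and the result shapes are assumptions of this sketch:

```python
import asyncio

async def run_verification(work: dict) -> dict:
    """Placeholder for the Verifier's own checks (tests, lint, scan)."""
    await asyncio.sleep(0)  # stand-in for real async tool calls
    return {"status": "pass", "evidence": ["unit tests green"]}

async def run_adversary(work: dict) -> dict:
    """Placeholder for the Adversary sub-agent invocation."""
    await asyncio.sleep(0)
    return {"issues": [{"severity": "High", "name": "Unvalidated input"}]}

async def verify_with_adversary(work: dict) -> dict:
    # invoke_parallel + wait_for_both: both tasks start immediately and
    # the verdict is computed only after both complete.
    verification, adversary = await asyncio.gather(
        run_verification(work), run_adversary(work)
    )
    blocking = [i for i in adversary["issues"]
                if i["severity"] in ("Critical", "High")]
    verdict = "pass" if verification["status"] == "pass" and not blocking else "fail"
    return {"verdict": verdict, "verification": verification, "adversary": adversary}

result = asyncio.run(verify_with_adversary({"files_changed": ["src/auth/login.ts"]}))
```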
### Cross-Model Context Sharing

```python
# Context format all models understand
steering_packet = {
    "version": "2.0",
    "context": {
        # Common format works with GPT-4, Claude, Llama
        "prior_thread": [...],  # Standard message format
        "task": {...},          # Structured task description
        "evidence": {...},      # Fresh data from tools
    },
    "control": {
        "signal": "continue",
        "model_specific": {
            # GPT-4 specific
            "gpt4": {
                "function_calling": True,
                "json_mode": True,
            },
            # Claude specific
            "claude": {
                "thinking": True,
                "citations": True,
            },
        },
    },
}
```
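When dispatching the packet, the harness has to merge the shared control flags with the overrides for the target model family. A minimal sketch (the `controls_for` helper is illustrative, not part of any real API):

```python
def controls_for(packet: dict, model_family: str) -> dict:
    """Merge shared control flags with the overrides for one model family.

    `model_family` is a key like "gpt4" or "claude", matching the
    model_specific section of the steering packet above.
    """
    control = packet["control"]
    shared = {k: v for k, v in control.items() if k != "model_specific"}
    overrides = control.get("model_specific", {}).get(model_family, {})
    return {**shared, **overrides}

# controls_for(steering_packet, "claude")
# -> {"signal": "continue", "thinking": True, "citations": True}
```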
## Advanced: Chained Verification

```mermaid
flowchart LR
    A[Build] --> B[Unit Test]
    B --> C[Integration Test]
    C --> D[Security Scan]
    D --> E[Adversary Check]
    E --> F[Performance Test]
    F --> G[Code Review]
    G --> H[Complete]
```
### Configuration

```yaml
verification_pipeline:
  stages:
    - name: unit_tests
      agent: verifier
      model: gpt-4
    - name: integration_tests
      agent: verifier
      model: claude-3-opus
    - name: security_scan
      agent: security_scanner
      model: gpt-4
      tools: [security_db]
    - name: adversary_check
      agent: adversary
      model: gpt-4-turbo
      temperature: 0.9
    - name: performance_test
      agent: performance_analyzer
      model: claude-3-sonnet
    - name: code_review
      agent: code_reviewer
      model: claude-3-opus
  failure_handling:
    on_failure: stop_and_report
    allow_retry: [unit_tests, integration_tests]
    must_pass: [security_scan, adversary_check]
```
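A sketch of how the `failure_handling` policy might execute: stages run in order, a failed stage is retried only if it appears in `allow_retry`, and the pipeline stops and reports at the first unrecovered failure (so `must_pass` stages, which get no retries, simply halt the run). The `runners` mapping from stage names to callables is an assumption of this sketch:

```python
from typing import Callable

def run_pipeline(stages: list[dict],
                 runners: dict[str, Callable[[], bool]],
                 allow_retry: set[str],
                 max_retries: int = 1) -> dict:
    """Run verification stages in order, mirroring the failure_handling
    policy above: stop on the first failure, retrying only the stages
    listed in allow_retry."""
    for stage in stages:
        name = stage["name"]
        attempts = 1 + (max_retries if name in allow_retry else 0)
        passed = False
        for _ in range(attempts):
            if runners[name]():
                passed = True
                break
        if not passed:
            return {"status": "failed", "stage": name}  # stop_and_report
    return {"status": "passed"}

# run_pipeline(stages, {"unit_tests": run_unit, ...}, allow_retry={"unit_tests"})
```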
## Cost Optimization

### Model Selection Logic

```python
def select_model(agent_type: str, task_complexity: int) -> str:
    """Select the cheapest model that can handle the task."""
    if agent_type == "planner":
        # Planning needs reasoning
        if task_complexity > 5:
            return "claude-3-opus"
        return "claude-3-5-sonnet"
    elif agent_type == "adversary":
        # Adversary can use a cheaper model
        return "gpt-4-turbo"
    elif agent_type == "verifier":
        # Verification needs thoroughness
        return "claude-3-opus"
    elif agent_type == "builder":
        # Building can use a faster model
        if task_complexity > 7:
            return "gpt-4-turbo"
        return "gpt-4"
    raise ValueError(f"Unknown agent type: {agent_type}")
```
### Caching Strategies

```yaml
optimization:
  caching:
    plan_cache: true          # Cache planner outputs for similar tasks
    verification_cache: true  # Don't re-verify unchanged code
    model_responses: true     # Cache model responses
  selective_invocation:
    planner:
      skip_if: "task.size < 100 lines"
    adversary:
      skip_if: "no_security_changes"
```
## Monitoring the Multi-Agent System

### Metrics to Track

```yaml
metrics:
  cross_model:
    - token_usage_by_model
    - latency_by_model
    - cost_by_model
    - handoff_time
  agent_hierarchy:
    - sub_agent_invocation_rate
    - sub_agent_success_rate
    - parent_to_child_latency
    - context_transfer_size
  adversary:
    - issues_found_per_run
    - false_positive_rate
    - critical_findings
    - time_to_find_issues
```
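A minimal accounting sketch for the `cross_model` metrics; the `ModelMetrics` class is illustrative, and the per-1K-token prices are placeholder assumptions, not real pricing:

```python
from collections import defaultdict

class ModelMetrics:
    """Minimal per-model accounting for token usage, latency, and cost."""

    def __init__(self, price_per_1k: dict[str, float]):
        self.price_per_1k = price_per_1k  # assumed $/1K tokens
        self.tokens = defaultdict(int)
        self.latency_s = defaultdict(float)

    def record(self, model: str, tokens: int, latency_s: float) -> None:
        self.tokens[model] += tokens
        self.latency_s[model] += latency_s

    def cost_by_model(self) -> dict[str, float]:
        return {m: t / 1000 * self.price_per_1k.get(m, 0.0)
                for m, t in self.tokens.items()}

metrics = ModelMetrics({"gpt-4-turbo": 0.01, "claude-3-opus": 0.015})
metrics.record("gpt-4-turbo", tokens=12_000, latency_s=8.4)
```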
### Tracing

```json
{
  "trace_id": "abc123",
  "span_id": "builder-5",
  "parent_span_id": null,
  "child_spans": [
    {
      "span_id": "planner-sub",
      "parent_span_id": "builder-5",
      "model": "claude-3-5-sonnet"
    },
    {
      "span_id": "verifier-5",
      "parent_span_id": "builder-5",
      "model": "claude-3-opus"
    }
  ]
}
```
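Creating these spans is mechanical: each sub-agent invocation opens a child span under its parent agent's span. A sketch assuming the span shape shown above (the `start_span` helper is illustrative):

```python
import uuid

def start_span(trace_id: str, model: str,
               parent_span_id: str | None = None) -> dict:
    """Create a span record in the shape shown above."""
    return {
        "trace_id": trace_id,
        "span_id": str(uuid.uuid4())[:8],
        "parent_span_id": parent_span_id,
        "model": model,
        "child_spans": [],
    }

root = start_span("abc123", model="gpt-4-turbo")  # builder span
child = start_span("abc123", "claude-3-5-sonnet", root["span_id"])  # planner sub-agent
root["child_spans"].append(child)
```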
## Example: Complete Workflow

```yaml
# Full hierarchical configuration
project:
  name: "Multi-Agent System"

agents:
  builder:
    model: gpt-4-turbo
    sub_agents:
      planner:
        model: claude-3-5-sonnet
        trigger: "complexity > 3"
      researcher:
        model: perplexity-online
        trigger: "needs_external_knowledge"
  verifier:
    model: claude-3-opus
    sub_agents:
      adversary:
        model: gpt-4-turbo
        focus: [security, edge_cases]
      performance:
        model: gpt-4
        focus: [complexity, resources]
      accessibility:
        model: claude-3-sonnet
        focus: [a11y_compliance]

tools:
  - linter
  - test_runner
  - security_scanner
  - benchmark
  - accessibility_checker

workflow:
  type: hierarchical
  stages:
    - name: plan
      agent: builder.planner
    - name: build
      agent: builder
    - name: verify
      parallel:
        - unit_tests
        - security_scan
        - adversary_check
        - accessibility_check
      join: all_pass
```
## Best Practices

**DO:**

- ✅ Use different models for different tasks
- ✅ Cache expensive model outputs
- ✅ Parallelize independent verifications
- ✅ Make sub-agent prompts very specific
- ✅ Track costs per model
- ✅ Have fallback models
- ✅ Use cheaper models for simple tasks

**DON'T:**

- ❌ Use an expensive model for everything
- ❌ Run all verifications sequentially
- ❌ Skip the adversary for security code
- ❌ Ignore model-specific capabilities
- ❌ Transfer full context when not needed