QAgent - AGENTS.md

AI Agent Guide: This file is the primary reference for AI coding agents working on QAgent. Read this before starting any work.

Project Overview

QAgent is a self-healing QA agent that automatically tests web applications, identifies bugs, applies fixes, and verifies the fixes, all without human intervention. It runs as a closed loop that iterates until all tests pass or the orchestrator's iteration limit is reached.

The QAgent Loop

┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐
│  TESTER  │───▶│  TRIAGE  │───▶│  FIXER   │───▶│ VERIFIER │
│  Agent   │    │  Agent   │    │  Agent   │    │  Agent   │
└──────────┘    └──────────┘    └──────────┘    └──────────┘
     │                                               │
     │           ┌──────────────┐                    │
     │           │    Redis     │◀───────────────────┘
     │           │ (Knowledge   │
     │           │    Base)     │
     │           └──────────────┘
     │                  │
     ▼                  ▼
┌─────────────────────────────────────────────────────────┐
│              W&B Weave (Observability)                  │
└─────────────────────────────────────────────────────────┘

Tester Agent runs E2E tests using Browserbase + Stagehand
Triage Agent diagnoses failures and queries the knowledge base
Fixer Agent generates code patches using LLM + past fix patterns
Verifier Agent applies patches, deploys via Vercel, and re-runs tests
Knowledge Base (Redis) stores successful fixes for future reference
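
The knowledge-base lookup can be pictured as a nearest-neighbor search over embeddings. The sketch below is an in-memory stand-in, not the project's Redis client: `StoredFix`, `cosineSimilarity`, and `findSimilarFixes` are hypothetical names, and plain number arrays stand in for real embeddings. In QAgent the equivalent query would presumably run against Redis vector search.

```typescript
// In-memory illustration of semantic lookup; all names here are hypothetical.
interface StoredFix {
  id: string;
  description: string;
  embedding: number[];
}

// Cosine similarity between two vectors of equal dimension.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Embeddings must have the same dimension');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k stored fixes most similar to the query embedding.
function findSimilarFixes(query: number[], fixes: StoredFix[], k: number): StoredFix[] {
  return [...fixes]
    .sort((x, y) => cosineSimilarity(query, y.embedding) - cosineSimilarity(query, x.embedding))
    .slice(0, k);
}
```

A linear scan like this only suits a handful of entries; a vector index is what makes the same lookup practical at scale.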

Technology Stack

Layer	Technology	Purpose
Frontend	Next.js 14 (App Router), React 18, TypeScript	Demo app and dashboard UI
Styling	Tailwind CSS, Radix UI	Component styling and UI primitives
Browser Automation	Browserbase + Stagehand	AI-powered E2E testing
Deployment	Vercel	Instant deployment after fixes
Vector Memory	Redis Stack (with vector search)	Store failure traces and enable semantic lookup
Observability	W&B Weave	Trace agent runs, log metrics, evaluate improvements
Dashboard	Marimo	Interactive analytics and live visualization
LLM	OpenAI / Google Gemini / Anthropic	Patch generation and diagnosis
Authentication	GitHub OAuth	Dashboard access control
Mobile	React Native (Expo)	Mobile companion app

Project Structure

QAgent/
├── .claude/
│   └── skills/               # Domain-specific knowledge modules
│       ├── browserbase-stagehand/   # Browser automation patterns
│       ├── redis-vectorstore/       # Vector embeddings, semantic search
│       ├── vercel-deployment/       # Programmatic deployments
│       ├── wandb-weave/             # Tracing and evaluation
│       ├── google-adk/              # ADK/A2A integration patterns
│       ├── marimo-dashboards/       # Reactive notebooks
│       └── qagent-agents/           # Agent implementation patterns
├── agents/                   # Agent implementations
│   ├── analyzer/            # Run analysis and summarization
│   ├── crawler/             # Autonomous crawl and discovery flows
│   ├── tester/              # E2E test execution with Stagehand
│   ├── triage/              # Failure diagnosis and root cause analysis
│   ├── fixer/               # LLM-powered patch generation
│   ├── verifier/            # Patch application and deployment
│   ├── orchestrator/        # Workflow coordination (main entry point)
│   └── adk/                 # ADK workflow & agents (planned integration)
├── app/                     # Next.js App Router
│   ├── api/                 # API routes (auth, runs, patches, tests, webhooks)
│   ├── dashboard/           # Dashboard UI pages
│   └── layout.tsx           # App shell and metadata
├── components/              # React components
│   ├── dashboard/           # Dashboard-specific components
│   ├── diagnostics/         # Diagnostic views
│   ├── monitoring/          # Monitoring components
│   ├── onboarding/          # First-run guidance and setup
│   ├── patches/             # Patch management UI
│   ├── runs/                # Run tracking components
│   ├── ui/                  # Shared UI components (shadcn/ui style)
│   └── voice/               # Voice interface components
├── lib/                     # Shared libraries
│   ├── auth/                # Authentication utilities (GitHub OAuth)
│   ├── browserbase/         # Browser automation utilities
│   ├── dashboard/           # Dashboard data helpers
│   ├── git/                 # Local git workflow helpers
│   ├── github/              # GitHub API integration
│   ├── hooks/               # React hooks
│   ├── notifications/       # Toasts and notification helpers
│   ├── providers/           # React providers
│   ├── queue/               # Job queue processing
│   ├── redis/               # Redis vector store client
│   ├── redteam/             # Adversarial testing suite
│   ├── tracetriage/         # Trace analysis and self-improvement
│   ├── utils/               # Shared utilities
│   └── weave/               # W&B Weave logging and tracing
├── mobile/                  # React Native mobile app
├── dashboard/               # Marimo analytics dashboard (app.py)
├── docs/                    # Documentation
│   ├── PRD.md              # Product Requirements Document
│   ├── DESIGN.md           # System design and data structures
│   ├── ARCHITECTURE.md     # Architecture Decision Records
│   ├── DEMO_SCRIPT.md      # 3-minute demo script
│   └── SPONSOR_INTEGRATIONS.md  # Sponsor integration details
├── prompts/                 # Agent prompts
├── scripts/                 # Build/deploy helper scripts
├── tests/
│   ├── e2e/                # E2E test specs and runner
│   └── unit/               # Vitest unit tests
├── middleware.ts           # Next.js auth middleware
└── .env.example            # Environment variable template

Build, Test, and Development Commands

# Install dependencies
pnpm install

# Development server (demo app)
pnpm dev                    # Starts Next.js dev server on localhost:3000

# Agent workflow
pnpm run agent              # Start the QAgent orchestrator

# Testing
pnpm test                   # Run unit tests with Vitest
pnpm run test:e2e           # Execute E2E flows via tests/e2e/runner.ts

# Code quality
pnpm lint                   # Run ESLint + TypeScript type-check
pnpm format                 # Format with Prettier
pnpm format:check           # Check formatting without modifying files

# Production
pnpm build                  # Build for production
pnpm start                  # Start production server

# Dashboard
pnpm dashboard              # Launch Marimo dashboard

# Redis
pnpm redis:init             # Initialize Redis schema

Configuration Files

File	Purpose
`package.json`	pnpm workspace configuration, scripts, dependencies
`tsconfig.json`	TypeScript compiler options (strict mode, path aliases)
`next.config.js`	Next.js configuration (React StrictMode)
`tailwind.config.js`	Tailwind CSS theme, colors, animations
`vitest.config.ts`	Vitest test configuration
`.eslintrc.json`	ESLint rules (extends next/core-web-vitals)
`.prettierrc`	Prettier formatting rules
`middleware.ts`	Next.js auth middleware (GitHub OAuth session validation)

Coding Style & Naming Conventions

Formatter: Prettier is the source of truth
- tabWidth: 2
- singleQuote: true
- semi: true
- trailingComma: es5
- printWidth: 100
TypeScript: Strict mode enabled
- Avoid any unless absolutely justified
- Use explicit return types for public methods
- Prefer interfaces over type aliases for object shapes
Naming:
- PascalCase for components, classes, interfaces
- camelCase for variables, functions, methods
- UPPER_SNAKE_CASE for constants
- kebab-case for file names
File Organization:
- One class per file for agents
- Co-locate related types in lib/types.ts
- Use path aliases (@/) for imports
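
As an illustration of the conventions above, here is a small sketch. The file name, `RetryOptions`, `RetryPolicy`, and `MAX_RETRY_ATTEMPTS` are hypothetical and do not exist in the repository.

```typescript
// Hypothetical file: agents/example/retry-policy.ts (kebab-case file name).
const MAX_RETRY_ATTEMPTS = 3; // UPPER_SNAKE_CASE for constants

// PascalCase interface, preferred over a type alias for an object shape.
interface RetryOptions {
  attempts: number;
  delayMs: number;
}

// PascalCase class; camelCase members; explicit return type on the public method.
class RetryPolicy {
  private readonly options: RetryOptions;

  constructor(options: RetryOptions) {
    this.options = options;
  }

  public shouldRetry(attemptNumber: number): boolean {
    return attemptNumber < Math.min(this.options.attempts, MAX_RETRY_ATTEMPTS);
  }
}
```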

Testing Guidelines

Unit Tests

Location: tests/unit/
Framework: Vitest
Pattern: *.test.ts
Run: pnpm test
Coverage: Configured for agents/**/*.ts and lib/**/*.ts

E2E Tests

Location: tests/e2e/
Test specs: tests/e2e/specs.ts
Runner: tests/e2e/runner.ts
Run: pnpm run test:e2e
Framework: Stagehand (AI-powered browser automation)

Environment Variables

Copy .env.example to .env.local and fill in required values:

Required for Core Functionality

Variable	Description
`BROWSERBASE_API_KEY`	Browserbase API key for browser automation
`BROWSERBASE_PROJECT_ID`	Browserbase project identifier
`OPENAI_API_KEY`	OpenAI API key for LLM patch generation
`REDIS_URL`	Redis connection string (local or Redis Cloud)
`VERCEL_TOKEN`	Vercel API token for deployments
`VERCEL_PROJECT_ID`	Vercel project identifier
`WANDB_API_KEY`	Weights & Biases API key for Weave

Required for Dashboard

Variable	Description
`GITHUB_CLIENT_ID`	GitHub OAuth App client ID
`GITHUB_CLIENT_SECRET`	GitHub OAuth App client secret
`SESSION_SECRET`	Session encryption key (generate with `openssl rand -hex 32`)

Optional

Variable	Description
`ANTHROPIC_API_KEY`	Anthropic API key (backup LLM)
`GOOGLE_API_KEY`	Google API key for Gemini models
`GITHUB_TOKEN`	GitHub token for code operations
`DATABASE_URL`	PostgreSQL connection string
`SLACK_BOT_TOKEN`	Slack notifications
`LINEAR_API_KEY`	Linear issue tracking

Security Note: Never commit .env.local to version control.
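
A startup check can catch missing configuration before an agent run begins. The sketch below uses the variable names from the core table above; the helper `findMissingEnvVars` is hypothetical and not part of the codebase.

```typescript
// Variable names come from the "Required for Core Functionality" table.
const REQUIRED_CORE_VARS = [
  'BROWSERBASE_API_KEY',
  'BROWSERBASE_PROJECT_ID',
  'OPENAI_API_KEY',
  'REDIS_URL',
  'VERCEL_TOKEN',
  'VERCEL_PROJECT_ID',
  'WANDB_API_KEY',
] as const;

// Return the names of required variables that are unset or blank.
function findMissingEnvVars(
  env: Record<string, string | undefined>,
  required: readonly string[] = REQUIRED_CORE_VARS
): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === '';
  });
}
```

Calling `findMissingEnvVars(process.env)` at startup and failing fast on a non-empty result is one way to use it.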

Agent Architecture

Tester Agent (`agents/tester/`)

Executes E2E tests using Stagehand + Browserbase
Captures screenshots, DOM snapshots, console logs on failure
Generates structured FailureReport objects
Instrumented with W&B Weave for observability

Triage Agent (`agents/triage/`)

Classifies failures: UI_BUG, BACKEND_ERROR, DATA_ERROR, TEST_FLAKY, UNKNOWN
Localizes bugs to file/line using error patterns + LLM
Queries Redis for similar past issues
Generates DiagnosisReport with root cause analysis
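
The pattern-based part of classification might look like the sketch below. The five categories come from this guide; the `FailureReportSketch` shape, the rule list, and `classifyFailure` are hypothetical, and the real Triage Agent also uses an LLM, which this example leaves out.

```typescript
// Categories are from this guide; the report shape and rules are hypothetical.
type FailureType = 'UI_BUG' | 'BACKEND_ERROR' | 'DATA_ERROR' | 'TEST_FLAKY' | 'UNKNOWN';

interface FailureReportSketch {
  testName: string;
  errorMessage: string;
  consoleLogs: string[];
}

// Ordered rules: the first matching pattern wins.
const CLASSIFICATION_RULES: Array<{ pattern: RegExp; type: FailureType }> = [
  { pattern: /timed out|timeout/i, type: 'TEST_FLAKY' },
  { pattern: /\b50\d\b|internal server error/i, type: 'BACKEND_ERROR' },
  { pattern: /unexpected token|invalid json|schema/i, type: 'DATA_ERROR' },
  { pattern: /element not found|not visible|selector/i, type: 'UI_BUG' },
];

// Classify a failure from its error message and console output.
function classifyFailure(report: FailureReportSketch): FailureType {
  const text = [report.errorMessage, ...report.consoleLogs].join('\n');
  for (const rule of CLASSIFICATION_RULES) {
    if (rule.pattern.test(text)) return rule.type;
  }
  return 'UNKNOWN'; // Nothing matched; a later step would need to decide.
}
```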

Fixer Agent (`agents/fixer/`)

Generates minimal, targeted code patches
Uses LLM with few-shot examples from knowledge base
Validates patches for safety and syntax
Produces unified diff format
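
A safety check on a unified diff could be sketched as follows. The pattern list mirrors the examples named elsewhere in this guide (eval, exec, rm -rf); `DANGEROUS_PATTERNS` and `findDangerousPatterns` are hypothetical names, and the real validator may apply different or additional rules.

```typescript
// Patterns mirror the examples named in this guide; the function is a hypothetical sketch.
const DANGEROUS_PATTERNS: Array<{ name: string; pattern: RegExp }> = [
  { name: 'eval', pattern: /\beval\s*\(/ },
  { name: 'exec', pattern: /\bexec(Sync)?\s*\(/ },
  { name: 'rm -rf', pattern: /\brm\s+-rf\b/ },
];

// Inspect only lines the patch adds ("+" prefix, excluding the "+++" file header).
function findDangerousPatterns(unifiedDiff: string): string[] {
  const addedLines = unifiedDiff
    .split('\n')
    .filter((line) => line.startsWith('+') && !line.startsWith('+++'));
  const hits = new Set<string>();
  for (const line of addedLines) {
    for (const { name, pattern } of DANGEROUS_PATTERNS) {
      if (pattern.test(line)) hits.add(name);
    }
  }
  return Array.from(hits);
}
```

The check is deliberately conservative: it also flags harmless calls such as `regex.exec(...)`, so a hit is a reason to review the patch rather than proof of harm.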

Verifier Agent (`agents/verifier/`)

Applies patches to filesystem
Creates backups and handles rollback
Validates TypeScript/JSX syntax
Deploys to Vercel and re-runs tests
Records successful fixes in Redis
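
The backup-and-rollback step can be illustrated with a minimal sketch. `applyWithRollback` is a hypothetical helper that replaces a whole file rather than applying a diff, and the `validate` callback stands in for the syntax check; the actual Verifier logic may differ.

```typescript
import { copyFileSync, existsSync, readFileSync, unlinkSync, writeFileSync } from 'node:fs';

// Hypothetical helper: write new content, run a check, restore the backup if the check fails.
function applyWithRollback(
  filePath: string,
  newContent: string,
  validate: (content: string) => boolean
): boolean {
  const backupPath = `${filePath}.bak`;
  copyFileSync(filePath, backupPath); // Keep the original so it can be restored.
  try {
    writeFileSync(filePath, newContent, 'utf8');
    if (!validate(readFileSync(filePath, 'utf8'))) {
      copyFileSync(backupPath, filePath); // Validation failed: roll back.
      return false;
    }
    return true;
  } catch (error) {
    // Unexpected I/O or validator error: restore the original, then rethrow.
    if (existsSync(backupPath)) copyFileSync(backupPath, filePath);
    throw error;
  } finally {
    if (existsSync(backupPath)) unlinkSync(backupPath); // Always clean up the backup.
  }
}
```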

Orchestrator (`agents/orchestrator/`)

Coordinates the full QAgent loop
Handles iteration limits and failure recovery
Logs metrics to Weave
Entry point: pnpm run agent
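
The coordination described above can be sketched as a bounded loop. `LoopStages`, `LoopResult`, `runLoop`, and `maxFixAttempts` are hypothetical names chosen for illustration; the real orchestrator's interfaces, recovery behavior, and Weave logging are not shown.

```typescript
// Hypothetical stage signatures; the real agents live under agents/.
interface LoopStages<Failure, Diagnosis, Patch> {
  runTests: () => Promise<Failure[]>;
  triage: (failure: Failure) => Promise<Diagnosis>;
  fix: (diagnosis: Diagnosis) => Promise<Patch>;
  verify: (patch: Patch) => Promise<boolean>;
}

interface LoopResult {
  passed: boolean;
  fixAttempts: number;
}

// Test, then triage, fix, and verify one failure at a time until tests pass or the limit is hit.
async function runLoop<F, D, P>(
  stages: LoopStages<F, D, P>,
  maxFixAttempts: number
): Promise<LoopResult> {
  let fixAttempts = 0;
  for (;;) {
    const failures = await stages.runTests();
    if (failures.length === 0) return { passed: true, fixAttempts };
    if (fixAttempts >= maxFixAttempts) return { passed: false, fixAttempts };

    // Handle the first failure only, so each patch is verified in isolation.
    const diagnosis = await stages.triage(failures[0]);
    const patch = await stages.fix(diagnosis);
    await stages.verify(patch);
    fixAttempts++;
  }
}
```

Re-running the tests at the top of every pass means the loop never reports success on the strength of a verifier result alone.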

Development Workflow (Ralph Loop)

Follow this iterative workflow for development:

1. Read - Load AGENTS.md, CLAUDE.md, GEMINI.md, and relevant skills
2. Analyze - Understand current phase requirements
3. Plan - Break down into small, testable increments
4. Execute - Implement one increment at a time
5. Validate - Test, lint, verify acceptance criteria
6. Loop - Update documentation as needed, commit, and return to step 1

Security & Safety Guidelines

Always

Keep secrets out of version control
Validate all patches for dangerous patterns (eval, exec, rm -rf)
Use parameterized queries for database access
Sanitize user inputs in RedTeam tests
Verify GitHub webhook signatures
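
For webhook verification, GitHub sends an HMAC-SHA256 of the raw request body in the `X-Hub-Signature-256` header, formatted as `sha256=<hex digest>`. The sketch below checks that header with Node's built-in `crypto` module; the function name and parameters are hypothetical, and how the raw body and secret reach it depends on the route handler.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Compare the received signature header against one computed from the raw body.
// The function name and parameters are hypothetical.
function isValidGitHubSignature(
  rawBody: string,
  signatureHeader: string,
  secret: string
): boolean {
  const digest = createHmac('sha256', secret).update(rawBody, 'utf8').digest('hex');
  const expectedBuf = Buffer.from(`sha256=${digest}`, 'utf8');
  const receivedBuf = Buffer.from(signatureHeader, 'utf8');
  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  if (expectedBuf.length !== receivedBuf.length) return false;
  return timingSafeEqual(expectedBuf, receivedBuf);
}
```

The signature must be computed over the raw body exactly as received; re-serializing parsed JSON can change the bytes and make valid requests fail.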

Never

Hardcode secrets or credentials
Deploy untested patches to production
Ignore Redis lookup results when they are available
Skip Weave logging for agent runs
Commit broken code

Key Files for AI Agents

File	Purpose
`AGENTS.md`	Primary repo guide for coding agents
`CLAUDE.md`	Detailed tech stack, phase roadmap, always/never rules
`GEMINI.md`	Compact project context for Gemini CLI
`lib/types.ts`	All TypeScript interfaces and types
`prompts/ralph-loop.md`	Development workflow prompts
`.claude/skills/`	Domain-specific implementation guides

Dependencies

Production

next - Next.js framework
@browserbasehq/stagehand - AI browser automation
redis - Redis client with vector search
weave - W&B Weave observability
openai - OpenAI SDK
@radix-ui/* - Headless UI components
framer-motion - Animations
recharts - Charts for dashboard
lucide-react - Icons

Development

vitest - Unit testing
typescript - Type checking
eslint - Linting
prettier - Formatting
tsx - TypeScript execution

Troubleshooting

Common Issues

Stagehand initialization fails

Verify BROWSERBASE_API_KEY and BROWSERBASE_PROJECT_ID
Check Browserbase dashboard for session limits

Redis connection errors

For local: ensure Redis Stack is running (redis-stack-server); a plain redis-server may lack the vector search module
For cloud: verify REDIS_URL format

TypeScript errors after patch

Fixer Agent may generate type-incorrect code
Patch validation blocks syntax errors but lets type errors through
Run pnpm lint to surface the remaining type errors

Vercel deployment fails

Verify VERCEL_TOKEN and VERCEL_PROJECT_ID
Check that the git working directory is clean

References

QAgent Paper - Five-step agentic patching framework
Stagehand Docs - AI-powered browser automation
Browserbase Docs - Cloud browser infrastructure
Redis Vector Search - Semantic similarity
W&B Weave - LLM observability
Marimo - Reactive Python notebooks

Last updated: March 2026
