name: agent-eyes
description: Visual context analyzer for AI agents. Provides screenshots, accessibility scans, DOM snapshots, and element descriptions for web pages. Use when you need to see what a web page looks like, analyze accessibility issues, inspect DOM structure, or get detailed element information. Triggers on requests like "take a screenshot", "check accessibility", "what does this page look like", "analyze the UI", "inspect this element", or any visual/UI analysis task.
Agent Eyes
Visual context analyzer for web pages. Provides AI agents with the ability to "see" web applications through screenshots, accessibility scans, DOM snapshots, and element descriptions.
Prerequisites
- Python 3.10+
- uv package manager (recommended)
- Playwright browsers installed:
playwright install chromium
Compact Mode (Token-Efficient Output)
All commands support the --compact / -c flag for token-efficient output:
| Mode | Screenshot | DOM | A11y | Total Tokens |
|---|---|---|---|---|
| Standard | Base64 inline | depth=5, 20 children | Full violations | ~500K+ |
| Compact | File path only | depth=3, 10 children | Summary only | ~3-5K |
Use compact mode when context window size is a concern (which is most of the time).
# Compact context - reduces ~500K tokens to ~3-5K tokens
uv run $SKILL_DIR/agent_eyes.py context http://localhost:3000 --compact
# Compact screenshot - always saves to file, never returns base64
uv run $SKILL_DIR/agent_eyes.py screenshot http://localhost:3000 --compact
# Compact a11y - returns summary + top N issues only
uv run $SKILL_DIR/agent_eyes.py a11y http://localhost:3000 --compact
# Compact DOM - stricter limits on depth and children
uv run $SKILL_DIR/agent_eyes.py dom http://localhost:3000 --compact
Commands
All commands use uv run for automatic dependency management:
SKILL_DIR=".claude/skills/agent-eyes/scripts"
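When the tool is driven from a Python agent rather than a shell, the same command lines can be assembled as argument lists. The sketch below is only an illustration: `build_command` is a hypothetical helper, not part of agent_eyes.py, and it uses only the subcommands and flags listed in this document.

```python
import shlex

# Path copied from the SKILL_DIR line above; adjust if the skill lives elsewhere.
SKILL_DIR = ".claude/skills/agent-eyes/scripts"


def build_command(subcommand, url, *, compact=True, selector=None, extra=()):
    """Return the argv list for one agent_eyes.py call (hypothetical helper)."""
    argv = ["uv", "run", f"{SKILL_DIR}/agent_eyes.py", subcommand, url]
    if selector is not None:
        argv += ["--selector", selector]
    if compact:
        argv.append("--compact")
    argv += list(extra)  # e.g. ["--max-issues", "5"]
    return argv


cmd = build_command(
    "a11y", "http://localhost:3000", selector="main", extra=["--max-issues", "5"]
)
print(shlex.join(cmd))  # shell-safe rendering of the argument list
```

Passing the list directly to `subprocess.run` avoids shell quoting problems with selectors such as `.hero > a`.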
Screenshot
Capture full page or element screenshots:
# Full page screenshot (saves to .canvas/screenshots/)
uv run $SKILL_DIR/agent_eyes.py screenshot http://localhost:3000
# Element screenshot
uv run $SKILL_DIR/agent_eyes.py screenshot http://localhost:3000 --selector ".hero"
# Save to specific path
uv run $SKILL_DIR/agent_eyes.py screenshot http://localhost:3000 --output ./tmp/page.png
# Get as base64 (for inline context) - NOT recommended, use --compact instead
uv run $SKILL_DIR/agent_eyes.py screenshot http://localhost:3000 --base64
# RECOMMENDED: Compact mode - always saves to file, never returns base64
uv run $SKILL_DIR/agent_eyes.py screenshot http://localhost:3000 --compact
Accessibility Scan
Run axe-core accessibility analysis:
# Full page scan (WCAG 2.1 AA)
uv run $SKILL_DIR/agent_eyes.py a11y http://localhost:3000
# Scoped to element
uv run $SKILL_DIR/agent_eyes.py a11y http://localhost:3000 --selector "main"
# WCAG AAA level
uv run $SKILL_DIR/agent_eyes.py a11y http://localhost:3000 --level AAA
# RECOMMENDED: Compact mode - summary + top issues only (~1-2K tokens vs 100K+)
uv run $SKILL_DIR/agent_eyes.py a11y http://localhost:3000 --compact
uv run $SKILL_DIR/agent_eyes.py a11y http://localhost:3000 --compact --max-issues 5
DOM Snapshot
Get simplified DOM tree:
# Full page DOM
uv run $SKILL_DIR/agent_eyes.py dom http://localhost:3000
# Subtree only
uv run $SKILL_DIR/agent_eyes.py dom http://localhost:3000 --selector ".content"
# Control depth and children
uv run $SKILL_DIR/agent_eyes.py dom http://localhost:3000 --depth 3 --max-children 10
# RECOMMENDED: Compact mode - depth=3, max-children=10, text=50 chars
uv run $SKILL_DIR/agent_eyes.py dom http://localhost:3000 --compact
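To make the depth, max-children, and text limits concrete, here is a minimal sketch of how such pruning could work on a nested DOM dictionary. It is not the tool's actual implementation: the `text` and `truncated_children` keys are assumptions for illustration, and only `tag` and `children` appear in the documented output.

```python
def truncate_dom(node, depth=3, max_children=10, max_text=50):
    """Hypothetical sketch: prune a nested DOM dict to compact-mode limits."""
    pruned = {"tag": node["tag"]}
    text = node.get("text", "")
    if text:
        pruned["text"] = text[:max_text]  # cap text length
    children = node.get("children", [])
    if depth > 0 and children:
        # Keep at most max_children children and recurse one level deeper.
        pruned["children"] = [
            truncate_dom(child, depth - 1, max_children, max_text)
            for child in children[:max_children]
        ]
    elif children:
        # Depth budget exhausted: record how many children were dropped.
        pruned["truncated_children"] = len(children)
    return pruned


# Twelve sibling divs, each with long text and one nested span.
tree = {
    "tag": "body",
    "children": [
        {"tag": "div", "text": "x" * 80, "children": [{"tag": "span"}]}
        for _ in range(12)
    ],
}
compact_tree = truncate_dom(tree, depth=1, max_children=10)
print(len(compact_tree["children"]), len(compact_tree["children"][0]["text"]))
```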
Describe Element
Get detailed element information (styles, bounding box, attributes):
uv run $SKILL_DIR/agent_eyes.py describe http://localhost:3000 --selector ".hero-button"
Full Context
Get comprehensive context bundle (screenshot + a11y + DOM + description):
# Full context for page
uv run $SKILL_DIR/agent_eyes.py context http://localhost:3000
# Focused on element
uv run $SKILL_DIR/agent_eyes.py context http://localhost:3000 --selector ".hero"
# Without screenshot (smaller output)
uv run $SKILL_DIR/agent_eyes.py context http://localhost:3000 --no-screenshot
# RECOMMENDED: Compact mode - file paths only, limited DOM/a11y (~3-5K tokens)
uv run $SKILL_DIR/agent_eyes.py context http://localhost:3000 --compact
# Compact with custom limits
uv run $SKILL_DIR/agent_eyes.py context http://localhost:3000 --compact \
--dom-depth 2 --max-children 5 --max-issues 5
Output Format
All commands return JSON to stdout:
{
"ok": true,
"...": "command-specific fields"
}
On error:
{
"ok": false,
"error": "Error description"
}
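Because every command reports success through the `ok` field, a caller can check it before reading anything else. The following is a minimal sketch, assuming the tool prints exactly one JSON object to stdout; `AgentEyesError`, `parse_result`, and `run` are hypothetical names introduced here.

```python
import json
import subprocess


class AgentEyesError(RuntimeError):
    """Raised when a command reports ok=false (hypothetical wrapper type)."""


def parse_result(stdout):
    """Decode the JSON printed by a command and fail loudly on errors."""
    payload = json.loads(stdout)
    if not payload.get("ok"):
        raise AgentEyesError(payload.get("error", "unknown error"))
    return payload


def run(argv):
    """Run one command (argv as a list) and return its parsed JSON payload."""
    proc = subprocess.run(argv, capture_output=True, text=True, check=False)
    return parse_result(proc.stdout)


# Parsing works on any captured output, so it can be tried without a browser.
result = parse_result('{"ok": true, "title": "My App"}')
print(result["title"])
```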
Compact Mode Output Examples
Compact context output (~3-5K tokens instead of ~500K):
{
"ok": true,
"url": "http://localhost:3000",
"title": "My App",
"timestamp": "2026-01-22T10-30-00-000Z",
"compact": true,
"screenshot_path": ".canvas/screenshots/2026-01-22T10-30-00-000Z.png",
"screenshot_size": 443281,
"dom": {
"tag": "body",
"children": [...]
},
"a11y_summary": {
"total_violations": 5,
"by_severity": {"critical": 1, "serious": 2, "moderate": 2, "minor": 0},
"top_issues": [
{"id": "color-contrast", "impact": "serious", "affected_count": 3}
]
}
}
Compact a11y output (~1-2K tokens instead of ~100K):
{
"ok": true,
"total_violations": 15,
"by_severity": {"critical": 2, "serious": 5, "moderate": 6, "minor": 2},
"by_category": {"color": 3, "aria": 5, "keyboard": 2},
"top_issues": [
{
"id": "color-contrast",
"impact": "serious",
"description": "Elements must have sufficient color contrast...",
"affected_count": 3,
"help_url": "https://dequeuniversity.com/rules/axe/..."
}
],
"passes": 42,
"incomplete": 3
}
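The `by_severity` counts are enough to drive a simple pass/fail gate, for example in a CI job. This is a sketch under the assumption that the compact a11y output has the shape shown above; `blocking_count` and the severity ordering list are illustrative, not part of the tool.

```python
# axe-core impact levels, ordered from least to most severe.
SEVERITY_ORDER = ["minor", "moderate", "serious", "critical"]


def blocking_count(report, min_severity="serious"):
    """Count violations at or above min_severity in a compact a11y report."""
    threshold = SEVERITY_ORDER.index(min_severity)
    by_severity = report.get("by_severity", {})
    return sum(
        count
        for level, count in by_severity.items()
        if level in SEVERITY_ORDER and SEVERITY_ORDER.index(level) >= threshold
    )


# Counts copied from the compact a11y example above.
report = {
    "ok": True,
    "total_violations": 15,
    "by_severity": {"critical": 2, "serious": 5, "moderate": 6, "minor": 2},
}
print(blocking_count(report))              # serious + critical
print(blocking_count(report, "moderate"))  # moderate and above
```

For compact context output, pass the nested `a11y_summary` object instead of the top-level payload.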
Typical Agent Workflow
- Start dev server (if not running):
  npm run dev &
- Take initial screenshot to see current state:
  uv run $SKILL_DIR/agent_eyes.py screenshot http://localhost:3000
- Run accessibility scan to find issues:
  uv run $SKILL_DIR/agent_eyes.py a11y http://localhost:3000
- Inspect specific element for details:
  uv run $SKILL_DIR/agent_eyes.py describe http://localhost:3000 --selector ".problematic-button"
- Get full context for comprehensive analysis:
  uv run $SKILL_DIR/agent_eyes.py context http://localhost:3000 --selector ".hero"
Example: Analyze and Fix A11y Issues
# 1. Get accessibility violations
uv run $SKILL_DIR/agent_eyes.py a11y http://localhost:3000
# Output shows violations like:
# {
# "ok": true,
# "violations": [
# {
# "id": "color-contrast",
# "impact": "serious",
# "description": "Elements must have sufficient color contrast",
# "nodes": [{"html": "<button class='cta'>..."}]
# }
# ]
# }
# 2. Describe the element to understand current styles
uv run $SKILL_DIR/agent_eyes.py describe http://localhost:3000 --selector ".cta"
# 3. Make code changes to fix the contrast issue
# 4. Re-run a11y to verify fix
uv run $SKILL_DIR/agent_eyes.py a11y http://localhost:3000
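Between steps 1 and 2 it helps to flatten the standard (non-compact) output into one row per affected node, so each offending element can be passed to `describe` in turn. A minimal sketch, assuming the `violations` shape shown in the example output above; `flatten_violations` is a hypothetical helper.

```python
def flatten_violations(report):
    """Return (impact, rule id, html snippet) for every affected node."""
    rows = []
    for violation in report.get("violations", []):
        for node in violation.get("nodes", []):
            rows.append((violation["impact"], violation["id"], node.get("html", "")))
    return rows


# Sample payload mirroring the example output above.
sample = {
    "ok": True,
    "violations": [
        {
            "id": "color-contrast",
            "impact": "serious",
            "description": "Elements must have sufficient color contrast",
            "nodes": [{"html": "<button class='cta'>..."}],
        }
    ],
}
for impact, rule_id, html in flatten_violations(sample):
    print(f"[{impact}] {rule_id}: {html}")
```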
Notes
- Screenshots are saved to .canvas/screenshots/ by default with ISO timestamps
- The tool runs headless Chromium via Playwright
- All commands wait for networkidle before capturing
- DOM snapshots are simplified to reduce output size
- A11y scans use axe-core, the industry standard accessibility testing engine
Token Budget Guide
| Operation | Standard Mode | Compact Mode |
|---|---|---|
| Screenshot | ~100-470K tokens (base64) | ~50 tokens (path only) |
| DOM Snapshot | ~50-150K tokens | ~2-3K tokens |
| A11y Scan | ~50-100K tokens | ~0.5-1K tokens |
| Full Context | ~500K+ tokens | ~3-5K tokens |
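Recommendation: Always use the --compact flag unless you specifically need base64 data for inline image processing. Compact mode cuts token usage by roughly 99% (about 500K down to 3-5K tokens for a full context) by returning file paths, a shallower DOM, and an a11y summary instead of the full payloads.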
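The 99% figure follows directly from the Full Context row of the table. As a quick check, using the table's approximate values (500K tokens standard, 3-5K compact):

```python
def reduction_pct(standard_tokens, compact_tokens):
    """Percentage of tokens saved by switching from standard to compact."""
    return 100 * (1 - compact_tokens / standard_tokens)


# Approximate Full Context figures from the table above.
STANDARD = 500_000
for compact in (3_000, 5_000):
    print(f"{compact} tokens: {reduction_pct(STANDARD, compact):.1f}% smaller")
```

Both ends of the compact range land at about 99% or slightly above.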
When to Use Each Mode
| Mode | Use Case |
|---|---|
| Standard | Debugging, cases that need full HTML snippets, or feeding the screenshot to a vision model |
| Compact | Most agent workflows, design reviews, accessibility audits, CI/CD pipelines |