name: pokemon-agent
description: "Play turn-based RPGs autonomously via Game Boy emulation. Use when the user asks to 'play pokemon', 'emulate a game boy game', 'automate pokemon battles', 'grind pokemon', 'run a pokemon nuzlocke', 'play an RPG for me', or mentions headless emulation, PyBoy, or turn-based game automation."
version: 0.1.0
metadata: { "openclaw": { "emoji": "🎮", "requires": { "bins": ["python3"], "env": [] }, "install": [{ "id": "pip", "kind": "node", "label": "Install PyBoy + dependencies (pip)" }] } }
Pokemon Agent
Autonomous turn-based RPG player using headless Game Boy emulation via PyBoy.
Overview
This skill runs a Game Boy / Game Boy Color ROM headlessly using PyBoy's Python API. The agent reads game state from emulator memory, makes strategic decisions (via LLM or heuristics), and sends button inputs back. No display server required — runs fully headless inside a terminal, container, or stereOS VM.
Requirements
- Python 3.10+
- PyBoy (`pip install pyboy`)
- A legally obtained ROM file (`.gb` or `.gbc`)
- Optional: `Pillow` for screenshot capture and frame export
Setup
Run the install script to set up the Python environment:
cd {baseDir}
bash scripts/install.sh
Place your ROM file in the skill directory or provide an absolute path when starting.
Usage
Start a game session
Play Pokemon Red for me
Start a new pokemon playthrough
Grind my team on Route 3
Battle automation
Auto-battle wild encounters
Use the best move against the current opponent
Grind until my starter reaches level 16
Navigation
Walk to Pewter City
Navigate to the next gym
Find and heal at the nearest Pokemon Center
How It Works
Game Loop
- Boot: Launch PyBoy in headless mode (`window="null"`)
- Read state: Extract game data from known memory addresses
- Decide: Choose action based on current context (battle, overworld, menu)
- Act: Send button inputs to the emulator
- Advance: Tick the emulator forward, wait for state change
- Repeat
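The loop above can be sketched in a few lines. This is a minimal illustration, not the code in `scripts/agent.py`: `read_state`, `decide`, `act`, and `run` are hypothetical names, and the policy in `decide` is a placeholder for the LLM or heuristic step. The emulator is passed in as any object that offers `memory[...]`, `tick()`, `button_press()`, and `button_release()`.

```python
BATTLE_TYPE_ADDR = 0xD057  # battle type: 0 = none, 1 = wild, 2 = trainer


def read_state(emu):
    """Read state: pull the fields the decision step needs from memory."""
    return {"in_battle": emu.memory[BATTLE_TYPE_ADDR] != 0}


def decide(state):
    """Decide: placeholder policy; a real agent would use an LLM or heuristics."""
    return "a" if state["in_battle"] else "up"


def act(emu, button, hold_frames=10):
    """Act and advance: hold the button, release it, then let one frame pass."""
    emu.button_press(button)
    for _ in range(hold_frames):
        emu.tick()
    emu.button_release(button)
    emu.tick()


def run(emu, max_turns):
    """Repeat read -> decide -> act for a fixed number of turns."""
    history = []
    for _ in range(max_turns):
        button = decide(read_state(emu))
        act(emu, button)
        history.append(button)
    return history
```

A PyBoy 2.x instance created with `PyBoy(rom_path, window="null")` exposes the same four members, so it can be passed as `emu`. Keeping the loop independent of the emulator class also makes it easy to exercise with a stub in unit tests.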
Memory Map (Pokemon Red/Blue)
The agent reads game state from these memory addresses:
| Address | Data |
|---|---|
| 0xD057 | Battle type (0 = none, 1 = wild, 2 = trainer) |
| 0xCFE6 | Enemy current HP (high byte) |
| 0xCFE7 | Enemy current HP (low byte) |
| 0xCFF4 | Enemy max HP (high byte) |
| 0xCFF5 | Enemy max HP (low byte) |
| 0xD015 | Player lead Pokemon current HP (high byte) |
| 0xD016 | Player lead Pokemon current HP (low byte) |
| 0xD022 | Player lead Pokemon level |
| 0xD163 | Number of Pokemon in party |
| 0xD01C | Player Pokemon move 1 ID |
| 0xD01D | Player Pokemon move 2 ID |
| 0xD01E | Player Pokemon move 3 ID |
| 0xD01F | Player Pokemon move 4 ID |
| 0xD02D | Player Pokemon move 1 PP |
| 0xD02E | Player Pokemon move 2 PP |
| 0xD35E | Current map ID |
| 0xD362 | Player X position |
| 0xD361 | Player Y position |
| 0xD356 | Badges obtained (one bit per badge; count the set bits) |
| 0xFF44 | Current scanline (values 144-153 are vblank; use to detect vblank for frame sync) |
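HP values occupy two bytes, with the high byte at the lower address. A minimal sketch of combining them follows; `read_u16_be` and `snapshot` are hypothetical helper names, and `memory` is assumed to be any object indexable by address (PyBoy's `pyboy.memory` behaves this way).

```python
PLAYER_HP_HIGH = 0xD015  # high byte; the low byte follows at 0xD016
PARTY_COUNT = 0xD163
MAP_ID = 0xD35E


def read_u16_be(memory, high_addr):
    """Combine two consecutive bytes (high byte first) into one integer."""
    return (memory[high_addr] << 8) | memory[high_addr + 1]


def snapshot(memory):
    """Collect a few fields from the memory map into a plain dict."""
    return {
        "player_hp": read_u16_be(memory, PLAYER_HP_HIGH),
        "party_count": memory[PARTY_COUNT],
        "map_id": memory[MAP_ID],
    }
```

Returning a plain dict keeps the reader easy to log and easy to hand to an LLM prompt as JSON.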
Battle Strategy
When in battle (0xD057 != 0):
- Read enemy HP and player HP
- If player HP < 20% of max → use the best healing item
- If any move has PP left → pick the highest-power usable move, weighted by type advantage
- If a usable move is super effective → always prefer it
- If no move has PP remaining → use Struggle (the game selects it automatically)
- If the active Pokemon fainted → switch to the next alive Pokemon
- If all fainted → navigate to Pokemon Center after whiteout
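The move-selection rules above can be sketched as a pure function. This is an illustration under assumptions: `Move`, `effectiveness`, and `choose_action` are hypothetical names, and the type chart is assumed to be a nested mapping of attacking type to defending type to multiplier, which may not match the actual layout of `references/type_chart.json`. Switching and whiteout handling are left out.

```python
from dataclasses import dataclass


@dataclass
class Move:
    name: str
    power: int
    move_type: str
    pp: int


def effectiveness(type_chart, move_type, defender_types):
    """Multiply the chart entries for each defender type (missing = neutral)."""
    mult = 1.0
    for t in defender_types:
        mult *= type_chart.get(move_type, {}).get(t, 1.0)
    return mult


def choose_action(hp, max_hp, moves, defender_types, type_chart, has_potion):
    """Return ("item", None), ("move", index), or ("struggle", None)."""
    if has_potion and max_hp > 0 and hp / max_hp < 0.20:
        return ("item", None)
    usable = [(i, m) for i, m in enumerate(moves) if m.pp > 0]
    if not usable:
        return ("struggle", None)

    def score(pair):
        eff = effectiveness(type_chart, pair[1].move_type, defender_types)
        # Super-effective moves sort first; ties break on expected damage.
        return (eff > 1.0, pair[1].power * eff)

    best_index, _ = max(usable, key=score)
    return ("move", best_index)
```

Sorting on the tuple `(is_super_effective, power * multiplier)` is one way to honor the "always prefer super effective" rule while still ranking the remaining moves by expected damage.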
Overworld Navigation
When not in battle (0xD057 == 0):
- Read current map ID and position
- Follow a predefined route plan (stored in `references/routes.json`)
- Move toward objective using cardinal directions
- If grass tile → expect random encounter, ensure party is healthy
- If at Pokemon Center → heal if any party member below 50%
- If at objective → execute next story beat
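Two of the steps above, moving toward a waypoint and deciding when to heal, can be sketched as small helpers. `next_direction` and `should_heal` are hypothetical names; the sketch assumes y grows southward (so `"up"` decreases y) and ignores collisions, which a pathfinder such as `scripts/pathfinding.py` would handle.

```python
def next_direction(pos, target):
    """Pick one cardinal step toward target, or None when already there.

    Resolves the larger axis gap first; does not check for obstacles.
    """
    dx = target[0] - pos[0]
    dy = target[1] - pos[1]
    if dx == 0 and dy == 0:
        return None
    if abs(dx) >= abs(dy):
        return "right" if dx > 0 else "left"
    return "down" if dy > 0 else "up"


def should_heal(party_hp, threshold=0.5):
    """True if any (current, max) HP pair in the party is below the threshold."""
    return any(mx > 0 and cur / mx < threshold for cur, mx in party_hp)
```

For example, `next_direction((5, 33), (5, 29))` yields `"up"`, which matches walking north from the Pallet Town side of Route 1.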
Input Mapping
Send inputs via PyBoy's button API:
# Button press: hold for N frames, then release
def press_button(pyboy, button, frames=10):
    pyboy.button_press(button)  # press and hold (pyboy.button() would auto-release after one tick)
    for _ in range(frames):
        pyboy.tick()
    pyboy.button_release(button)  # release
    pyboy.tick()

# Available buttons: "a", "b", "start", "select", "up", "down", "left", "right"
Menu Navigation
Menus require sequenced button presses with frame delays:
- Select FIGHT in battle: Press `"a"` → wait 30 frames → cursor is on FIGHT
- Move cursor down: Press `"down"` → wait 10 frames
- Confirm selection: Press `"a"` → wait 30 frames
- Cancel / go back: Press `"b"` → wait 20 frames
- Open start menu: Press `"start"` → wait 30 frames
Screenshot Capture
To capture the current frame for LLM vision analysis:
from PIL import Image
screen = pyboy.screen.ndarray # numpy array of current frame
img = Image.fromarray(screen)
img.save("current_frame.png")
Verification
Before deploying or after making changes, verify the agent works end-to-end:
# Unit tests (205 tests, 100% coverage required)
uv run pytest
# Live integration: run 1000 turns, confirm Pokemon selected
PYTHONPATH=scripts .venv/bin/python scripts/agent.py "rom/Pokemon - Red Version (USA, Europe) (SGB Enhanced).gb" --max-turns 1000
What to look for in live output:
- `Party: 1` appears around turn 100: starter Pokemon selected
- `Battle ended. Total wins: 1`: rival battle won
- `MAP CHANGE | 40 -> 0`: exited Oak's Lab to Pallet Town
- Output streams in real time with `[HH:MM:SS]` timestamps
A healthy run navigates: Red's bedroom (map 38) → house 1F (map 37) → Pallet Town (map 0) → Oak trigger → Oak's Lab (map 40) → pick starter → fight rival → exit.
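The milestone check can be automated by scanning captured output for the strings listed above. The sketch below is hypothetical: `check_run` is not part of the repository, and it only assumes that the quoted fragments appear somewhere in a line and that each line starts with an `[HH:MM:SS]` timestamp.

```python
import re

MILESTONES = {
    "starter_selected": re.compile(r"Party: 1\b"),
    "rival_battle_won": re.compile(r"Battle ended\. Total wins: 1\b"),
    "left_oaks_lab": re.compile(r"MAP CHANGE \| 40 -> 0\b"),
}
TIMESTAMP = re.compile(r"^\[\d{2}:\d{2}:\d{2}\]")


def check_run(lines):
    """Report which milestones appear and whether every line is timestamped."""
    lines = list(lines)
    report = {
        name: any(pattern.search(line) for line in lines)
        for name, pattern in MILESTONES.items()
    }
    report["all_timestamped"] = bool(lines) and all(
        TIMESTAMP.match(line) for line in lines
    )
    return report
```

Piping the agent's output to a file and passing its lines to `check_run` gives a quick pass/fail summary without reading the whole log.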
Evolution and parameter tuning
# Run 10 parameter variants in parallel and rank by fitness
uv run scripts/run_10_agents.py "rom/Pokemon - Red Version (USA, Europe) (SGB Enhanced).gb"
# Run evolution harness (mutate + evaluate over generations)
uv run scripts/evolve.py "rom/Pokemon - Red Version (USA, Europe) (SGB Enhanced).gb" --generations 5 --max-turns 1000
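The mutate-and-evaluate idea behind the harness can be illustrated with a plain hill-climbing loop. This is a generic sketch, not the algorithm in `scripts/evolve.py`: `mutate` and `evolve` are hypothetical, and `evaluate` stands in for whatever fitness a real run would report after `--max-turns` turns.

```python
import random


def mutate(params, rng, scale=0.2):
    """Return a copy with each numeric knob jittered by up to +/-scale (relative)."""
    return {k: v * (1 + rng.uniform(-scale, scale)) for k, v in params.items()}


def evolve(seed_params, evaluate, generations=5, variants=10, rng=None):
    """Keep the best variant seen so far; the incumbent always competes."""
    rng = rng or random.Random(0)
    best, best_fit = seed_params, evaluate(seed_params)
    for _ in range(generations):
        # Build the whole generation from the current best before scoring it.
        candidates = [mutate(best, rng) for _ in range(variants)]
        for cand in candidates:
            fit = evaluate(cand)
            if fit > best_fit:
                best, best_fit = cand, fit
    return best, best_fit
```

Because the incumbent is only replaced by a strictly better candidate, fitness never decreases across generations. A seeded `random.Random` keeps runs reproducible when comparing variants.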
Running on stereOS
This skill is designed to run inside a stereOS VM via Master Blaster. See jcard.toml for the VM configuration.
mb up # boot VM, install deps, start agent through Tapes
mb attach # watch the agent play
Shared Mount Permissions
The [[shared]] mount maps the host repo to /workspace inside the VM. Host files retain their original ownership (UID 501 on macOS), but the VM runs as admin (UID 1000). Output directories (frames/, pokedex/, .tapes/) need world-writable permissions so the agent can write data that persists back to the host. The install script handles this automatically with chmod a+rwx.
Tapes + Kafka Telemetry
Tapes proxies all LLM API calls and publishes tapes.node.v1 events to Kafka. The agent just sets ANTHROPIC_API_BASE to the Tapes proxy — zero code changes.
Agent → Tapes proxy (port 8080) → Kafka (agent.telemetry.raw)
↓
Flink SQL jobs (anomaly detection)
↓
Kafka (agent.telemetry.alerts)
Start the full local stack:
docker compose up -d # Kafka + Zookeeper + Tapes proxy + Flink + consumers
Inspect sessions:
tapes deck # Terminal UI for session exploration
tapes search "battle" # Search session turns
tapes checkout <hash> # Restore a previous conversation state
Observational Memory
Long agent runs hit context compaction — when the context window fills up, older messages are compressed and cache prefixes are destroyed. Tapes solves this by storing the full conversation in .tapes/tapes.sqlite regardless of what happens to the live context.
The observational memory system reads Tapes data and distills it into a lightweight observations file that the agent can load at session start. This gives the agent durable memory across compaction boundaries and between sessions.
Session start: Read .tapes/memory/observations.md to recall what happened in previous sessions — errors hit, files created, progress made. This is cheap to load and keeps the agent from repeating mistakes or rediscovering things it already learned.
Session end: Run the observer to extract observations from the current session into the memory file.
# Check observations from past sessions before starting
cat .tapes/memory/observations.md
# After a session, distill new observations
python3 scripts/observe_cli.py
# Preview what would be extracted without writing
python3 scripts/observe_cli.py --dry-run
Observations are tagged by priority:
- `[important]`: errors, crashes, bugs, security issues
- `[possible]`: tests added, refactors, dependency updates
- `[informational]`: session goals, token usage, general context
For long speed runs, the pattern is:
- Load observations at session start for continuity
- Play the game, making decisions informed by past sessions
- Run the observer after the session to capture what happened
- Next session picks up where this one left off, even if context was compacted
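Loading observations at session start can be as simple as filtering lines by priority tag. The sketch below assumes one observation per line with its tag somewhere in the line; the actual layout of `observations.md` may differ, and `load_observations` is a hypothetical helper.

```python
import re

TAG = re.compile(r"\[(important|possible|informational)\]")
PRIORITY = {"important": 0, "possible": 1, "informational": 2}


def load_observations(text, min_priority="possible"):
    """Return (tag, line) pairs at or above min_priority, most urgent first."""
    cutoff = PRIORITY[min_priority]
    found = []
    for line in text.splitlines():
        match = TAG.search(line)
        if match and PRIORITY[match.group(1)] <= cutoff:
            found.append((match.group(1), line.strip()))
    # sorted() is stable, so lines keep their file order within each tag.
    return sorted(found, key=lambda pair: PRIORITY[pair[0]])
```

In practice the text would come from `Path(".tapes/memory/observations.md").read_text()`. Filtering out `[informational]` entries by default keeps the loaded context small.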
File Structure
pokemon-agent/
├── SKILL.md # This file
├── jcard.toml # stereOS VM config
├── docker-compose.yml # Kafka + Flink + Tapes proxy stack
├── .tapes/ # Tapes telemetry DB + config (gitignored)
│ └── memory/ # Observational memory output
├── scripts/
│ ├── install.sh # Setup script (installs PyBoy + Tapes)
│ ├── agent.py # Main agent loop (1000 lines)
│ ├── memory_reader.py # Memory address definitions
│ ├── memory_file.py # Agent memory management
│ ├── pathfinding.py # A* pathfinding + collision maps
│ ├── evolve.py # AlphaEvolve parameter evolution
│ ├── run_10_agents.py # Parallel multi-agent evaluation
│ ├── tape_reader.py # Tapes SQLite reader
│ ├── observer.py # Observation extraction heuristics
│ └── observe_cli.py # Observer CLI
├── docker/
│ ├── tapes-proxy/ # Tapes proxy Dockerfile
│ ├── telemetry-consumer/ # Raw telemetry event consumer
│ ├── alerts-consumer/ # Flink anomaly alert consumer
│ └── flink-sql/ # Flink SQL anomaly detection jobs
├── tests/ # 100% coverage test suite (205 tests)
└── references/
├── routes.json # Overworld waypoints by map ID
└── type_chart.json # Pokemon type effectiveness
Self-Healing Navigation
When a run fails — high stuck counts, low battle wins, agent trapped in a loop — fix the waypoints in references/routes.json before re-tuning parameters. The AlphaEvolve harness optimizes numeric knobs but cannot fix a path that walks into an impassable ledge.
Diagnosis workflow
After a failed run, read the pokedex log (pokedex/logN.md) and check:
- Stuck events vs battles won — a ratio above 100:1 means a navigation dead-end, not a parameter problem
- Final position — if the agent ended on the same map it entered, it never progressed
- Map changes — fewer than 5 means the agent is trapped in one area
- Stuck position cluster: grep the log for `STUCK` lines; if they all cluster around the same y-coordinate, there's a physical obstacle (ledge, tree, NPC) blocking the path
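The four checks above can be expressed as a small function over numbers pulled from the log. This is a sketch: `diagnose` is hypothetical, the log parsing itself is omitted because the log format is not specified here, and the 80% clustering cutoff is an assumed value rather than one taken from the project.

```python
from collections import Counter


def diagnose(stuck_events, battles_won, map_changes, start_map, final_map,
             stuck_positions=()):
    """Apply the diagnosis checks and return a list of findings."""
    findings = []
    # Treat zero wins as one so the ratio check still works.
    if stuck_events > 100 * max(battles_won, 1):
        findings.append("navigation dead-end (stuck:battle ratio above 100:1)")
    if final_map == start_map:
        findings.append("no progress (ended on the map it entered)")
    if map_changes < 5:
        findings.append("trapped in one area (fewer than 5 map changes)")
    if stuck_positions:
        y, hits = Counter(pos[1] for pos in stuck_positions).most_common(1)[0]
        if hits / len(stuck_positions) >= 0.8:  # assumed cutoff
            findings.append(f"likely obstacle at y={y}")
    return findings
```

An empty result suggests the route is sound and parameter tuning is the better next step.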
Fix sequence
- Identify the obstacle: check where stuck events cluster (e.g., Route 1 y=24 is a one-way ledge)
- Update `references/routes.json`: reroute waypoints around the obstacle, not through it
- Use `"loop": true` for grind zones: when the goal is battles (not progression), loop waypoints keep the agent farming encounters in a known-good grass area instead of advancing into obstacles
- Re-run immediately: don't burn turns tuning parameters for a broken path; fix the path first, then evolve
Route design principles
- Grind routes loop: set `"loop": true` and keep waypoints in tall grass (e.g., Route 1 south at y=29-33)
- Progression routes go one way: no loop flag, waypoints lead to the next city
- Ledges are one-way down: never route the agent northward through a known ledge; go around or stay south
- Keep waypoints close together: big jumps between waypoints (e.g., y=27 to y=21) hide obstacles the agent can't see
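Some of these principles can be checked mechanically before a run. The sketch below is hypothetical: `check_route` assumes a route is a dict with an optional `"loop"` flag and a `"waypoints"` list of objects carrying `x` and `y`, and the `max_gap` of 4 tiles is an assumed threshold, not a project setting.

```python
def check_route(route, max_gap=4):
    """Return warnings for a route dict; an empty list means no issues found."""
    warnings = []
    points = route.get("waypoints", [])
    for a, b in zip(points, points[1:]):
        # Largest single-axis distance between consecutive waypoints.
        gap = max(abs(a["x"] - b["x"]), abs(a["y"] - b["y"]))
        if gap > max_gap:
            warnings.append(
                f"gap of {gap} tiles between ({a['x']},{a['y']}) and ({b['x']},{b['y']})"
            )
    if route.get("loop") and len(points) >= 2:
        first, last = points[0], points[-1]
        if (first["x"], first["y"]) != (last["x"], last["y"]):
            warnings.append("loop route does not end where it starts")
    return warnings
```

A check like this cannot see ledges or trees, so it complements the stuck-log diagnosis rather than replacing it.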
When to use AlphaEvolve vs waypoint fixes
| Symptom | Fix |
|---|---|
| High stuck count, agent trapped at one position | Fix waypoints in routes.json |
| Agent progresses but loses battles | Tune battle params via evolve.py |
| Agent navigates but slowly | Tune navigation params (stuck_threshold, bt_*) |
| Agent oscillates between two maps | Add "loop": true or adjust door cooldown |
Example: Route 1 grind zone
The south grass on Route 1 (y=29-33) has reliable wild encounters. A looping route keeps the agent here:
"12": {
  "name": "Route 1",
  "loop": true,
  "waypoints": [
    {"x": 5, "y": 33, "note": "Enter from Pallet Town (south grass zone)"},
    {"x": 5, "y": 29, "note": "Walk north into tall grass"},
    {"x": 9, "y": 31, "note": "Sweep right through grass"},
    {"x": 5, "y": 33, "note": "Loop back south"}
  ]
}
When the agent faints, it respawns in Pallet Town, walks north, and re-enters the grind loop automatically.
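How a looping route cycles can be shown with a small index helper. This is a sketch under assumptions: `advance` is a hypothetical name, and the JSON below mirrors the Route 1 entry with the notes dropped, keyed by map ID as a string.

```python
import json


def advance(route, index):
    """Return the next waypoint index, wrapping only when the loop flag is set.

    Returns None once a one-way route has no waypoints left.
    """
    nxt = index + 1
    if nxt < len(route["waypoints"]):
        return nxt
    return 0 if route.get("loop") else None


routes = json.loads("""
{"12": {"name": "Route 1", "loop": true,
        "waypoints": [{"x": 5, "y": 33}, {"x": 5, "y": 29},
                      {"x": 9, "y": 31}, {"x": 5, "y": 33}]}}
""")
route_1 = routes["12"]
```

With the loop flag set, `advance(route_1, 3)` wraps back to index 0, so the agent keeps circling the grass. Without the flag, the same call returns `None` and the caller can move on to the next map.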
Limitations
- ROM not included. You must supply your own legally obtained ROM.
- Memory addresses are specific to Pokemon Red/Blue (US). Other games or regions require adjusted offsets.
- Real-time games (action RPGs) are not supported — this is for turn-based only.
- PyBoy supports Game Boy and Game Boy Color. GBA requires a different emulator (mGBA with Python bindings).
Extending
To support other turn-based RPGs:
- Map the game's memory layout (use BGB debugger or similar)
- Create a new memory reader module in `scripts/`
- Adjust the battle strategy logic for the game's combat system
- Update `routes.json` with the game's map structure
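One way to keep per-game memory layouts swappable is to describe them as data and read them with a single generic function. The sketch below is an assumption about how such a module could look, not the structure of `scripts/memory_reader.py`; `Field`, `read_fields`, and `RED_BLUE` are hypothetical names. Multi-byte values are read high byte first, which matches the Red/Blue HP fields but may not hold for other games.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    address: int
    size: int = 1  # bytes; multi-byte values are read high byte first


def read_fields(memory, layout):
    """Read every field in a layout dict into a {name: int} dict."""
    values = {}
    for name, spec in layout.items():
        value = 0
        for offset in range(spec.size):
            value = (value << 8) | memory[spec.address + offset]
        values[name] = value
    return values


# Example layout using addresses from the Red/Blue memory map in this document.
RED_BLUE = {
    "battle_type": Field(0xD057),
    "player_hp": Field(0xD015, size=2),
    "party_count": Field(0xD163),
    "map_id": Field(0xD35E),
}
```

Supporting another game then means writing a new layout dict after mapping its memory, while the battle and navigation code keeps consuming the same field names.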