name: dev-capsem-doctor
description: The capsem-doctor in-VM diagnostic suite. Use when writing, running, or extending capsem-doctor tests, adding new diagnostic categories, debugging VM sandbox issues, or understanding what capsem-doctor validates. Covers all 11 test categories, how to run subsets, the conftest infrastructure, and how to add new tests.

capsem-doctor

capsem-doctor is a pytest-based diagnostic suite that runs inside the guest VM. It verifies sandbox integrity, network isolation, runtime environment, and AI agent functionality. It's the smoke test gate -- every change must pass it before shipping.

Running

just run "capsem-doctor"              # Full suite (~10s total including VM boot)
just run "capsem-doctor -k sandbox"   # Only sandbox tests
just run "capsem-doctor -k network"   # Only network tests
just run "capsem-doctor -x"           # Stop on first failure
just run "capsem-doctor -v"           # Extra verbose

Test categories (11 files)

File	What it validates
`test_sandbox.py`	Read-only rootfs, binary permissions (chmod 555), no setuid/setgid, kernel hardening (no modules, no debugfs, no IPv6, no swap, no kallsyms), process integrity (pty-agent, dnsmasq running; no systemd, sshd, cron), network isolation (dummy0, fake DNS, iptables, no real NICs)
`test_network.py`	MITM CA in system store + certifi, curl without -k works, Python urllib HTTPS, CA env vars set (SSL_CERT_FILE, REQUESTS_CA_BUNDLE, NODE_EXTRA_CA_CERTS), HTTP/80 blocked, non-443 ports blocked, direct IP blocked, multi-domain DNS faking, AI provider domains reachable
`test_environment.py`	TERM/HOME/PATH env vars correct, shell is bash, kernel version, aarch64 arch, mount points (/proc, /sys, /dev, /dev/pts), tmpfs verification
`test_runtimes.py`	Python3, Node.js, npm, pip3, git version checks; Python file I/O; Node file I/O; git init+commit workflow
`test_utilities.py`	~36 Unix utilities available (coreutils, text processing, network, system tools, capsem-bench)
`test_workflows.py`	Text write/read, JSON roundtrip (Python + Node), shell pipes, large file (10MB)
`test_ai_cli.py`	claude, gemini, codex installed and executable without crashing
`test_virtiofs.py`	VirtioFS root mount, ext4 loopback upper, loop device active, workspace write/read/large file/subdir, system overlay writable, pip install works, file delete+recreate (skipped in block mode)
`test_mcp.py`	MCP gateway tool routing, domain blocking via MCP
`test_injection.py`	Security injection tests
`conftest.py`	Test infrastructure (auto-skip outside VM, `run()` helper, output dir fixture)

Infrastructure (conftest.py)

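import os
import subprocess

import pytest

# Shared output directory for test artifacts
TESTS_OUTPUT_DIR = "/root/tests"

# Auto-skip if not in capsem VM (checks root + writable /root)
def pytest_ignore_collect(collection_path, config):
    if os.geteuid() != 0 or not os.access("/root", os.W_OK):
        return True  # ignore every test file outside the VM
    return None  # inside the VM, defer to pytest's default collection

# Shell command runner: returns a CompletedProcess (.returncode, .stdout, .stderr)
def run(cmd, timeout=10):
    return subprocess.run(cmd, shell=True, capture_output=True, text=True, timeout=timeout)

# Fixture exposing the shared output directory to tests
@pytest.fixture
def output_dir():
    return TESTS_OUTPUT_DIR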

Adding a new test

Add test functions to the appropriate `guest/artifacts/diagnostics/test_*.py` file, or create `test_<category>.py`.
Use `from conftest import run` for shell commands and the `output_dir` fixture for temp files.
Tests auto-skip outside the capsem VM (no special guards needed).
`just run "capsem-doctor"` picks up changes immediately (diagnostics are repacked into the initrd).
For rootfs-baked changes, run `just build-assets`, then `just run "capsem-doctor"`.
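
As a rough sketch of the steps above, a new check could be laid out as follows. This is an assumption about style, not code from the suite: `swap_devices` and `test_no_swap_devices` are hypothetical names, modeled on the "no swap" item listed under kernel hardening, and `run()` is a local stand-in for the conftest helper. Splitting the parsing into a pure function keeps the logic checkable outside the VM.

```python
import subprocess


def run(cmd, timeout=10):
    """Local stand-in; inside the suite use `from conftest import run` instead."""
    return subprocess.run(cmd, shell=True, capture_output=True, text=True, timeout=timeout)


def swap_devices(proc_swaps_text):
    """Return the device names listed in /proc/swaps, skipping the header row."""
    lines = proc_swaps_text.strip().splitlines()
    return [line.split()[0] for line in lines[1:] if line.strip()]


def test_no_swap_devices():
    """Hypothetical check: the guest should report no active swap devices."""
    result = run("cat /proc/swaps")
    assert result.returncode == 0, result.stderr
    assert swap_devices(result.stdout) == []
```

No VM guard appears in the file because, per the steps above, collection is already skipped outside the capsem VM.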

Where tests live on disk

Source: `guest/artifacts/diagnostics/test_*.py` (in the repo)
In rootfs: `/usr/local/lib/capsem-tests/test_*.py` (baked by `Dockerfile.rootfs`)
In initrd: overrides rootfs copies via `_pack-initrd` (fast iteration)

Writing good diagnostic tests

Test one thing per function. Name tests clearly: `test_readonly_rootfs`, `test_ca_in_certifi`.
Use `run()` for shell commands, and check `.returncode` and `.stdout`/`.stderr`.
Set reasonable timeouts (the default is 10s). Network tests may need longer.
Think adversarially: test what should be blocked, not just what should work.
For VirtioFS tests, skip gracefully in block mode with `pytest.mark.skipif`.
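
The adversarial and timeout guidelines can be combined in one small helper, sketched here under assumptions: `is_blocked` and `test_plain_http_is_blocked` are hypothetical names, `http://example.com/` is a placeholder URL, and `run()` is a local stand-in for the conftest helper. A blocked request may fail fast or hang, so the helper treats both a non-zero exit and a timeout as "blocked".

```python
import subprocess


def run(cmd, timeout=10):
    """Local stand-in; inside the suite use `from conftest import run` instead."""
    return subprocess.run(cmd, shell=True, capture_output=True, text=True, timeout=timeout)


def is_blocked(cmd, timeout=15):
    """Return True when cmd exits non-zero or outlives the timeout."""
    try:
        return run(cmd, timeout=timeout).returncode != 0
    except subprocess.TimeoutExpired:
        # A hung connection is also evidence that the traffic went nowhere.
        return True


def test_plain_http_is_blocked():
    """Hypothetical check: plain HTTP on port 80 must not get through."""
    # Guard against a false pass: a missing curl would also exit non-zero.
    assert run("command -v curl").returncode == 0
    # curl gives up after 5s; the outer timeout only catches a stuck process.
    assert is_blocked("curl -sS --max-time 5 http://example.com/")
```

The `command -v curl` precondition matters because a "should fail" assertion passes for any failure, including a broken test setup.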
