name: fix-vulnerability
description: >
  Fix a vm2 sandbox escape vulnerability given a Security Advisory ID (GHSA/CVE). Fetches the advisory via GitHub CLI, reproduces the exploit, performs root cause analysis, applies a structural fix, writes comprehensive tests, updates ATTACKS.md, and red-teams the result. Use when the user provides a GHSA-xxxx or CVE-xxxx ID and wants the vulnerability fixed, or asks to "fix advisory", "patch vulnerability", "fix GHSA", or "fix CVE".

Fix Vulnerability — vm2 Security Patch Agent

You are fixing a vulnerability in the vm2 Node.js sandboxing library. The goal is not to patch the specific PoC but to understand the underlying weakness and eliminate the entire class of attack it represents.

Wear three hats simultaneously:

| Hat | Perspective |
| --- | --- |
| Node.js Internals Expert | How V8 executes JS, how Proxy/Reflect are implemented at the C++ level, how `vm` contexts are created, where the boundary between host and guest objects lives in memory. |
| JavaScript Language Expert | Every spec-observable quirk: prototype walks, `Symbol.toPrimitive` / `Symbol.unscopables`, accessor descriptors inherited through prototypes, `arguments` aliasing, `WeakRef` / `FinalizationRegistry` timing, `with` scoping, `eval` vs `Function` vs `import()`, tagged templates leaking the realm's `String`, `Error.prepareStackTrace`. |
| Security Engineer | You evaluate attack surfaces, not individual exploits. You think in terms of invariants and verify they hold under adversarial composition of language features. |

Tools

GitHub CLI (gh) — authenticated. Use the repository security advisories endpoint (not /advisories, which only lists published ones):

# Fetch a private repo advisory by GHSA ID
gh api repos/patriksimek/vm2/security-advisories/GHSA-xxxx-xxxx-xxxx
# List all repo advisories (incl. draft / triage)
gh api repos/patriksimek/vm2/security-advisories
# Filter by state
gh api "repos/patriksimek/vm2/security-advisories?state=triage"

docs/ATTACKS.md — institutional memory. Every fix you make updates this doc.
Standard Node.js / shell tooling for reproduction, instrumentation, and testing.

Workflow

1. Orient

Re-read CLAUDE.md and docs/ATTACKS.md cover-to-cover. CLAUDE.md has the file roles and architectural map; ATTACKS.md has the attack catalog, the Defense Invariants, and the Category Entry Format you'll use later.

Identify the trust boundary in precise terms: which objects live in the host realm, which in the sandbox realm, which bridge the two.
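
To make "which realm does this object live in" concrete, here is a minimal sketch using Node's built-in `vm` module rather than vm2 itself. It only illustrates that each context has its own intrinsics; the helper name `realmOf` is hypothetical and is not part of vm2.

```javascript
'use strict';
const vm = require('node:vm');

// A fresh context gets its own set of intrinsics (Object, Function, Array, ...).
const context = vm.createContext({});
const guestObject = vm.runInContext('Object', context);
const guestArray = vm.runInContext('[1, 2, 3]', context);

// Hypothetical helper: classify a value by the root of its prototype chain.
function realmOf(value) {
  let current = value;
  let root = null;
  while (current !== null && (typeof current === 'object' || typeof current === 'function')) {
    root = current;
    current = Object.getPrototypeOf(current);
  }
  if (root === Object.prototype) return 'host';
  if (root === guestObject.prototype) return 'guest';
  return 'unknown'; // primitives, null-prototype objects, other realms
}

console.log(guestObject === Object);       // false: same name, different intrinsic
console.log(guestArray instanceof Array);  // false: host Array.prototype is not on its chain
console.log(realmOf(guestArray));          // 'guest'
console.log(realmOf([1, 2, 3]));           // 'host'
```

A prototype-chain check like this can be fooled by `Object.setPrototypeOf`, so treat it as an annotation aid while reading a PoC, not as a defense.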

2. Advisory deep-dive

Fetch the full advisory with gh. Extract: the PoC, CVSS vector, CWE classification, and any linked priors the reporter references — follow the full history chain. Many vm2 advisories are regressions or bypasses of earlier fixes; the genealogy matters.
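
The extraction step could be sketched as a small function over the parsed advisory JSON. The field names used here (`ghsa_id`, `cvss.vector_string`, `cwes[].cwe_id`, `description`) are assumptions about the response shape and should be checked against the real `gh api` output; `summarizeAdvisory` and the sample object are hypothetical.

```javascript
'use strict';

// Assumed response fields: ghsa_id, cvss.vector_string, cwes[].cwe_id, description.
// Verify against actual `gh api` output before relying on this.
function summarizeAdvisory(advisory) {
  const description = advisory.description || '';
  // Collect every GHSA / CVE identifier mentioned in the description text.
  const mentioned = description.match(/GHSA(?:-[0-9a-z]{4}){3}|CVE-\d{4}-\d{4,}/g) || [];
  const priors = [...new Set(mentioned)].filter((id) => id !== advisory.ghsa_id);
  return {
    id: advisory.ghsa_id,
    cvssVector: advisory.cvss ? advisory.cvss.vector_string : null,
    cwes: (advisory.cwes || []).map((cwe) => cwe.cwe_id),
    priors, // advisories to follow up on when tracing the genealogy
  };
}

// Fabricated sample shaped like the assumed response; not a real advisory.
const sample = {
  ghsa_id: 'GHSA-aaaa-bbbb-cccc',
  cvss: { vector_string: 'CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H' },
  cwes: [{ cwe_id: 'CWE-693', name: 'Protection Mechanism Failure' }],
  description: 'Bypass of the fix for GHSA-dddd-eeee-ffff (CVE-0000-00000). See GHSA-aaaa-bbbb-cccc.',
};
console.log(summarizeAdvisory(sample));
```

The `priors` list is only a starting point: reporters also link earlier work through URLs and prose, so the description still needs to be read in full.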

Classify the vulnerability against ATTACKS.md's Tier 1 primitives (categories 1–5) and Tier 2 techniques (6–15). If it doesn't fit existing categories, identify the new primitive or technique it represents.

Trace the PoC line-by-line. For each step, annotate: which object reference the attacker holds, which realm that object belongs to, and which existing defense should have prevented this step (and why it didn't).

3. Reproduce & instrument

Write a minimal reproduction at test/ghsa/<advisory-id>.js that:

Runs the PoC inside vm2.
Asserts the escape condition (e.g., guest obtained process or executed host code) — not just "doesn't crash."
Currently fails (the exploit succeeds).

Use it.cond(name, condition, fn) for Node version requirements; follow patterns in test/vm.js (makeHelpers()).

Add targeted logging in BaseHandler / ProtectedHandler / ReadOnlyHandler traps in lib/bridge.js, in ensureThis / thisFromOther / otherFromThis, and in handleException (lib/setup-sandbox.js). Log trap name, target, args, and whether returned values are host or guest objects. Run the instrumented repro and save the trace.
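
One low-intrusion way to get such a trace is to wrap a handler in a second Proxy, since the engine looks up each trap on the handler object by name. The sketch below is generic JavaScript and assumes nothing about how vm2's handler classes are structured; `traceHandler` is a hypothetical helper, and it records only the trap name and result type, leaving host/guest classification to the caller.

```javascript
'use strict';

// Hypothetical helper: returns a handler that logs every trap lookup,
// including operations the original handler does not trap at all.
function traceHandler(handler, log) {
  return new Proxy(handler, {
    get(target, trapName) {
      const trap = Reflect.get(target, trapName);
      if (typeof trap !== 'function') {
        // No trap defined: the operation falls through to the proxy target.
        log({ trap: String(trapName), handled: false });
        return trap;
      }
      return function tracedTrap(...args) {
        const result = Reflect.apply(trap, target, args);
        log({ trap: String(trapName), handled: true, resultType: typeof result });
        return result;
      };
    },
  });
}

// Usage with a toy handler that only defines `get`.
const events = [];
const toyHandler = { get(target, key) { return Reflect.get(target, key); } };
const proxied = new Proxy({ answer: 42 }, traceHandler(toyHandler, (e) => events.push(e)));

void proxied.answer;        // goes through the `get` trap
void ('answer' in proxied); // `has` is not trapped, but the lookup is still logged
console.log(events);
```

Logging untrapped operations is the useful part during root cause analysis: an operation that reaches the target without mediation is exactly what a bypass tends to exploit.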

4. Root cause

State the invariant violation as a precise, falsifiable claim. Map it to the Defense Invariants — which one was breached, and where?

Cross-reference ATTACKS.md. If a prior fix addressed the same category but this PoC found a new path, the prior fix was specific rather than structural. Note this explicitly — your fix should subsume the prior one.

Enumerate related attack paths:

Other Proxy traps with the same structural flaw?
Other built-in prototypes where the same kind of descriptor could leak?
Other throw sites if the bug is in exception handling?
Every place host code is called with a guest-controlled this if the bug is in this binding?
Could an attacker compose this primitive with another known technique from ATTACKS.md to escape even after a narrow fix?

5. Design the fix — multi-angle exploration

For any non-trivial vulnerability, do not write the fix yourself first. Spawn three parallel sub-agents in isolated git worktrees, each tasked with the same advisory but instructed to attack the design space from a different angle. Compare the diffs side-by-side and either pick the best or synthesize the strongest pieces of each.

This pattern repeatedly catches real bugs that any single perspective would miss: the "minimal" angle misses variants the "structural" angle finds; the "structural" angle misses composition bypasses the "defense-in-depth" angle catches.

| Angle | Instruction | Strength | Risk if used alone |
| --- | --- | --- | --- |
| Minimal patch | "Close the canonical PoC with the smallest possible diff. Do not refactor surrounding code. Comment every changed line with the invariant it enforces." | Easy to review; preserves behavior. | Often fixes the literal PoC but not the class, so variants slip through. |
| Structural fix | "Identify the invariant the PoC violates. Close the entire violation class at the right chokepoint, even if the diff is larger. Justify why this is the right layer." | Closes whole categories at once; survives variant probes. | May tighten the invariant too far and break legitimate APIs. |
| Defense-in-depth | "Assume the structural fix has gaps. Add a second independent layer of checks at a different chokepoint. Identify what an attacker would do to compose past the primary fix and block that too." | Catches composition attacks across layers. | Belt-and-suspenders can be over-engineering; demand a perf/UX justification. |

Spawn pattern — all three in parallel, single message, multiple Agent tool calls:

Agent({description: "Minimal patch for GHSA-xxxx",     subagent_type: "general-purpose", isolation: "worktree", prompt: "<advisory + 'minimal patch only, no refactoring'>"})
Agent({description: "Structural fix for GHSA-xxxx",    subagent_type: "general-purpose", isolation: "worktree", prompt: "<advisory + 'identify the invariant, close the class at the right chokepoint'>"})
Agent({description: "Defense-in-depth for GHSA-xxxx",  subagent_type: "general-purpose", isolation: "worktree", prompt: "<advisory + 'assume the canonical fix has gaps, add a second independent layer'>"})

Synthesis (do this yourself, not a fourth agent):

Read each diff in full via git diff main..agent-worktree-branch. Don't trust the agents' summaries — diffs are ground truth. Agents routinely overstate what they did.
Run each fix against the PoC + the variant probes from step 4. Record which fix passes which test.
Look for:
- Convergence — when 2 of 3 agents land on the same chokepoint, that's strong signal it's the right level.
- Divergence — different chokepoints means the bug spans multiple layers; the final fix probably needs hunks from more than one agent.
- Combined coverage — frequently the right answer is "structural fix from B + symbol filter from C, drop A."
Apply the chosen hunks to main yourself. Always merge by hand.

Practical gotchas observed in this codebase:

Worktrees auto-clean if the agent makes no changes. An exploratory agent that decides "no fix needed" or only writes notes loses its reasoning when the worktree is reaped. Instruct exploratory agents to write a NOTES.md in the worktree so reasoning survives.
Formatter contamination. Prettier / ESLint hooks running inside the worktree produce huge diffs of trailing-comma and brace-spacing changes mixed into the security fix. Diff against current main and hand-pick the security-relevant hunks; never blindly cherry-pick the agent's full commit.
Stale-base worktrees. If you've committed earlier fixes on main since the worktree was created (common when fixing a cluster of advisories), the agent's diff against main will include unrelated reverts. Either rebase the worktree onto current main before merging, or apply hunks manually.
Cap at three angles. More dilutes attention without adding signal.

6. Promote to structural and audit

Verify the merged fix actually closes the invariant violation, not just the specific PoC path. Examples:

| Specific | Structural |
| --- | --- |
| "Check if the property name is `constructor` in the `get` trap." | "Every value returned from any Proxy `get` trap passes through `thisFromOther()` / `ensureThis()`, with no exceptions." |
| "Delete `Error.prepareStackTrace` before running guest code." | "All errors thrown across the boundary are reconstructed as new guest-realm `Error` objects with only `.message` copied." |

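The second structural example, rebuilding errors in the guest realm with only `.message` copied, can be illustrated with Node's built-in `vm`. This is a sketch of the idea only, not vm2's actual implementation; `toGuestError` is a hypothetical name, and the sketch deliberately ignores what ends up in `.stack`.

```javascript
'use strict';
const vm = require('node:vm');

const context = vm.createContext({});
// The guest realm's own Error constructor.
const GuestError = vm.runInContext('Error', context);

// Hypothetical helper: never forwards the host error object itself,
// only a string copied into a fresh guest-realm Error.
function toGuestError(hostError) {
  let message = 'Error';
  try {
    if (hostError !== null && typeof hostError === 'object' && typeof hostError.message === 'string') {
      message = hostError.message;
    }
  } catch (err) {
    // A throwing `message` getter must not leak a second host error.
  }
  return new GuestError(message);
}

const rebuilt = toGuestError(new TypeError('boom'));
console.log(rebuilt.message);                                          // 'boom'
console.log(rebuilt instanceof Error);                                 // false: not a host Error
console.log(Object.getPrototypeOf(rebuilt) === GuestError.prototype);  // true
```

Because only a primitive string crosses, the guest cannot walk from the rebuilt error back to the host's `Error`, `Function`, or any custom properties on the original object.
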
For every Proxy trap in BaseHandler, ProtectedHandler, ReadOnlyHandler, verify the fix's invariant holds: get, set, has, deleteProperty, ownKeys, getOwnPropertyDescriptor, defineProperty, apply, construct, getPrototypeOf, setPrototypeOf, isExtensible, preventExtensions. For each: does it ever return, pass, or expose a host-realm object to guest code without wrapping?
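
A quick mechanical aid for this audit is to list which of the 13 traps a handler does not define, since those operations reach the target without mediation. The sketch below is generic; `unhandledTraps` and `ExampleHandler` are hypothetical, and whether a missing trap is actually a problem depends on the handler's design and still needs manual review.

```javascript
'use strict';

// The 13 Proxy traps, matching the audit list above.
const ALL_TRAPS = [
  'get', 'set', 'has', 'deleteProperty', 'ownKeys',
  'getOwnPropertyDescriptor', 'defineProperty', 'apply', 'construct',
  'getPrototypeOf', 'setPrototypeOf', 'isExtensible', 'preventExtensions',
];

// Returns the traps a handler neither defines nor inherits.
function unhandledTraps(handler) {
  return ALL_TRAPS.filter((name) => typeof handler[name] !== 'function');
}

// Toy handler class standing in for a real one.
class ExampleHandler {
  get(target, key, receiver) { return Reflect.get(target, key, receiver); }
  set() { return false; }
  has(target, key) { return Reflect.has(target, key); }
}

console.log(unhandledTraps(new ExampleHandler()));
```

Running this against each handler class gives a checklist to work through; a trap that is present still has to be read for unwrapped return values.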

Also audit sandbox bootstrap (lib/setup-sandbox.js, lib/setup-node-sandbox.js) — several escapes came through sandbox globals (Error.prepareStackTrace fallback, WebAssembly.JSTag), not proxy traps.

Evaluate second-order effects: infinite recursion (proxy wrapping triggers another trap that wraps again), broken legitimate API contracts, performance cliffs for benign code, regressions in the existing test suite.

Apply the fix with minimal, self-contained diff. Comment every security-critical line with // SECURITY: explaining the invariant it enforces.

7. Test

Direct regression: convert the step 3 repro into a passing test that asserts the guest does not obtain a host reference.
Variants: write tests for every related path identified in step 4. If the bug was in get, test the analogous pattern through getOwnPropertyDescriptor, defineProperty, set. If it used Symbol.toPrimitive, also test Symbol.hasInstance, iterator, species, unscopables. If it used Error, also test AggregateError, Error.captureStackTrace, Error.prepareStackTrace.
Composition: combine this vulnerability's mechanism with primitives from ATTACKS.md. Verify partial reproductions of the leak still cannot chain into a full escape.
Adversarial probing: write a small harness that enumerates Object.getOwnPropertyNames / getOwnPropertySymbols / __proto__ walk on every value returned to guest code, and asserts none are host-realm references (compare against saved host_Function, host_Object, etc.).
Existing suite: npm test must pass. If a test breaks, evaluate whether it was relying on insecure behavior the fix correctly eliminates, and update with a comment explaining why.
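
The adversarial probing harness described above could look roughly like this. It is a sketch under assumptions: `findHostReferences` is a hypothetical name, the intrinsic list is a minimal starting set, and the demo uses Node's built-in `vm` to produce a guest-realm value instead of vm2.

```javascript
'use strict';
const vm = require('node:vm');

// Host intrinsics the guest must never reach; extend as needed.
const HOST_INTRINSICS = new Map([
  ['Function', Function],
  ['Object', Object],
  ['Function.prototype', Function.prototype],
  ['Object.prototype', Object.prototype],
]);

// Walks own properties (names and symbols) and prototype chains from `root`,
// reporting every path that lands on a host intrinsic.
function findHostReferences(root, maxDepth = 4) {
  const leaks = [];
  const seen = new Set();
  const visit = (value, path, depth) => {
    if (value === null || (typeof value !== 'object' && typeof value !== 'function')) return;
    for (const [name, intrinsic] of HOST_INTRINSICS) {
      if (value === intrinsic) {
        leaks.push(`${path} -> host ${name}`);
        return;
      }
    }
    if (seen.has(value) || depth >= maxDepth) return;
    seen.add(value);
    visit(Object.getPrototypeOf(value), `${path}.__proto__`, depth + 1);
    for (const key of Reflect.ownKeys(value)) {
      const desc = Reflect.getOwnPropertyDescriptor(value, key);
      if (!desc) continue;
      // Inspect accessor functions as values; never invoke them.
      for (const slot of ['value', 'get', 'set']) {
        if (slot in desc) visit(desc[slot], `${path}[${String(key)}].${slot}`, depth + 1);
      }
    }
  };
  visit(root, 'root', 0);
  return leaks;
}

const guestValue = vm.runInNewContext('({ nested: { list: [1, 2] } })');
console.log(findHostReferences(guestValue));     // guest-realm object graph
console.log(findHostReferences({ nested: {} })); // host object: reports its prototype chain
```

Reading descriptors instead of property values keeps the walk from triggering getters, which matters when the values being inspected are attacker-influenced.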

Iterate with /hacker: run the red-team skill against the patched tree. A bypass means the structural invariant is wrong, not that you need a tighter patch on the same line. Loop steps 5–7 until /hacker finds nothing.

8. Document

Update docs/ATTACKS.md following the Category Entry Format at the top of the doc:

New entry placed under the appropriate tier with the next sequential number, or added as a new canonical example to an existing category if the vulnerability is a variant.
The Mitigation section must reference the specific Defense Invariant the fix enforces.
Add **Supersedes**: linking to the prior category if this fix subsumes a previous specific patch.
Update Summary → How The Bridge Defends and Summary → Compound Attack Patterns.
If the fix retroactively strengthens defenses against prior attacks, annotate those entries with a defense-in-depth note.

Update CHANGELOG.md with a one-line entry under the next release: fix(GHSA-xxxx-xxxx-xxxx): <one-line description>.

Disclosure hygiene: do not push the fix branch to a public remote, open a public PR, or reference the GHSA ID in commit messages until the advisory is published. For embargoed work, commit locally and coordinate per SECURITY.md.

9. Final review

Answer every question with evidence:

Is the PoC blocked? The reproduction now fails to escape.
Is the fix structural? It restores a Defense Invariant, not just a specific PoC path.
Are all related traps / boundary functions audited? Every Proxy trap, every boundary-crossing function.
Are variant tests written and passing? At least one per related path from step 4.
Are composition tests written and passing? Combined with at least 3 primitives from ATTACKS.md.
Does the full test suite pass? No regressions.
Do new tests fail without the patch? Revert the fix (keep the new tests), run them, confirm every newly introduced security test fails — i.e., the exploit succeeds. Then re-apply the fix. A security test that passes even without the patch proves nothing. This is the single most important validation that your tests cover what you think they cover.
Is docs/ATTACKS.md updated? New entry follows the format, references the relevant Invariant, cross-referenced.
Is CHANGELOG.md updated?
Are security-critical lines commented? // SECURITY: annotations.
Has /hacker passed? Red-team found no bypasses.
Could an attacker bypass with one additional trick? Think adversarially for 5 more minutes. Try eval, import(), Proxy-wrapping the sandbox's own proxies, Object.assign, structured clone, postMessage, WeakRef, FinalizationRegistry, async microtask ordering. If any path is plausible, return to step 5.
