name: long-task-harness
description: Maintains continuity across long-running tasks that span multiple agent sessions. Use when starting or resuming a complex project that spans multiple sessions, or for tasks with many discrete features requiring iterative development.
Long Task Harness
Structured workflows for maintaining continuity across agent sessions. Addresses the "shift change" problem where context is lost between sessions.
First-Time Setup
On first invocation, check if .long-task-harness/long-task-progress.md exists in the project.
If it doesn't exist: Initialize
python3 <SKILL_PATH>/scripts/init_harness.py
This creates a .long-task-harness/ directory containing:
- long-task-progress.md - Session history and notes
- features.json - Feature tracking with pass/fail status
- init.sh - Environment setup script (optional)
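The first-invocation check described above can be sketched in Python. This is a minimal illustration, not one of the skill's scripts: the helper name `harness_initialized` is hypothetical, and only the file path is taken from this document.

```python
from pathlib import Path

# Path taken from this document; the helper name below is hypothetical.
PROGRESS_FILE = Path(".long-task-harness") / "long-task-progress.md"


def harness_initialized(project_root: str = ".") -> bool:
    """Return True if the progress file already exists under project_root."""
    return (Path(project_root) / PROGRESS_FILE).is_file()


if __name__ == "__main__":
    if harness_initialized():
        print("Harness found: continue with the session startup protocol.")
    else:
        print("Harness missing: run init_harness.py first.")
```

Checking for the progress file rather than the directory alone avoids treating a half-created `.long-task-harness/` as initialized.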
Then: Ensure persistent invocation
Check if AGENTS.md (for Codex/Droid/Cursor/Pi) or CLAUDE.md (for Claude Code) contains the harness snippet.
If not, prompt the user:
"I notice this project doesn't have long-task-harness configured for automatic invocation. Would you like me to add the following to [AGENTS.md / CLAUDE.md]?
## Multi-Session Development
This project uses long-task-harness for session continuity. At session start or after context reset, invoke the skill at: <SKILL_PATH>
Where would you like me to add this?"
Use the actual path where you loaded this skill from for <SKILL_PATH>.
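One way to decide whether the prompt is needed is to look for the harness name in either instruction file. The sketch below assumes that the string `long-task-harness` appearing in the file is enough to count as "contains the snippet"; both that marker and the function name `has_harness_snippet` are illustrative assumptions.

```python
from pathlib import Path

# File names come from this document; the marker string is an assumption.
INSTRUCTION_FILES = ("AGENTS.md", "CLAUDE.md")
MARKER = "long-task-harness"


def has_harness_snippet(project_root: str = ".") -> bool:
    """Return True if any instruction file in project_root mentions the harness."""
    for name in INSTRUCTION_FILES:
        path = Path(project_root) / name
        if path.is_file() and MARKER in path.read_text(encoding="utf-8"):
            return True
    # No instruction file exists, or none mentions the harness: prompt the user.
    return False
```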
Optional: Claude Code Hooks
For Claude Code users who want automatic enforcement:
"Would you like to install Claude Code hooks? These will:
- Remind to invoke this skill on session start
- Warn before git commits if .long-task-harness/long-task-progress.md is not staged"
If yes: python3 <SKILL_PATH>/scripts/claude_code_install_hooks.py
Note: These hooks are for Claude Code only. Other agents should use AGENTS.md instructions.
Optional: Git Pre-Commit Hook (Unsupported Agents)
If the current agent does not support native hooks (e.g., Codex, Cursor, or other CLI agents), offer an optional repo-local git pre-commit hook. Warn that it affects all commits in the current repository clone.
If yes: python3 <SKILL_PATH>/scripts/precommit_install_hook.py
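The warning that both hook variants give can be illustrated as follows. This is a sketch of the idea only and is not the logic in `precommit_check.py`; the function names are hypothetical. It relies on `git diff --cached --name-only`, which lists staged paths relative to the repository root.

```python
import subprocess
from typing import List, Optional

PROGRESS_PATH = ".long-task-harness/long-task-progress.md"


def staged_files() -> List[str]:
    """Return the paths staged for the next commit, as reported by git."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only"],
        capture_output=True,
        text=True,
        check=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def progress_warning(staged: List[str]) -> Optional[str]:
    """Return a warning message if the progress file is not staged, else None."""
    if PROGRESS_PATH in staged:
        return None
    return f"Warning: {PROGRESS_PATH} is not staged for this commit."
```

Keeping the decision in a pure function (`progress_warning`) separate from the git call makes the check easy to reuse from different hook mechanisms.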
Session Startup Protocol
At the start of each session:
python3 <SKILL_PATH>/scripts/read_progress.py # Last 3 sessions
python3 <SKILL_PATH>/scripts/read_features.py # Incomplete features
git log --oneline -10
Then continue from "Next Steps" in the latest session entry.
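If it helps to run the startup steps as one unit, they could be wrapped like this. The wrapper is a hypothetical convenience, not part of the skill; it only assembles the three commands listed above, and `skill_path` stands in for `<SKILL_PATH>`.

```python
import subprocess
from typing import List


def startup_commands(skill_path: str) -> List[List[str]]:
    """Build the three startup commands from this document, in order."""
    scripts = f"{skill_path}/scripts"
    return [
        ["python3", f"{scripts}/read_progress.py"],  # Last 3 sessions
        ["python3", f"{scripts}/read_features.py"],  # Incomplete features
        ["git", "log", "--oneline", "-10"],
    ]


def run_startup(skill_path: str) -> None:
    """Run each startup command, echoing it first so the output is easy to follow."""
    for cmd in startup_commands(skill_path):
        print("$", " ".join(cmd))
        # check=False: keep going even if one step fails, so later output is still shown.
        subprocess.run(cmd, check=False)
```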
During Work
- Work on one feature at a time
- Commit frequently with descriptive messages
- Update .long-task-harness/features.json when features pass tests
- Update .long-task-harness/long-task-progress.md before ending session
Session Entry Format
### Session N | YYYY-MM-DD | Commits: abc123..def456
#### Goal
[One-liner]
#### Accomplished
- [x] Task done
- [ ] Task carried forward
#### Decisions
- **[D1]** Decision made - reasoning
#### Surprises
- **[S1]** Expected X but found Y - implication
#### Next Steps
1. Priority task
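To show how the template fills in, here is a small renderer that produces an entry in the format above. It is an illustrative sketch: the function `render_session_entry` and its parameters are hypothetical, and the harness does not require entries to be generated this way.

```python
from datetime import date
from typing import List


def render_session_entry(
    number: int,
    day: date,
    commits: str,
    goal: str,
    done: List[str],
    carried: List[str],
    decisions: List[str],
    surprises: List[str],
    next_steps: List[str],
) -> str:
    """Render one session entry using the headings from the template."""
    lines = [
        f"### Session {number} | {day.isoformat()} | Commits: {commits}",
        "#### Goal",
        goal,
        "#### Accomplished",
    ]
    lines += [f"- [x] {task}" for task in done]
    lines += [f"- [ ] {task}" for task in carried]
    lines.append("#### Decisions")
    lines += [f"- **[D{i}]** {text}" for i, text in enumerate(decisions, 1)]
    lines.append("#### Surprises")
    lines += [f"- **[S{i}]** {text}" for i, text in enumerate(surprises, 1)]
    lines.append("#### Next Steps")
    lines += [f"{i}. {step}" for i, step in enumerate(next_steps, 1)]
    return "\n".join(lines) + "\n"
```

Numbering decisions and surprises automatically keeps the `[D1]` and `[S1]` references consistent within an entry, which matters when later sessions point back to them.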
Why Log Surprises?
Surprises indicate model uncertainty and contain information-dense context. If something surprised you, it could trip up the next session (or a different agent). Examples:
- [S1] Expected auth.py to handle OAuth, but it only does API keys. OAuth is in oauth_provider.py.
- [S2] Test suite requires Docker running - not documented in README.
- [S3] Config file is gitignored but required - must copy from config.example.yaml.
This section is optional but valuable for complex or unfamiliar codebases.
Before Ending Session
- Update .long-task-harness/long-task-progress.md with session notes
- Commit all changes including progress docs
- Verify tests pass
Critical Rules
- Never edit tests to make them pass - fix implementation
- Never mark features passing without testing
- Always update progress docs before ending
- Commit frequently
Scripts
| Script | Purpose |
|---|---|
| init_harness.py | Initialize project with tracking files in .long-task-harness/ |
| claude_code_install_hooks.py | Install/uninstall Claude Code hooks (prompt-based, triggers on git add) |
| pi_install_hooks.py | Install Pi agent hooks (tool_result modification) |
| precommit_install_hook.py | Install repo-local git pre-commit hook (for Codex, Cursor, etc.) |
| precommit_check.py | Shared pre-commit check logic (warns if progress not staged) |
| read_progress.py | Read sessions (--list, --session N, -n 5) |
| read_features.py | Read features (--feature ID, --json) |
| session_metadata.py | Generate git metadata for session entries |
| status_line.py | Show session status (--full, --json) |
| check_rules.py | Declarative rules for catching issues |
| git_add.py | Git add wrapper with rule checking |
Additional Features
Status Line
Quick session overview:
python3 <SKILL_PATH>/scripts/status_line.py # Compact: S5 | F:3/5 [auth-001] | main (U:2)
python3 <SKILL_PATH>/scripts/status_line.py --full # Detailed multi-line
python3 <SKILL_PATH>/scripts/status_line.py --json # JSON output
Declarative Rules
Define rules in .long-task-harness/rules/*.md to catch common issues before they're committed:
---
name: warn-console-log
enabled: true
event: file
file_pattern: \.tsx?$
pattern: console\.log\(
action: warn
---
🐛 **Debug code detected**
Remove console.log before committing.
Check operations:
python3 <SKILL_PATH>/scripts/check_rules.py bash "rm -rf /tmp"
python3 <SKILL_PATH>/scripts/check_rules.py file src/app.ts "console.log('test')"
python3 <SKILL_PATH>/scripts/check_rules.py commit
python3 <SKILL_PATH>/scripts/check_rules.py list
python3 <SKILL_PATH>/scripts/check_rules.py init # Create default rules
Events: bash, file, stage, commit, any
Actions: warn (continue), block (exit 1)
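To make the rule format concrete, the sketch below shows one way a `file` rule could be evaluated. It is not the implementation in `check_rules.py`: the parsing is a simplified `key: value` reader, the function names are hypothetical, and treating `any` as matching every event is an assumption.

```python
import re
from typing import Dict, Optional, Tuple

# Example rule with the same fields as the rule shown above (regexes use single backslashes).
RULE_TEXT = r"""---
name: warn-console-log
enabled: true
event: file
file_pattern: \.tsx?$
pattern: console\.log\(
action: warn
---
Remove console.log before committing.
"""


def parse_rule(text: str) -> Tuple[Dict[str, str], str]:
    """Split a rule file into its frontmatter fields and its message body."""
    _, frontmatter, body = text.split("---", 2)
    fields: Dict[str, str] = {}
    for line in frontmatter.strip().splitlines():
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields, body.strip()


def check_file(fields: Dict[str, str], path: str, content: str) -> Optional[str]:
    """Return the rule's action ("warn" or "block") if it fires, else None."""
    if fields.get("enabled") != "true":
        return None
    if fields.get("event") not in ("file", "any"):  # "any" semantics are assumed
        return None
    if not re.search(fields["file_pattern"], path):
        return None
    if not re.search(fields["pattern"], content):
        return None
    return fields["action"]
```

With this sketch, `check_file` applied to `src/app.ts` and the content `console.log('test')` fires the rule, while the same content in a `.py` file does not, because `file_pattern` filters on the path first.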
Git Add with Rule Checking
Use instead of raw git add to catch issues at staging time:
python3 <SKILL_PATH>/scripts/git_add.py file1.py file2.ts # Stage specific files
python3 <SKILL_PATH>/scripts/git_add.py . # Stage all
python3 <SKILL_PATH>/scripts/git_add.py --check-only . # Preview without staging
python3 <SKILL_PATH>/scripts/git_add.py --force . # Stage despite blockers
This checks file and stage event rules before staging and warns about missing progress updates.
History Research (10+ Sessions)
For long projects, use subagents as scouts to find relevant history:
Research the history of [feature/file] in this project.
Return POINTERS (session numbers, file paths, decision refs) - not summaries.
Then read only the specific sessions identified.