name: archive-grounding
description: Unpack a ZIP archive, inventory its files, run the corresponding child grounding skill for each supported child file, and then write a real archive-level grounded.md.
allowed-tools: Bash, Read, Write, Edit, Grep, Glob
Archive Grounding
This skill handles ZIP archives as container inputs in the any-input -> grounding -> downstream research / summary / report pipeline.
Its job is not to pretend that the archive has been deeply grounded as soon as it is unpacked. Its job is to:
- unpack the archive,
- build a stable archive bundle,
- identify supported child files,
- run the corresponding child skill for each supported child file,
- collect those child bundles under the current archive bundle's child_outputs/ directory,
- and only then write a real archive-level grounded.md.
Position in the pipeline
This skill is for archives that package together multiple materials, such as:
- project material bundles
- meeting material bundles
- report + slides + tables + audio/video attachments
- mixed research input packages
This skill is not a replacement for the child grounding skills themselves.
- document-grounding still handles document content
- table-grounding still handles spreadsheets / CSV tables
- pptx-grounding still handles PowerPoint decks
- meeting-audio-grounding handles meeting audio inputs
- meeting-video-grounding handles meeting videos
- audio_structuring remains the atomic audio transcription backend reused by the higher-level audio/video meeting entry skills
This skill acts as an orchestrator over packaged files.
Supported child file routing
The first version should route supported child files using a simple extension-based mapping:
- .pdf, .docx, .md, .txt -> document-grounding
- .xlsx, .csv -> table-grounding
- .pptx -> pptx-grounding
- .mp3, .wav, .m4a -> meeting-audio-grounding
- .mp4, .mov, .mkv -> meeting-video-grounding
Unsupported files may be recorded and skipped, but they must not be silently treated as grounded.
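The extension mapping above can be sketched as a small lookup table. This is only an illustration, not the script's actual implementation: the names ROUTES and route_for are hypothetical, and matching extensions case-insensitively is an assumption.

```python
from pathlib import PurePosixPath
from typing import Optional

# Extension -> child skill, copied from the routing list above.
ROUTES = {
    ".pdf": "document-grounding",
    ".docx": "document-grounding",
    ".md": "document-grounding",
    ".txt": "document-grounding",
    ".xlsx": "table-grounding",
    ".csv": "table-grounding",
    ".pptx": "pptx-grounding",
    ".mp3": "meeting-audio-grounding",
    ".wav": "meeting-audio-grounding",
    ".m4a": "meeting-audio-grounding",
    ".mp4": "meeting-video-grounding",
    ".mov": "meeting-video-grounding",
    ".mkv": "meeting-video-grounding",
}


def route_for(relative_path: str) -> Optional[str]:
    """Return the child skill for a file, or None if it is unsupported."""
    # Assumption: extensions are matched case-insensitively.
    ext = PurePosixPath(relative_path).suffix.lower()
    return ROUTES.get(ext)
```

Returning None instead of falling back to a default skill keeps unsupported files visible, so they can be recorded as skipped rather than silently treated as grounded.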
Required Workflow
When using this skill, you must follow this workflow:
- First run the existing script entrypoint:
  bash .cursor/skills/archive-grounding/scripts/run.sh <input_zip> <output_root>
- The script generates an archive bundle skeleton and inventory files such as:
  - extracted.md
  - extracted_meta.json
  - manifest.json
  - routed_items.json
  - unpacked/
  - child_outputs/
- After the archive bundle is generated, read:
  - extracted.md
  - extracted_meta.json
  - manifest.json
  - routed_items.json
- Then enumerate the supported child files in the archive.
- For each supported child file, you must actually run the corresponding child skill. Do not stop after inventory / route planning.
- For child files inside the archive, all downstream child grounding outputs must be written inside the current archive bundle's child_outputs/ directory, not into the global grounding root.
- Use the recommended child output path recorded in routed_items.json when present.
- A child file is only considered completed if one of the following is true:
  - its corresponding child bundle was actually generated under child_outputs/, or
  - a concrete failure reason is explicitly recorded.
- Do not pretend that a child file was fully grounded merely because it was detected, inventoried, or assigned a route.
- Only after the supported child files have been processed (or explicit failures have been recorded) may you write the archive-level grounded.md.
- Do not create a placeholder grounded.md.
- The task is not complete if:
  - only the archive bundle skeleton exists,
  - manifest.json / routed_items.json exist but no child skills were actually run,
  - child bundles were written to the global grounding root instead of this archive bundle's child_outputs/, or
  - archive-level grounded.md was written before the child processing step was completed.
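The completion rule in the workflow above can be expressed as a per-item check. The sketch below is illustrative only: the keys child_output_path and failure_reason are hypothetical names for the fields of a routed item, and treating any non-empty directory as a generated child bundle is an assumption.

```python
from pathlib import Path


def is_completed(item: dict, bundle_dir: Path) -> bool:
    """A routed item counts as done only with a real child bundle or a recorded failure."""
    # Hypothetical keys: "child_output_path" (relative to the archive bundle)
    # and "failure_reason".
    if item.get("failure_reason"):
        return True
    rel = item.get("child_output_path")
    if not rel:
        return False
    child_dir = (bundle_dir / rel).resolve()
    outputs_root = (bundle_dir / "child_outputs").resolve()
    # The child bundle must live under this archive's child_outputs/ ...
    if outputs_root not in child_dir.parents:
        return False
    # ... and must actually contain something, not just be a planned path.
    return child_dir.is_dir() and any(child_dir.iterdir())
```

Under this sketch, the archive-level grounded.md would be written only once is_completed holds for every routed item.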
Input
bash .cursor/skills/archive-grounding/scripts/run.sh <input_zip> <output_root>
Arguments
- input_zip: path to a .zip archive
- output_root: parent directory under which the archive bundle should be created
Example:
bash .cursor/skills/archive-grounding/scripts/run.sh \
/path/to/materials.zip \
data/grounded_notes
This will create something like:
data/grounded_notes/archive-materials/
Output bundle contract
The script should create:
<archive_bundle>/
├─ extracted.md
├─ extracted_meta.json
├─ manifest.json
├─ routed_items.json
├─ unpacked/
├─ child_outputs/
└─ grounded.md # written later by the agent, not by the script
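As a rough illustration of the skeleton step, the sketch below creates the bundle directories and unpacks the archive with Python's standard zipfile module. It is not the contents of run.sh: the function name create_bundle_skeleton is hypothetical, and the "archive-<zip stem>" naming is an assumption inferred from the archive-materials example. It deliberately does not create grounded.md.

```python
import zipfile
from pathlib import Path


def create_bundle_skeleton(input_zip: Path, output_root: Path) -> Path:
    """Create <archive_bundle>/ with unpacked/ and child_outputs/, then unpack."""
    # Assumption: the bundle is named "archive-<zip stem>", matching the
    # "archive-materials" example; the real script may name it differently.
    bundle = output_root / f"archive-{input_zip.stem}"
    unpacked = bundle / "unpacked"
    unpacked.mkdir(parents=True, exist_ok=True)
    (bundle / "child_outputs").mkdir(exist_ok=True)

    with zipfile.ZipFile(input_zip) as zf:
        root = unpacked.resolve()
        for info in zf.infolist():
            # Refuse entries that would resolve outside unpacked/ (e.g. "../x").
            target = (unpacked / info.filename).resolve()
            if target != root and root not in target.parents:
                raise ValueError(f"unsafe path in archive: {info.filename}")
        zf.extractall(unpacked)
    return bundle
```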
File roles
extracted.md
Human-readable archive overview for the agent.
It should include:
- archive overview
- detected file inventory summary
- supported vs skipped items
- routed child skills
- the rule that child outputs must live under child_outputs/
extracted_meta.json
Global archive metadata, such as:
- source archive path
- archive id
- total file count
- supported file count
- skipped file count
- unpack directory
- generation timestamp
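One possible shape for this metadata is sketched below. All key names are hypothetical, as is the build_meta helper. Deriving the skipped count as total minus supported is an assumption, and the sketch expects each manifest item to carry a boolean supported flag.

```python
from datetime import datetime, timezone


def build_meta(source_archive: str, archive_id: str, unpack_dir: str,
               manifest: list[dict]) -> dict:
    """Assemble archive-level metadata from the file inventory."""
    supported = sum(1 for item in manifest if item.get("supported"))
    return {
        "source_archive_path": source_archive,
        "archive_id": archive_id,
        "total_file_count": len(manifest),
        "supported_file_count": supported,
        # Assumption: every file that is not supported counts as skipped.
        "skipped_file_count": len(manifest) - supported,
        "unpack_directory": unpack_dir,
        "generated_at": datetime.now(timezone.utc).isoformat(),
    }
```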
manifest.json
Machine-readable file inventory for unpacked contents.
Each item should include:
- relative path
- file name
- extension
- size
- detected type
- supported / unsupported
- skip reason if any
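The per-item fields above could be produced by a walk over unpacked/, as in the following sketch. The key names and the build_manifest helper are hypothetical, and using the bare extension as the detected type is a simplifying assumption; a real implementation might inspect file contents instead.

```python
from pathlib import Path

# Supported extensions, taken from the routing list in this document.
SUPPORTED = {".pdf", ".docx", ".md", ".txt", ".xlsx", ".csv", ".pptx",
             ".mp3", ".wav", ".m4a", ".mp4", ".mov", ".mkv"}


def build_manifest(unpacked_dir: Path) -> list[dict]:
    """Walk unpacked/ and describe every file, supported or not."""
    items = []
    for path in sorted(p for p in unpacked_dir.rglob("*") if p.is_file()):
        ext = path.suffix.lower()
        supported = ext in SUPPORTED
        items.append({
            "relative_path": path.relative_to(unpacked_dir).as_posix(),
            "file_name": path.name,
            "extension": ext,
            "size_bytes": path.stat().st_size,
            # Assumption: the type is inferred from the extension alone.
            "detected_type": ext.lstrip(".") or "unknown",
            "supported": supported,
            "skip_reason": None if supported else "unsupported extension",
        })
    return items
```

Unsupported files stay in the inventory with an explicit skip reason, which is what allows them to be reported later instead of disappearing.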
routed_items.json
Machine-readable routing plan for supported child files.
Each routed item should include:
- source relative path
- detected type
- routed skill
- status
- recommended child output path
- notes / failure reason if any
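A routed item with these fields might be built as follows. This is a sketch under stated assumptions: the key names, the "pending" status value, and the plan_route helper are all hypothetical, and flattening the source path into a single directory name is just one way to keep every child bundle directly under child_outputs/.

```python
def plan_route(relative_path: str, detected_type: str, skill: str) -> dict:
    """Build one routing entry for a supported child file."""
    # Assumption: path separators are flattened so that nested files map to
    # distinct directories directly under child_outputs/.
    slug = relative_path.replace("/", "__")
    return {
        "source_relative_path": relative_path,
        "detected_type": detected_type,
        "routed_skill": skill,
        "status": "pending",  # hypothetical initial status
        "child_output_path": f"child_outputs/{slug}",
        "notes": None,
    }
```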
child_outputs/
Container directory for all downstream child grounding bundles generated from supported child files in this archive.
Child grounding outputs must be stored here rather than in the global grounding root.
grounded.md schema (recommended)
After the child files have been processed and the child bundles exist, the agent should write a stable archive grounding note such as:
# Archive Grounding
## 1. Archive Overview
## 2. Included Materials
## 3. Successfully Processed Child Items
## 4. Key Signals Across Materials
## 5. Skipped / Unsupported / Failed Items
## 6. Suggested Next Steps
## 7. Search Keywords
This is an archive-level grounding note, not a polished final report.
Important rule on absent file types
Do not treat the absence of a file type as a risk by default.
Examples:
- if the archive simply does not contain PPTX files, do not mark that as a missing item
- if the archive simply does not contain audio or video files, do not mark that as a missing item
- if the archive simply does not contain tables, do not mark that as a missing item
Only report:
- actual unsupported items,
- actual skipped items,
- actual failed child processing steps,
- or actual inconsistencies between the archive contents and the generated child outputs.
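The reporting rule above amounts to deriving issues only from items that exist in the inventory, never from file types that are absent. A minimal sketch, assuming hypothetical key names (supported, skip_reason, status, notes) and a hypothetical "failed" status value:

```python
def reportable_issues(manifest: list[dict], routed: list[dict]) -> list[str]:
    """List only issues that actually occurred; absent file types are not issues."""
    issues = []
    routed_paths = {r["source_relative_path"] for r in routed}
    for item in manifest:
        path = item["relative_path"]
        if not item.get("supported"):
            reason = item.get("skip_reason") or "no reason recorded"
            issues.append(f"skipped: {path} ({reason})")
        elif path not in routed_paths:
            # A supported file with no routing entry is a real inconsistency.
            issues.append(f"inconsistent: {path} is supported but has no routed entry")
    for item in routed:
        if item.get("status") == "failed":
            reason = item.get("notes") or "no reason recorded"
            issues.append(f"failed: {item['source_relative_path']} ({reason})")
    return issues
```

An archive containing only a successfully processed PDF yields an empty list here, even though it has no slides, tables, or media.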
Quality bar
A good result means:
- the archive bundle exists
- the file inventory is correct
- every supported child file was either actually processed by its corresponding child skill or recorded with a concrete failure reason
- child bundles are stored under this archive bundle's child_outputs/
- the archive-level grounded.md is written only after child processing is complete
- the final archive note is useful as a stable intermediate artifact for downstream research / summary / report
A bad result means:
- the agent only unpacked and inventoried the archive
- the agent wrote archive-level grounded.md without running the child skills
- child outputs were scattered into the global grounding root
- unsupported or skipped files were silently treated as processed
- a placeholder grounded.md was created
Failure handling
If a supported child file cannot be processed successfully, record that explicitly.
Examples:
- child skill missing
- child script failed
- child bundle path not created
- file appears corrupted
- environment dependency missing
Do not hide child-processing failures behind a fake “archive success”.
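One way to make failures explicit is to wrap each child run so that an exception becomes a recorded reason. In the sketch below, runner is a hypothetical callable standing in for whatever actually invokes the child skill (this document does not specify that mechanism), and the status values and key names are likewise assumptions.

```python
def run_child(item: dict, runner) -> dict:
    """Run one child skill and record either success or a concrete failure reason."""
    # `runner` is a hypothetical callable that performs the child grounding
    # and raises on failure.
    try:
        runner(item)
    except Exception as exc:  # record the failure instead of swallowing it
        return {**item, "status": "failed", "notes": f"{type(exc).__name__}: {exc}"}
    # A fuller check would also confirm the child bundle exists under child_outputs/.
    return {**item, "status": "done", "notes": item.get("notes")}


def archive_status(results: list[dict]) -> str:
    """Summarize the run without hiding failures behind an overall success."""
    failed = sum(1 for r in results if r["status"] == "failed")
    if failed == 0:
        return "all child items processed"
    return f"{failed} of {len(results)} child items failed"
```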