---
name: content-filter
description: Filter and classify AI research content for relevance, topic, and author category. Use for bulk triage of raw content before detailed claim extraction.
---
# Content Filter Skill
Filter and classify incoming content for relevance to AI research intelligence. This skill is optimized for high-throughput bulk processing.
## Purpose
The content filter is the first stage of the extraction pipeline. It quickly assesses content to:
- Determine relevance to AI research discourse
- Classify by topic and content type
- Identify author category
- Filter out noise before expensive extraction
## Assessment Schema
For each piece of content, produce:
### 1. relevance (0.0-1.0)
How relevant is this to AI research intelligence?
| Score | Meaning |
|---|---|
| 0.9-1.0 | Highly relevant - substantial claims, predictions, or hints |
| 0.7-0.9 | Clearly relevant - discusses AI capabilities, progress, or debate |
| 0.5-0.7 | Moderately relevant - tangentially about AI or tech industry |
| 0.3-0.5 | Low relevance - may contain signal but mostly noise |
| 0.0-0.3 | Not relevant - personal, off-topic, or pure promotion |
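The score bands above can be mapped to labels with a small helper. This is an illustrative sketch, not part of the skill contract: the function name is ours, and assigning a shared endpoint such as 0.7 to the higher band is an assumption, since the table's ranges overlap at their boundaries.

```python
def relevance_band(score: float) -> str:
    """Map a relevance score to its band from the table above.

    Assumption: a shared endpoint (e.g. 0.7) falls in the higher band,
    since the table's ranges overlap at their boundaries.
    """
    if score >= 0.9:
        return "highly relevant"
    if score >= 0.7:
        return "clearly relevant"
    if score >= 0.5:
        return "moderately relevant"
    if score >= 0.3:
        return "low relevance"
    return "not relevant"
```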
### 2. topic
Primary topic category:
- `scaling`: Scaling laws, compute, training efficiency
- `reasoning`: LLM reasoning, chain-of-thought, planning
- `agents`: AI agents, tool use, autonomy
- `safety`: AI safety, alignment, control
- `interpretability`: Mechanistic interpretability
- `multimodal`: Vision, audio, video models
- `rlhf`: RLHF, preference learning, Constitutional AI
- `benchmarks`: Evals, benchmarks, capability measurement
- `infrastructure`: Training infra, chips, hardware
- `policy`: AI policy, regulation, governance
- `general`: General AI commentary
- `other`: Doesn't fit any other category
### 3. contentType
What kind of content is this?
- `prediction`: Forward-looking claims about AI
- `research-hint`: Suggests unreleased work or capabilities
- `opinion`: Positioned takes on AI progress/limitations
- `factual`: Reports on current state or recent events
- `critique`: Challenges claims or work by others
- `meta`: About the AI discourse itself
- `noise`: Not substantive (personal, promotion, etc.)
### 4. authorCategory
Who is the author?
- `lab-researcher`: Works at a major AI lab (Anthropic, OpenAI, DeepMind, Meta, xAI, etc.)
- `critic`: Known skeptic with credentials (Marcus, Chollet, Mitchell, Bender, etc.)
- `academic`: Academic researcher not at a major lab
- `independent`: Independent practitioner or commentator
- `journalist`: Tech journalist or media
- `unknown`: Cannot determine
### 5. isSubstantive (boolean)
Does this contain actual claims worth extracting?
- `true`: Contains specific assertions, predictions, or valuable signal
- `false`: Too general, vague, or promotional to extract claims from
### 6. brief
One sentence summary of the content (max 100 characters).
## Output Format
Return JSON:

```json
{
  "assessments": [
    {
      "itemIndex": 0,
      "relevance": 0.85,
      "topic": "reasoning",
      "contentType": "opinion",
      "authorCategory": "lab-researcher",
      "isSubstantive": true,
      "brief": "Claims chain-of-thought has hit diminishing returns"
    }
  ],
  "processingNotes": "Optional batch-level observations"
}
```
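A minimal validator can catch malformed assessments before they enter the pipeline. This is a sketch, not part of the skill contract: the enum sets mirror the schema above, and the `validate_assessment` name is illustrative.

```python
# Enum sets mirror the assessment schema defined above.
TOPICS = {"scaling", "reasoning", "agents", "safety", "interpretability",
          "multimodal", "rlhf", "benchmarks", "infrastructure", "policy",
          "general", "other"}
CONTENT_TYPES = {"prediction", "research-hint", "opinion", "factual",
                 "critique", "meta", "noise"}
AUTHOR_CATEGORIES = {"lab-researcher", "critic", "academic", "independent",
                     "journalist", "unknown"}

def validate_assessment(a: dict) -> list[str]:
    """Return a list of problems; an empty list means the assessment is valid."""
    problems = []
    if not isinstance(a.get("itemIndex"), int):
        problems.append("itemIndex must be an integer")
    r = a.get("relevance")
    if not isinstance(r, (int, float)) or not 0.0 <= r <= 1.0:
        problems.append("relevance must be in [0.0, 1.0]")
    if a.get("topic") not in TOPICS:
        problems.append("unknown topic")
    if a.get("contentType") not in CONTENT_TYPES:
        problems.append("unknown contentType")
    if a.get("authorCategory") not in AUTHOR_CATEGORIES:
        problems.append("unknown authorCategory")
    if not isinstance(a.get("isSubstantive"), bool):
        problems.append("isSubstantive must be a boolean")
    brief = a.get("brief", "")
    if not isinstance(brief, str) or len(brief) > 100:
        problems.append("brief must be a string of at most 100 characters")
    return problems
```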
## Quick Classification Heuristics
### High Relevance (0.7-1.0)
- Contains specific claims about AI capabilities
- Predictions with timeframes
- Technical discussion of methods/results
- Critique with reasoning
- Hints about unreleased work
- Debates between researchers
### Medium Relevance (0.4-0.7)
- General commentary on AI field
- Sharing papers/articles with brief comment
- Reactions to announcements
- Meta-discussion about discourse
- Industry news without analysis
### Low Relevance (0.0-0.4)
- Personal updates unrelated to AI
- Off-topic content
- Pure promotion without substance
- Scheduling/logistics
- Simple retweets without commentary
- "Interesting paper" without substantive comment
## Author Detection Tips
### Lab Researchers
Look for:
- Bio mentions: Anthropic, OpenAI, DeepMind, Google Brain, Meta AI, xAI, Mistral
- Known handles: @daborenstein, @sama, @kaborl, etc.
- Technical depth suggesting insider knowledge
### Critics
Known handles and patterns:
- @garymarcus, @fchollet, @mmitchell_ai, @emilymbender
- Pattern of challenging mainstream AI claims
- Academic credentials combined with public skepticism
### Independent
- No lab affiliation
- Often practitioners or commentators
- Examples: @simonw, @drjimfan, @nathanlambert
## Processing Guidelines
### Speed Over Depth
This skill is for throughput. Make quick assessments based on:
- Keywords and phrases
- Author identity (if known)
- Content structure
- Obvious signals
### Conservative Filtering
When in doubt about relevance:
- Score 0.3-0.5 to keep for human review
- Don't filter out potentially valuable content
- False positives are okay; false negatives lose signal
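As a sketch, the conservative policy above might look like this. The 0.3 review floor follows the guidance above; the 0.7 extraction cutoff, the `triage` name, and the decision labels are illustrative assumptions.

```python
def triage(relevance: float, is_substantive: bool) -> str:
    """Map an assessment to a pipeline decision, erring toward keeping content."""
    if relevance >= 0.7 and is_substantive:
        return "extract"        # clearly relevant with real claims (0.7 cutoff is an assumption)
    if relevance >= 0.3:
        return "human-review"   # uncertain: keep for review rather than lose signal
    return "drop"               # confidently noise
```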
### Batch Efficiency
When processing batches:
- Process items in order
- Output assessments matching input order
- Note any batch-level patterns in processingNotes
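A batch wrapper that preserves input order might look like the following sketch, where `assess` stands in for the per-item model call (all names here are illustrative, not part of the skill contract).

```python
import json

def assess_batch(items: list[str], assess) -> str:
    """Assess each item in order and return the batch JSON described above.

    `assess` is a stand-in for the per-item model call; it should return a
    dict in the assessment schema (without itemIndex, which is set here).
    """
    assessments = []
    for i, text in enumerate(items):
        a = assess(text)
        a["itemIndex"] = i  # output order mirrors input order
        assessments.append(a)
    return json.dumps({
        "assessments": assessments,
        "processingNotes": "",  # optional batch-level observations
    })
```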