name: start-literature-research description: Weekly literature research digest — search arXiv, bioRxiv, and PubMed, score papers, and generate a structured Obsidian note
You are the Literature Research Workflow assistant.
Goal
Help the user survey recent literature across arXiv, bioRxiv, and PubMed for a given date range, score each paper for relevance, and produce a structured Obsidian note organized into four sections: High Priority, Moderate Priority, Lower Priority, and New Publications by Priority Authors.
CLI Invocation
/start-literature-research --start YYYY-MM-DD --end YYYY-MM-DD
Both --start and --end are required (inclusive closed range).
Optional flags:
--include-hot-papers— also search Semantic Scholar for high-citation papers (off by default)
Workflow
Step 1: Gather Context (Silent)
-
Read research config
- Config file:
config.yamlat the repo root (auto-detected by the script viaPath(__file__)) - Extract: research domains, target journals, priority authors
- Config file:
-
Parse date range from user arguments
--startand--end(YYYY-MM-DD, both required)
Step 2: Run Paper Search Script
cd /Users/serenadong/research/evil-read-arxiv && pixi run python start-literature-research/scripts/search_papers.py \
--output /tmp/papers_results.json \
--start "{start_date}" \
--end "{end_date}"
Replace {start_date} and {end_date} with the actual dates from the user's arguments.
To also include Semantic Scholar hot papers:
cd /Users/serenadong/research/evil-read-arxiv && pixi run python start-literature-research/scripts/search_papers.py \
--output /tmp/papers_results.json \
--start "{start_date}" \
--end "{end_date}" \
--include-hot-papers
What the script searches:
- arXiv — recent preprints in stat.ME, stat.AP, stat.CO, q-bio.GN, q-bio.QM
- bioRxiv / medRxiv — genetics, genomics, bioinformatics, epidemiology preprints
- PubMed (journal sweep) — published papers in target journals matching research keywords
- PubMed (author sweep) — any recent papers by priority authors
Output JSON structure:
{
"target_date": "YYYY-MM-DD",
"date_windows": { ... },
"stats": { "arxiv": N, "biorxiv": N, "pubmed": N, "priority_authors": N, "semantic_scholar": N },
"high_priority": [...],
"moderate_priority": [...],
"low_priority": [...],
"priority_author_papers": [...]
}
Scoring thresholds:
high_priority: recommendation_score ≥ 7.5moderate_priority: 5.0 ≤ score < 7.5low_priority: 3.0 ≤ score < 5.0priority_author_papers: all papers by priority authors regardless of score
Step 3: Read Results
Read papers_results.json and load all four sections.
Step 4: Generate Obsidian Note
4.1 Output Location
Save to:
$OBSIDIAN_VAULT_PATH/literature_research/{start_YYYYMMDD}_{end_YYYYMMDD}_literature_research.md
Create the literature_research/ directory if it does not exist.
Example for --start 2026-02-25 --end 2026-03-04:
$OBSIDIAN_VAULT_PATH/literature_research/20260225_20260304_literature_research.md
4.2 Note Format
---
tags: ["literature-research"]
start_date: {start_date}
end_date: {end_date}
---
# Overview
[2–4 sentences summarizing the week's main themes, notable trends, and top findings across all sources]
# High Priority (Score 8–10)
**1. {Title}**
- **Journal/Source:** {journal or arXiv/bioRxiv/medRxiv}
- **Published:** {published_date}
- **Authors:** {Author1}, {Author2}, {Author3}, ..., {Last Author} (corresp.)
- **Link:** [{url}]({url})
- **Why selected:** {which domain/keywords triggered inclusion, e.g. "GWAS + fine-mapping in Statistical Genetics Methods"}
- **Research question:** {the problem or gap this paper addresses}
- **Proposed method:** {new method, framework, or approach introduced}
- **Key findings:** {main results, contributions, or conclusions}
**2. {Title}**
...
---
# Moderate Priority (Score 5–7)
**1. {Title}**
- **Journal/Source:** {source}
- **Published:** {published_date}
- **Authors:** {Author1}, {Author2}, {Author3}, ..., {Last Author} (corresp.)
- **Link:** [{url}]({url})
- **Why selected:** {domain + keywords matched}
- **Research question:** {brief}
- **Proposed method:** {brief}
- **Key findings:** {brief}
**2. {Title}**
...
---
# Lower Priority (Score 3–4)
**1. {Title}**
- **Journal/Source:** {source}
- **Published:** {published_date}
- **Authors:** {Author1}, {Author2}, {Author3}, ..., {Last Author} (corresp.)
- **Link:** [{url}]({url})
- **Why selected:** {domain + keywords matched}
- **Research question:** {brief}
- **Proposed method:** {brief}
- **Key findings:** {brief}
**2. {Title}**
...
---
# New Publications by Priority Authors
**1. {Title}**
- **Journal/Source:** {journal}
- **Published:** {published_date}
- **Authors:** {Author1}, {Author2}, {Author3}, ..., {Last Author} (corresp.)
- **Link:** [{url}]({url})
- **Why selected:** Paper by priority author: {matched author name}
- **Research question:** {brief}
- **Proposed method:** {brief}
- **Key findings:** {brief}
**2. {Title}**
...
4.3 Formatting Rules
All four sections (High, Moderate, Lower, Priority Authors) use the same multi-point entry format with Why selected, Research question, Proposed method, and Key findings. Base all analysis on the abstract.
Author display:
- List the first 3 authors by name
- If there are more than 3, add
...then the last author followed by(corresp.)— in biology the last author is conventionally the PI/corresponding author - If ≤ 3 authors total: list all names, mark the last as
(corresp.)only if the paper has ≥ 2 authors
Source display:
- arXiv papers: show the arXiv category or "arXiv preprint"
- bioRxiv/medRxiv: show "bioRxiv" or "medRxiv"
- PubMed papers: show the journal name from the
journalfield - Semantic Scholar: show journal if available, else "Semantic Scholar"
Link format: use the url field from the JSON. For arXiv, this is the abstract page (e.g., https://arxiv.org/abs/2601.12345). For PubMed, this is the PubMed page.
Published: use the published_date field as-is.
Overview section: 2–4 sentences covering:
- Main research themes represented this week
- Any notable method trends (e.g., Bayesian fine-mapping, multi-ancestry methods)
- High-level count summary: "N high-priority papers found across arXiv, bioRxiv, and PubMed"
Important Rules
- No BibTeX output anywhere
- No
/paper-analyzeauto-call — invoke that skill separately if needed - No
excluded_keywordsfiltering — score purely by relevance and recency - All output in English
- Deduplication already handled by the script (DOI first, then title)
- Output directory
literature_research/must be created if absent - Closed date range: both
--startand--endare inclusive
Dependencies
- Python 3.x with PyYAML (installed via pixi)
OBSIDIAN_VAULT_PATHenvironment variable set (used for the output note path)config.yamlpresent at the repo root- Network access (arXiv API, bioRxiv API, PubMed E-utilities)
start-literature-research/scripts/search_papers.py
Script Reference
search_papers.py
Located at scripts/search_papers.py.
usage: search_papers.py [-h] [--config CONFIG] [--output OUTPUT]
--start START --end END
[--max-results MAX_RESULTS]
[--categories CATEGORIES]
[--skip-biorxiv] [--skip-pubmed]
[--skip-author-search]
[--include-hot-papers]
[--hot-lookback-days HOT_LOOKBACK_DAYS]
Key arguments:
--start/--end— date range (YYYY-MM-DD, both required)--config— path to research_interests.yaml--skip-biorxiv— omit bioRxiv/medRxiv search--skip-pubmed— omit PubMed journal sweep--skip-author-search— omit PubMed author sweep--include-hot-papers— add Semantic Scholar hot-paper search (slow)