name: transcribe description: >- Transcribe audio files (podcasts, MP3s, interviews) using OpenAI Whisper. Use when the user wants to transcribe a podcast, audio file, or MP3. Also use when asked to "listen to" a podcast or audio. license: MIT metadata: author: spatie version: "0.0.1"
Audio Transcription
Transcribe audio files using OpenAI Whisper (installed via Homebrew).
Prerequisites
whisperCLI must be installed:brew install openai-whisper
How to transcribe
1. Get the audio file
If given a URL, download it:
curl -L -o /tmp/audio-file.mp3 "URL_HERE"
If given a podcast name/episode, search for the RSS feed or episode page to find the MP3 URL. Podcast hosting platforms like Transistor, Buzzsprout, Libsyn typically have direct MP3 URLs in their episode pages.
2. Run Whisper
whisper /tmp/audio-file.mp3 --model small --language en --output_dir /tmp/whisper-output --output_format txt
Available models (speed vs accuracy tradeoff):
tiny- fastest, least accuratebase- fast, decent accuracysmall- good balance (recommended default)medium- slower, better accuracylarge- slowest, best accuracy
For non-English audio, omit the --language flag or specify the correct language code.
3. Read the output
The transcript will be at /tmp/whisper-output/audio-file.txt
Tips
- For long files (1h+), use
smallmodel to keep it reasonable - For short files or when accuracy matters, use
mediumorlarge - The first run downloads the model weights, subsequent runs are faster
- Output formats available: txt, vtt, srt, tsv, json
- For subtitles, use
--output_format srt