|
|
|
Show HN: A JSON API for YouTube Transcript with MCP Support
(transcriptapi.com)
|
|
2 points
by nikhonit
171 days ago
|
|
I built this because running yt-dlp in production (especially on serverless/Vercel) is a nightmare of IP blocks, cold starts, and binary dependencies. TranscriptAPI is a lightweight wrapper that handles the extraction, formatting, and proxy rotation. It prioritizes manual captions over auto-generated ones and returns clean JSON with timestamps, ready for RAG pipelines. The MCP (Model Context Protocol) Integration: I recently added native MCP support. If you use Claude Desktop or other MCP-compliant agents, you can add this API as a tool to 'watch' videos directly in your chat context without manually copying transcripts. Technical Stack: Backend: Python (FastAPI) on AWS Lambda (for burst scaling) Caching: Redis (to prevent hitting YouTube for the same video twice) Challenge: Handling 'drifting' timestamps in long livestreams where the auto-generated captions lose sync with the video frame. It has a free tier for hobbyists. I’m curious to hear how you’re handling the context-window limits when feeding full 3-hour transcripts to LLMs |
|