by panchamk
305 days ago
Hi HN! I built this because I was tired of watching long videos when I just needed the information.

Technical highlights:
- Runs OpenAI's Whisper entirely in the browser using ONNX/WebAssembly
- Videos never leave your device - all processing is client-side
- Smart YouTube caption extraction (faster than AI when available)
- Works with YouTube, Twitter/X, direct URLs, and local files
- Three model sizes to match your device's capabilities

The biggest challenge was fitting Whisper models into browser memory constraints. I used quantized ONNX models and implemented chunked processing to handle large videos without OOM errors.

Stack: Next.js, Transformers.js, FFmpeg.wasm, TypeScript

Would love to hear your thoughts on:
- WebAssembly performance optimizations
- Handling larger models in browser memory
- The privacy-first approach vs server-side processing

Happy to answer any questions!
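
For anyone curious what the chunked processing mentioned above can look like, here is a minimal sketch of the general idea, not the app's actual code. It assumes the audio has already been decoded to mono PCM samples (Whisper models expect 16 kHz input and work on windows of about 30 seconds). The names `chunkAudio` and `AudioChunk`, and the 30 s / 5 s defaults, are hypothetical and chosen for illustration. The function splits a long buffer into overlapping windows, so only one window needs to go to the model at a time.

```typescript
// Hypothetical helper, not taken from the project: describes one window of audio.
interface AudioChunk {
  startSample: number;   // index of the window's first sample in the source buffer
  samples: Float32Array; // view onto the source buffer (subarray does not copy)
}

// Split mono PCM audio into fixed-length windows that overlap by overlapSeconds.
// The overlap gives a transcriber some shared context at each window boundary.
function chunkAudio(
  audio: Float32Array,
  sampleRate: number,
  chunkSeconds: number = 30, // assumed default, matching Whisper's window size
  overlapSeconds: number = 5 // assumed default, illustrative only
): AudioChunk[] {
  const chunkLen = Math.floor(chunkSeconds * sampleRate);
  const step = chunkLen - Math.floor(overlapSeconds * sampleRate);
  if (chunkLen <= 0 || step <= 0) {
    throw new RangeError("chunkSeconds must be positive and greater than overlapSeconds");
  }

  const chunks: AudioChunk[] = [];
  for (let start = 0; start < audio.length; start += step) {
    const end = Math.min(start + chunkLen, audio.length);
    chunks.push({ startSample: start, samples: audio.subarray(start, end) });
    if (end === audio.length) {
      break; // the final window reached the end of the audio
    }
  }
  return chunks;
}
```

Each window can then be transcribed in turn and the overlapping text merged, which keeps the size of each model input bounded regardless of video length. Transformers.js offers a similar mechanism through the `chunk_length_s` and `stride_length_s` options of its speech-recognition pipeline, so a hand-rolled splitter like this is mainly useful when you want to control decoding and memory yourself.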