by gcormier 93 days ago
In playing with the exact same use case, I was blown away by how well Gemini (2.5 Flash, IIRC) transcribed podcasts with speaker identification and handled common "overlaps" in conversations. I can't remember which local Ollama models I tried, but I was not very impressed.

Yeah, Gemini is really strong at speaker separation and handling overlaps.

I’m taking a local-first approach (privacy, offline, no cost), using Faster-Whisper