by userhacker 1204 days ago
I suggest you give revoldiv.com a try. We use Whisper and other models together. You can upload very large files and get an hour-long file transcribed in less than 30 seconds. We use intelligent chunking so that the model doesn't lose context. We are looking to increase the limit even more in the coming weeks. It's also free to transcribe any video/audio with word-level timestamps.
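For readers wondering what chunking for long audio can look like, here is a minimal sketch. It is not Revoldiv's method, which the comment does not describe. It only shows the common idea of cutting a long recording into overlapping windows so that words near a boundary appear in two chunks. The function name `chunk_spans` and the 30 s / 5 s defaults are assumptions for illustration (Whisper itself processes audio in 30-second windows).

```python
from typing import List, Tuple


def chunk_spans(
    total_s: float, chunk_s: float = 30.0, overlap_s: float = 5.0
) -> List[Tuple[float, float]]:
    """Split [0, total_s] into windows of up to chunk_s seconds.

    Consecutive windows share overlap_s seconds of audio. Hypothetical helper,
    not taken from any real service or library.
    """
    if chunk_s <= 0 or not 0 <= overlap_s < chunk_s:
        raise ValueError("need chunk_s > 0 and 0 <= overlap_s < chunk_s")

    spans: List[Tuple[float, float]] = []
    step = chunk_s - overlap_s  # how far each new window advances
    start = 0.0
    while start < total_s:
        end = min(start + chunk_s, total_s)
        spans.append((start, end))
        if end >= total_s:  # last window reached the end of the audio
            break
        start += step
    return spans
```

Each span could then be transcribed independently (and in parallel), with duplicate words in the overlap merged afterwards. A smarter variant would place boundaries at pauses instead of fixed offsets.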

I just gave it a try, and the results are impressive! Do you also offer an API?
If you're interested in an offline / local solution: I made a Mac App that uses Whisper.cpp and Voice Activity Detection to skip silence and reduce Whisper hallucinations: https://apps.apple.com/app/wisprnote/id1671480366

If it really works for you, I can add command-line params in an update, so you can use it as a "local API" for free.
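To illustrate the silence-skipping idea in general terms, here is a rough sketch of an energy-based voice activity detector. It is an assumption-laden toy, not how the app above works. Practical VADs are usually model-based and more robust. The function name `speech_regions`, the frame length, and the threshold are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def speech_regions(
    samples: Sequence[float], frame_len: int = 160, threshold: float = 0.02
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of regions whose RMS energy
    is at or above threshold. Samples are assumed to lie in [-1.0, 1.0]."""
    regions: List[Tuple[int, int]] = []
    region_start = None
    for i in range(0, len(samples), frame_len):
        frame = samples[i : i + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        if rms >= threshold:
            if region_start is None:
                region_start = i  # a loud frame opens a new region
        elif region_start is not None:
            regions.append((region_start, i))  # a quiet frame closes it
            region_start = None
    if region_start is not None:
        regions.append((region_start, len(samples)))
    return regions
```

Only the returned regions would be passed to the transcription model, so it never sees long stretches of silence, which is where hallucinated text tends to appear.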

Contact us at team@revoldiv.com. We are offering an API on a case-by-case basis.