| HN Mirror


	by mikeravkine 1028 days ago
	One caveat here is that whisper.cpp does not offer any CUDA support at all; acceleration is only available for Apple Silicon. If you have Nvidia hardware, the CTranslate2-based faster-whisper is very fast: https://github.com/guillaumekln/faster-whisper
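
	For readers who want to try this, here is a minimal sketch of GPU transcription with faster-whisper, assuming the package is installed and a CUDA device is present. The helper names `pick_compute_type` and `transcribe_file`, the default model size, and the float16/int8 choice are illustrative assumptions, not something stated in the comment.

```python
def pick_compute_type(device: str) -> str:
    """Pick a CTranslate2 compute type: float16 on GPU, int8 on CPU (assumed defaults)."""
    return "float16" if device == "cuda" else "int8"


def transcribe_file(path: str, model_size: str = "small", device: str = "cuda"):
    """Transcribe one audio file and return a list of (start, end, text) tuples."""
    # Imported lazily so pick_compute_type works without faster-whisper installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(
        model_size, device=device, compute_type=pick_compute_type(device)
    )
    segments, _info = model.transcribe(path, beam_size=5)
    # `segments` is a generator; the actual decoding runs as it is consumed.
    return [(seg.start, seg.end, seg.text.strip()) for seg in segments]
```

	Calling `transcribe_file("audio.wav")` (a hypothetical file name) would return the timed segments; passing `device="cpu"` falls back to int8 on the CPU.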

3 comments

kkielhofner 1028 days ago

ctranslate2 is amazing, I don’t know why it doesn’t get more attention.

We use it for our Willow Inference Server, which has an API that can be used directly like the OP's project and supports all Whisper models, TTS, etc.:

https://github.com/toverainc/willow-inference-server

The benchmarks are pretty incredible (largely thanks to ctranslate2).

stavros 1028 days ago

Obligatory hooking up of Willow to ChatGPT, for the best virtual assistant currently available:

https://twitter.com/Stavros/status/1693204822042739124

rebeccaskinner 1028 days ago

I haven’t used faster-whisper so I can’t compare performance, but whisper.cpp does support CUDA via cuBLAS, and it’s noticeably faster than the CPU version. I used it earlier this year to generate subtitles for 6 seasons of an old TV show I backed up from DVD that didn’t include subtitles on the disc.
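
The comment does not say how the subtitle files were produced. As a rough sketch of the last step of such a workflow, the following turns timed transcript segments into SRT text. The `(start, end, text)` tuple shape and the function names `srt_timestamp` and `to_srt` are assumptions made for illustration.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments) -> str:
    """Render (start, end, text) tuples as numbered SRT cues separated by blank lines."""
    cues = []
    for index, (start, end, text) in enumerate(segments, start=1):
        cues.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    return "\n".join(cues)
```

Writing the returned string to a `.srt` file next to each episode is usually enough for a media player to pick it up.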

inciampati 1028 days ago

Thanks for the Nvidia based implementation!

FWIW, decent acceleration works on any AVX2-capable CPU. I get realtime speed for everything but the large models on a recent Ryzen system. Apple Silicon is good, but not as special as folks think!
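
"Realtime speed" here presumably means transcription finishes no slower than the audio plays. One common way to put a number on that is the real-time factor (processing time divided by audio duration). A small sketch, with hypothetical function names:

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Return processing time divided by audio duration; below 1.0 is faster than realtime."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def is_realtime(processing_seconds: float, audio_seconds: float) -> bool:
    """True when transcription keeps up with (or outpaces) playback."""
    return real_time_factor(processing_seconds, audio_seconds) <= 1.0
```

For example, transcribing a 60-second clip in 30 seconds gives a factor of 0.5, which counts as realtime under this definition.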