by filterfiber 947 days ago
I don't think the article mentions it: how well do the RPi 4 and 5 do for inference with Whisper, especially v3?
3 comments

v3 only comes in one flavor: large.

I don’t think you’re going to have a good time running the large model on a Pi of any kind.

The large models are 32x slower than the tiny models, roughly.[0]

I just tested, and whisper.cpp on my Pi 4 can transcribe the 30-second a13.wav sample (“make samples” to fetch it) in 18.5 seconds.

You can do the math… 32 × 18.5 seconds ≈ 10 minutes to transcribe 30 seconds of audio with the large model. Not a good time for most people.

The Pi 5 could be 2x to 3x faster.
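
For anyone who wants to plug in their own numbers, here is a rough sketch of that back-of-the-envelope estimate. It assumes the 18.5-second measurement was taken with the tiny model (as the comment implies), that the README's ~32x ratio carries over to whisper.cpp on a Pi, and that the Pi 5 speedup is a flat 2x to 3x. The function and constant names are made up for illustration.

```python
# Back-of-the-envelope estimate only; all inputs come from the comment above.
TINY_SECONDS_PI4 = 18.5  # measured: 30 s sample on a Pi 4 (assumed to be the tiny model)
LARGE_SLOWDOWN = 32      # rough tiny-to-large speed ratio cited from the Whisper README
AUDIO_SECONDS = 30       # length of the sample clip


def estimate_large_seconds(tiny_seconds, slowdown=LARGE_SLOWDOWN, speedup=1.0):
    """Scale a tiny-model timing up to the large model, then apply a hardware speedup."""
    return tiny_seconds * slowdown / speedup


pi4 = estimate_large_seconds(TINY_SECONDS_PI4)
print(f"Pi 4: {pi4:.0f} s (~{pi4 / 60:.1f} min), {pi4 / AUDIO_SECONDS:.1f}x slower than real time")

# Hypothetical Pi 5 range, assuming the 2x to 3x guess holds for the large model too.
for speedup in (2, 3):
    pi5 = estimate_large_seconds(TINY_SECONDS_PI4, speedup=speedup)
    print(f"Pi 5 at {speedup}x: {pi5:.0f} s (~{pi5 / 60:.1f} min)")
```

Even at the optimistic end, that is still several minutes per 30 seconds of audio, so the conclusion does not change.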

[0]: https://github.com/openai/whisper/blob/main/README.md#availa...

I can confirm that we're seeing a 2x to 3x speedup (RPi 4 vs RPi 5) in some of our early tests.
Yes. Fine-tuning a Whisper model on an RPi 5 is ~2x faster than on the RPi 4. Other stages involving data pre-processing with HF Datasets are again 2x-3x faster.
One of the Flower maintainers here. We're planning to follow up with a more in-depth performance comparison soon.
I’m also interested in people’s experiences. I’d expect decent performance: Whisper 3 has many model sizes, down to 35 MB, iirc. Training, and especially inference, should be doable on a Pi 5.
> Whisper 3 has many model sizes

Nitpick, but important: Whisper v2 and v3 are large only. It's actually the same Whisper; only the version of the large model (large-v2, large-v3) has been updated.

All of the other model sizes are the original release.

I reread your comment multiple times and still don’t understand the important nitpick. Are you saying that the smaller models haven’t been updated alongside the Whisper 3 release? That makes the most sense to me, but I don’t want to misinterpret what you mean!
They released only the "large" model for both v2 and v3; the tiny model is v1.
Yes. The example uses Whisper-tiny, which has 39M parameters, a perfect match for the downstream task of keyword spotting. Just one line needs to be changed in the code to run a larger Whisper model :)
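
To illustrate what a one-line model swap could look like, here is a minimal sketch. It is not the code from the Flower example: it assumes the model is loaded through Hugging Face `transformers` from the `openai/whisper-<size>` checkpoints, and `MODEL_SIZE`, `hub_id`, and `load_model` are hypothetical names. The parameter counts are the ones listed in the Whisper README.

```python
# Approximate parameter counts (in millions) from the openai/whisper README.
WHISPER_PARAMS_M = {
    "tiny": 39,
    "base": 74,
    "small": 244,
    "medium": 769,
    "large-v3": 1550,
}

MODEL_SIZE = "tiny"  # the single line to change, e.g. to "base" or "small"


def hub_id(size: str) -> str:
    """Build the Hugging Face Hub id for a Whisper checkpoint of the given size."""
    if size not in WHISPER_PARAMS_M:
        raise ValueError(f"unknown Whisper size: {size!r}")
    return f"openai/whisper-{size}"


def load_model(size: str = MODEL_SIZE):
    """Download and load the checkpoint (needs `transformers` and network access)."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import WhisperForConditionalGeneration

    return WhisperForConditionalGeneration.from_pretrained(hub_id(size))


print(hub_id(MODEL_SIZE), f"~{WHISPER_PARAMS_M[MODEL_SIZE]}M parameters")
```

On a Pi, the parameter count is a quick proxy for whether a size is worth trying: each step up the table costs noticeably more memory and time.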