| HN Mirror



	by filterfiber 947 days ago
	I don't think the article mentions it: how well do the RPi 4 and 5 do for inference with Whisper, especially v3?

3 comments

coder543 947 days ago

v3 only comes in one flavor: large.

I don’t think you’re going to have a good time running the large model on a Pi of any kind.

The large models are 32x slower than the tiny models, roughly.[0]

I just tested, and whisper.cpp on my Pi 4 can transcribe the 30-second a13.wav sample (“make samples” to fetch it) in 18.5 seconds.

You can do the math… 32 x 18.5 seconds is roughly 10 minutes to transcribe 30 seconds of audio with the large model. Not a good time for most people.

The Pi 5 could be 2x to 3x faster.
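
For anyone who wants to plug in their own numbers, here is a rough sketch of that estimate. It only uses the figures quoted in this comment (18.5 seconds measured on the Pi 4, the rough 32x ratio, and a guessed 2x to 3x Pi 5 speedup). It assumes runtime scales linearly with the ratio and ignores memory limits, so treat it as a ballpark rather than a benchmark. The constant and function names are made up for illustration.

```python
# Back-of-the-envelope estimate; every input is a figure quoted in this comment.
MEASURED_SECONDS_ON_PI4 = 18.5  # whisper.cpp on a Pi 4, 30-second a13.wav sample
LARGE_SLOWDOWN = 32             # rough large-vs-tiny ratio from the Whisper README
AUDIO_SECONDS = 30.0            # length of the sample


def estimate_large_seconds(speedup: float = 1.0) -> float:
    """Estimated seconds for the large model, given a hardware speedup over the Pi 4.

    Assumes linear scaling and that the model fits in memory (both are assumptions).
    """
    return MEASURED_SECONDS_ON_PI4 * LARGE_SLOWDOWN / speedup


pi4 = estimate_large_seconds()
print(f"Pi 4: {pi4:.0f} s ({pi4 / 60:.1f} min), "
      f"{pi4 / AUDIO_SECONDS:.1f}x slower than real time")

# Hypothetical Pi 5 speedups of 2x and 3x
for speedup in (2.0, 3.0):
    t = estimate_large_seconds(speedup)
    print(f"Pi 5 at {speedup:.0f}x: {t:.0f} s ({t / 60:.1f} min)")
```

On the Pi 4 this works out to 592 seconds, just under 10 minutes. Even with an optimistic 3x speedup, the Pi 5 estimate is still more than 3 minutes for 30 seconds of audio.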

[0]: https://github.com/openai/whisper/blob/main/README.md#availa...


danieljanes 947 days ago

I can confirm that we're seeing the RPi 5 run 2x to 3x faster than the RPi 4 in some of our early tests.


jafermarq 947 days ago

Yes. Fine-tuning a Whisper model on an RPi 5 is ~2x faster than on the RPi 4. Other stages involving data pre-processing with HF datasets are again 2x-3x faster.


danieljanes 947 days ago

One of the Flower maintainers here. We're planning to follow up with a more in-depth performance comparison soon.


a_wild_dandan 947 days ago

I’m also interested in people’s experience. I’d expect decent performance: Whisper 3 has many model sizes, down to 35Mb, iirc. Training, and especially inference, should be doable on a Pi 5.


kkielhofner 947 days ago

> Whisper 3 has many model sizes

Nitpick but important - Whisper v2 and v3 are large only. It's actually the same Whisper but the version of the model (large-v2, large-v3) has been updated.

All of the other model sizes are the original release.


a_wild_dandan 946 days ago

I reread your comment multiple times and still don’t understand the important nitpick. Are you saying that the smaller models haven’t been updated alongside the Whisper 3 release? That makes the most sense to me, but I don’t want to misinterpret what you mean!


account17 946 days ago

They only released the "large" model for both v2 and v3; the tiny model is v1.


jafermarq 947 days ago

Yes. The example uses Whisper-tiny, which has 39M parameters, a perfect match for the downstream task of keyword spotting. Just one line needs to be changed in the code to run a larger Whisper model :)
