| HN Mirror



	by AustinDev 128 days ago
	Audio models are also tiny, which is probably why small labs are doing well in the space. I run a LoRA'd Whisper v3 Large for a client. We can fit 4 versions of the model in memory at once on a ~$1/hr A10 and still have half the VRAM left over. Each of the LoRA tunes we did took roughly 2-3 hours on the same A10 instance.
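A rough back-of-the-envelope check of the memory claim above, as a sketch only. It assumes each version is a full fp16 copy of the weights, uses the roughly 1.55 billion parameter count published for the Whisper large family, and takes 24 GB as the A10's VRAM. It ignores activations, decoding caches, and CUDA overhead, so real usage will be somewhat higher; the constant and function names are illustrative, not from the comment.

```python
# Assumptions (not stated in the comment): fp16 weights, full copies per version.
WHISPER_LARGE_PARAMS = 1.55e9   # approximate published parameter count
BYTES_PER_PARAM_FP16 = 2        # 16-bit floats
A10_VRAM_GB = 24                # NVIDIA A10 memory size
NUM_COPIES = 4                  # "4 versions of the model" from the comment


def weights_gb(params: float, bytes_per_param: int) -> float:
    """Size of the raw weights in decimal gigabytes."""
    return params * bytes_per_param / 1e9


per_copy_gb = weights_gb(WHISPER_LARGE_PARAMS, BYTES_PER_PARAM_FP16)
total_gb = per_copy_gb * NUM_COPIES
fraction_used = total_gb / A10_VRAM_GB

print(f"per copy: {per_copy_gb:.1f} GB")
print(f"{NUM_COPIES} copies: {total_gb:.1f} GB "
      f"({fraction_used:.0%} of {A10_VRAM_GB} GB)")
```

Under these assumptions the four copies come to about half of the card, which lines up with the figure in the comment. If the LoRA adapters were kept separate and shared one base model, the footprint would be smaller still.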


freedomben 128 days ago

Is Whisper still getting nontrivial development? I was under the impression that it had stagnated, but it's hard to find anything more than rumors.


AustinDev 127 days ago

My ~1.7% WER and faster-than-realtime processing in my application make it more than adequate. My application is multi-speaker, with speaking rates above 300 WPM for long durations.
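For readers unfamiliar with the two figures quoted above, here is a minimal sketch of how they are commonly computed. WER is taken as the word-level edit distance (substitutions, insertions, deletions) divided by the number of reference words, and "faster than realtime" as a real-time factor below 1. This is not the commenter's evaluation code; the function names are hypothetical, and real evaluations usually normalize casing and punctuation first, which this sketch skips.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Values below 1.0 mean transcription runs faster than realtime."""
    return processing_seconds / audio_seconds


if __name__ == "__main__":
    wer = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
    print(f"WER: {wer:.3f}")
    print(f"RTF: {real_time_factor(30.0, 60.0):.2f}")
```

A WER of 1.7% on this definition means roughly 17 word errors per 1,000 reference words.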
