Does it run well on CPU? I've used it locally but only with my high end (consumer/gaming) GPU, and haven't got round to finding out how it does on weaker machines.
That's pretty much exactly how I started. Ran whisper.cpp locally for a while on a 3070Ti. It worked quite well when n=1.
For our use case, we may get 1 audio file at a time, we may get 10. Of course queuing them is possible but we decided to prioritize speed & reliability over self hosting.