We use this exact stack at work (OpenAI, ElevenLabs, Deepgram) for some exploratory use cases. The key issue we have now is latency with the LLM. Deepgram and ElevenLabs work brilliantly!
Problem with this is Deepgram's accuracy (but agree their speed/latency is excellent).
We used to use them too, but eventually we got so frustrated with the poor accuracy that we switched to Speechmatics - would definitely recommend checking them out.
We do live in-studio briefings 3x/wk. These are both in-person and live-broadcast. The first thing we did was add an AI Co-Briefer who sits on the panel. The LLM latency makes it a bit hard, but it was a good experiment. Deepgram worked brilliantly for transcription across the entire studio, even for un-microphoned guest participants.
That live broadcast created a lot of buzz, and numerous other use cases have popped up across the company. I'm working on a tech blog showcase for next week, and hopefully I can show it off on HN!