by meerab
237 days ago
I use a two-pass approach - first pass with ASR (OpenAI Whisper) and second pass with an LLM.
I ask users to provide context upfront and pass it as the "initial_prompt" parameter in Whisper: https://github.com/openai/whisper/discussions/963#discussion... Gemini might have a similar capability for custom vocabulary, though I'm not certain how they implement it. The second (LLM) pass doesn't depend on Whisper, so it could be applied to Gemini's output as well.
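A rough sketch of how the two passes could fit together, not my exact code. `whisper.load_model` and `transcribe(..., initial_prompt=...)` come from the openai-whisper package; everything else (`build_initial_prompt`, `build_correction_prompt`, `two_pass`, the 800-character cap, the `"base"` model size, and the `llm` callable) is a hypothetical name or value made up for illustration. The LLM is passed in as a plain function so any provider can be plugged in.

```python
from typing import Callable, Sequence


def build_initial_prompt(terms: Sequence[str], max_chars: int = 800) -> str:
    """Join user-supplied context terms into one short prompt string.

    The character cap is an assumed, rough limit: Whisper only uses a
    bounded amount of prompt text, so very long context gets cut off.
    """
    prompt = ""
    for term in terms:
        term = term.strip()
        if not term:
            continue  # skip blank entries
        candidate = term if not prompt else f"{prompt}, {term}"
        if len(candidate) > max_chars:
            break  # stop before exceeding the cap
        prompt = candidate
    return prompt


def whisper_transcribe(audio_path: str, initial_prompt: str) -> str:
    """First pass: ASR with Whisper, biased by the user's context."""
    import whisper  # openai-whisper; imported lazily so the helpers run without it

    model = whisper.load_model("base")  # model size is an arbitrary choice here
    result = model.transcribe(audio_path, initial_prompt=initial_prompt)
    return result["text"]


def build_correction_prompt(transcript: str, terms: Sequence[str]) -> str:
    """Second pass input: ask an LLM to fix misheard domain terms only."""
    vocab = ", ".join(t.strip() for t in terms if t.strip())
    return (
        "Correct transcription errors in the text below. "
        "Change only words that were clearly misheard.\n"
        f"Known vocabulary: {vocab}\n"
        f"Transcript:\n{transcript}"
    )


def two_pass(
    audio_path: str,
    terms: Sequence[str],
    transcribe: Callable[[str, str], str],
    llm: Callable[[str], str],
) -> str:
    """Run ASR with the context prompt, then hand the result to the LLM."""
    raw = transcribe(audio_path, build_initial_prompt(terms))
    return llm(build_correction_prompt(raw, terms))
```

With Whisper installed you would call something like `two_pass("call.wav", ["Kubernetes", "etcd"], whisper_transcribe, my_llm)`, where `my_llm` is whatever function sends a prompt to your LLM and returns its text. Swapping `whisper_transcribe` for a function that calls another ASR service keeps the second pass unchanged.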