If you're using Gemini in AI Studio, it has native audio input (not sure about the real-time API, but everything else does).
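If you want to try the same thing outside the AI Studio UI, here's a rough sketch of what an audio request body might look like. It assumes the `inline_data` part shape (base64 audio plus a MIME type) that, as far as I know, the public `generateContent` REST format uses. The helper names `make_silent_wav` and `build_audio_request` are made up for illustration, and nothing is actually sent over the network.

```python
import base64
import io
import json
import wave


def make_silent_wav(seconds: float = 0.1, rate: int = 16000) -> bytes:
    """Build a tiny mono 16-bit WAV of silence, just to have some audio bytes."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)      # mono
        w.setsampwidth(2)      # 16-bit samples
        w.setframerate(rate)
        w.writeframes(b"\x00\x00" * int(seconds * rate))
    return buf.getvalue()


def build_audio_request(audio: bytes, prompt: str, mime_type: str = "audio/wav") -> dict:
    """Hypothetical helper: pair a text prompt with inline base64 audio.

    The dict layout is an assumption based on the documented REST shape;
    check the current API docs before relying on it.
    """
    return {
        "contents": [
            {
                "parts": [
                    {"text": prompt},
                    {
                        "inline_data": {
                            "mime_type": mime_type,
                            "data": base64.b64encode(audio).decode("ascii"),
                        }
                    },
                ]
            }
        ]
    }


payload = build_audio_request(make_silent_wav(), "Transcribe this clip.")
print(json.dumps(payload)[:120])  # preview the start of the JSON body
```

The point is that the audio goes in as its own part next to the text prompt, so there's no separate speech-to-text step on your side.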