This is pretty cool. Would it be possible to just stream the audio directly into Whisper, maybe using something like VLC, at 2x playback speed to get the summary faster?
Probably. The OpenAI API has gotten a lot better since I made that post. That said, if you stream audio at 2x speed, expect some drop in transcription quality, since most of the clips Whisper was trained on are not at 2x speed.
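
If you want to try the 2x idea offline first, here is a rough sketch that speeds up a file before handing it to Whisper. It assumes ffmpeg is installed and on your PATH (swapped in here instead of VLC), and it uses ffmpeg's `atempo` audio filter, which changes tempo without shifting pitch. The function names and file names are made up for illustration, not taken from the original post.

```python
import subprocess


def build_speedup_command(src, dst, speed=2.0):
    """Build an ffmpeg command that re-times the audio in `src` by `speed`.

    Hypothetical helper for illustration. Uses ffmpeg's `atempo` filter,
    which changes playback speed while keeping the pitch the same.
    """
    if not 0.5 <= speed <= 2.0:
        # Older ffmpeg builds limit a single atempo filter to 0.5-2.0,
        # so this sketch stays inside that range.
        raise ValueError("speed must be between 0.5 and 2.0")
    return [
        "ffmpeg",
        "-y",            # overwrite dst if it already exists
        "-i", src,       # input file
        "-vn",           # drop any video stream, keep audio only
        "-filter:a", f"atempo={speed}",
        dst,
    ]


def speed_up(src, dst, speed=2.0):
    """Run ffmpeg to write a sped-up copy of `src` to `dst` (needs ffmpeg installed)."""
    subprocess.run(build_speedup_command(src, dst, speed), check=True)
    return dst
```

Calling `speed_up("talk.mp3", "talk_2x.mp3")` would write the faster copy, which you can then send to whatever Whisper setup you use. It is worth comparing the transcript against a 1x run on a short clip to see how much quality you actually lose.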