Hi Sam, I'm a big fan of your work! Coincidentally, I just made a simple POC video editor that works by editing text, using this speech-to-text model: https://huggingface.co/facebook/wav2vec2-large-960h-lv60-sel.... It might be cool to integrate into your Videogrep tool. It also works offline on CPU or GPU, and gives you word- or character-level timestamps.
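
For anyone curious how editing video by editing text can work once you have word-level timestamps, here is a rough sketch. It is not the POC itself: the function name `keep_segments`, the `(word, start, end)` tuple layout, and the sample timestamps are all made up for illustration, and the speech-to-text step is assumed to have already run. It diffs the original transcript against the edited one and returns the time ranges to keep.

```python
from difflib import SequenceMatcher


def keep_segments(words, edited_text):
    """Return (start, end) time ranges, in seconds, for the words that survive an edit.

    words: list of (word, start_sec, end_sec) tuples from a speech-to-text pass
           (hypothetical layout, assumed for this sketch).
    edited_text: the transcript after the user deleted the words they don't want.
    """
    original = [w.lower() for w, _, _ in words]
    edited = edited_text.lower().split()

    # Align the two word sequences; matching blocks are runs of words that were kept.
    matcher = SequenceMatcher(a=original, b=edited, autojunk=False)

    segments = []
    for block in matcher.get_matching_blocks():
        if block.size == 0:  # the final block is always an empty sentinel
            continue
        start = words[block.a][1]                  # start of first kept word in the run
        end = words[block.a + block.size - 1][2]   # end of last kept word in the run
        segments.append((start, end))
    return segments


# Made-up timestamps, for illustration only.
sample = [("hello", 0.0, 0.4), ("um", 0.5, 0.7), ("world", 0.8, 1.2), ("again", 1.3, 1.7)]
print(keep_segments(sample, "hello world again"))
```

The resulting ranges could then be handed to whatever does the actual cutting (ffmpeg, moviepy, or similar). A real tool would probably also pad each range a little so words aren't clipped.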