I wanted to build my own speech-to-text transcription program [1] for Discord, similar to how Zoom or Google Hangouts work. I built it so that I can record my group's D&D sessions and build applications / tools for VTTs (virtual tabletops).
It can process a set of 3-hour audio files in ~20 mins.
Thanks for building this. I am trying to set it up but am facing this issue:
> `torch` (v2.3.1) only has wheels for the following platforms: `manylinux1_x86_64`, `manylinux2014_aarch64`, `macosx_11_0_arm64`, `win_amd64`