Maybe I will put in some mechanism to prevent that, but for now I just want to see if people find it useful. The code is also open source, and I will write tutorials so people can put up their own instance as well.
It would be more useful if one could directly paste links to videos online as well. But yeah, in general this is extremely useful. I'm looking forward to video site integration. It would be great if YouTube could finally retire their horrible auto-caption function for something that actually works. Being able to easily watch media in different languages from around the world will be an absolute game changer.
I also plan to support automatic language translation; I actually have that working locally already. I also work for one of the big alt-video platforms, and rumour has it that I will be shipping this feature for them soon (auto transcription with auto-translated subtitles).
Yeah, it's all ready to go using LibreTranslate, which has about 25 languages. Maybe I'll finish that this weekend and put it up. Making the translations is really inexpensive compared to making the original transcription, so I may as well. Coming soon!
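For anyone curious, here is a rough sketch of what translating one subtitle line through LibreTranslate could look like in Python. It is not the code from this project. The `/translate` endpoint, the `q`/`source`/`target`/`format` fields, and the `translatedText` response field come from LibreTranslate's public API; the localhost URL and the helper names `build_translate_request` and `translate` are assumptions made up for illustration.

```python
import json
import urllib.request

# Assumed address of a self-hosted LibreTranslate instance (hypothetical).
LIBRETRANSLATE_URL = "http://localhost:5000/translate"


def build_translate_request(text, source="en", target="es", url=LIBRETRANSLATE_URL):
    """Build (but do not send) a POST request for the /translate endpoint."""
    payload = {"q": text, "source": source, "target": target, "format": "text"}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def translate(text, source="en", target="es", url=LIBRETRANSLATE_URL):
    """Send the request and return the translated string.

    Needs a reachable LibreTranslate instance at `url`.
    """
    req = build_translate_request(text, source, target, url)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))["translatedText"]
```

Running each subtitle cue through something like `translate(cue_text, "en", "de")` keeps the timestamps untouched, so the expensive transcription only has to happen once per file.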
Also, I have tested auto download with youtube-dl and it works fine. I haven't put it into the frontend yet, but I may as well: it helps a lot on your own instance, since you don't have to download the file first and then upload it.
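A minimal sketch of how a pasted link could be turned into an audio file with youtube-dl, assuming the `youtube-dl` binary and ffmpeg are installed on the server. The helper names and the `downloads` directory are hypothetical; the flags are standard youtube-dl options, but this is not necessarily how the project itself does it.

```python
import subprocess


def build_download_command(url, out_dir="downloads"):
    """Return a youtube-dl command that extracts the audio track as WAV."""
    return [
        "youtube-dl",
        "--extract-audio",                   # keep only the audio track
        "--audio-format", "wav",             # convert to WAV (uses ffmpeg)
        "--no-playlist",                     # one video even if the URL is in a playlist
        "-o", f"{out_dir}/%(id)s.%(ext)s",   # output filename template
        "--",                                # stop option parsing before user input
        url,
    ]


def download_audio(url, out_dir="downloads"):
    """Run the download; raises CalledProcessError if youtube-dl fails."""
    subprocess.run(build_download_command(url, out_dir), check=True)
```

Passing the command as a list (no shell) and putting `--` before the URL keeps a user-supplied link from being interpreted as an extra option, which matters on a public instance.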
I set up the server to only transcribe two files at a time, so yeah, someone could abuse it for sure with two big uploads and stick everyone else in the queue. But for me, even a 3 hour video translates with the large model in about 30 minutes, so it wouldn't be too bad. Hopefully everyone is conscious enough not to do that; so far nobody has abused it, which is cool.
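One common way to get a "two at a time" limit is a semaphore around the transcription call. The sketch below assumes an asyncio-based server and uses a fake job in place of the real model call; `run_limited`, `fake_transcribe`, and `demo` are hypothetical names, and the actual server may enforce the limit differently.

```python
import asyncio

MAX_CONCURRENT = 2  # assumed to mirror the two-files-at-a-time limit


async def run_limited(jobs, limit=MAX_CONCURRENT):
    """Run coroutine factories with at most `limit` executing at once."""
    sem = asyncio.Semaphore(limit)

    async def guarded(job):
        # Extra jobs wait here until one of the running jobs finishes.
        async with sem:
            return await job()

    return await asyncio.gather(*(guarded(j) for j in jobs))


async def fake_transcribe(name, stats):
    """Stand-in for a real transcription call; records peak concurrency."""
    stats["running"] += 1
    stats["peak"] = max(stats["peak"], stats["running"])
    await asyncio.sleep(0.01)  # pretend to do GPU work
    stats["running"] -= 1
    return f"{name}.srt"


def demo():
    """Queue five fake uploads and report results plus peak concurrency."""
    stats = {"running": 0, "peak": 0}
    jobs = [lambda n=n: fake_transcribe(f"file{n}", stats) for n in range(5)]
    results = asyncio.run(run_limited(jobs))
    return results, stats["peak"]
```

Raising `limit` is the only change needed to experiment with more simultaneous files, which is the trade-off discussed in the reply below.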
Me again - why two at a time? In my initial testing with whisper-asr-webservice and my RTX 3090 I could pretty easily throw ~10 different files at it simultaneously as there is some natural staggering between API entry, CPU conversion/resampling/transcoding of audio, the actual audio length, network effects like upload speed, etc.
I also implemented some anti-abuse-ish features between Traefik and Cloudflare that should help it stand up better against bad actors.
Certainly not something to necessarily depend on but I thought I'd mention it.
> I am just paying for a somewhat expensive server and I love how it's really fast but also I have a lot of free GPU time so might as well let others use it too lol.