Hacker News
by eigenvalue 709 days ago
They are almost certainly extracting the audio and then using Whisper or other superior speech recognition models. I made a free tool which can do this very efficiently for whole playlists of YouTube videos, so I'm sure they can do the same:
https://github.com/Dicklesworthstone/bulk_transcribe_youtube...
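For anyone curious what "extract the audio, then run Whisper" might look like in practice, here is a minimal sketch for a single video. It is not the code from the linked tool. It assumes the `yt-dlp` and `openai-whisper` Python packages plus a working `ffmpeg` install, and the function names (`audio_download_options`, `format_timestamp`, `segments_to_text`, `transcribe_video`), the `mp3` codec choice, and the `base` model size are all illustrative choices of mine.

```python
from pathlib import Path


def audio_download_options(out_dir: str) -> dict:
    """Build yt-dlp options that fetch one video and keep only its audio (assumed settings)."""
    return {
        "format": "bestaudio/best",
        "outtmpl": str(Path(out_dir) / "%(id)s.%(ext)s"),
        "noplaylist": True,  # this sketch handles a single video, not a playlist
        "quiet": True,
        # Ask yt-dlp to run ffmpeg afterwards and convert the download to mp3.
        "postprocessors": [
            {"key": "FFmpegExtractAudio", "preferredcodec": "mp3"}
        ],
    }


def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as HH:MM:SS (fractions are truncated)."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def segments_to_text(segments: list) -> str:
    """Turn Whisper-style segments (dicts with 'start' and 'text') into timestamped lines."""
    return "\n".join(
        f"[{format_timestamp(seg['start'])}] {seg['text'].strip()}"
        for seg in segments
    )


def transcribe_video(url: str, out_dir: str = "audio", model_name: str = "base") -> str:
    """Download the audio for one video and return a timestamped transcript."""
    # Imported lazily so the pure helpers above work without these packages.
    import whisper
    import yt_dlp

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    with yt_dlp.YoutubeDL(audio_download_options(out_dir)) as ydl:
        info = ydl.extract_info(url, download=True)

    # The postprocessor above leaves the audio at <out_dir>/<video id>.mp3.
    audio_path = Path(out_dir) / f"{info['id']}.mp3"
    model = whisper.load_model(model_name)
    result = model.transcribe(str(audio_path))
    return segments_to_text(result["segments"])
```

Calling `transcribe_video(...)` with a video URL would return one line per segment, each prefixed with its start time. Doing this efficiently for whole playlists would need batching and probably a faster Whisper backend, which this sketch does not attempt.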