Hacker News
by eigenvalue 709 days ago
They are almost certainly extracting the audio and then running it through Whisper or another high-quality speech recognition model. I made a free tool that does this very efficiently for whole playlists of YouTube videos, so I'm sure they can do the same:

https://github.com/Dicklesworthstone/bulk_transcribe_youtube...
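For anyone curious what "extract the audio, then run Whisper" looks like in practice, here is a minimal sketch. It is not the code from the linked repo. It assumes the `yt-dlp` and `ffmpeg` command-line tools are installed and that the `openai-whisper` Python package is available. The function names, the `audio` output directory, and the `base` model size are all illustrative choices.

```python
import subprocess
from pathlib import Path


def build_download_command(url: str, out_dir: str) -> list[str]:
    """Build a yt-dlp command that keeps only the audio track, as WAV.

    Assumption: yt-dlp and ffmpeg are on PATH. The URL may point at a
    single video or at a whole playlist.
    """
    return [
        "yt-dlp",
        "--extract-audio",            # discard the video stream
        "--audio-format", "wav",      # have yt-dlp convert via ffmpeg
        "--output", str(Path(out_dir) / "%(id)s.%(ext)s"),
        url,
    ]


def transcribe_directory(out_dir: str, model_name: str = "base") -> dict[str, str]:
    """Transcribe every WAV file in out_dir, returning {filename: text}.

    Assumption: the openai-whisper package is installed. It is imported
    lazily so the rest of this sketch works without it.
    """
    import whisper

    model = whisper.load_model(model_name)
    transcripts = {}
    for wav_path in sorted(Path(out_dir).glob("*.wav")):
        result = model.transcribe(str(wav_path))
        transcripts[wav_path.name] = result["text"]
    return transcripts


def transcribe_url(url: str, out_dir: str = "audio") -> dict[str, str]:
    """Download the audio for a video or playlist, then transcribe it."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_download_command(url, out_dir), check=True)
    return transcribe_directory(out_dir)
```

Calling `transcribe_url(...)` with a playlist URL would download each video's audio and return one transcript per file. A real bulk tool would presumably add batching, error handling, and a faster Whisper backend, none of which is shown here.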