by ChrisSalij 4787 days ago
(Boxfish dev here) We do a lot of things with the raw subtitles to clean them up. We fix common misspellings using both a basic dictionary and statistical models. We also normalize the data between various sources.
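
A minimal illustrative sketch of this kind of cleanup, not Boxfish's actual implementation. It assumes a hypothetical lookup table of common misspellings (`COMMON_FIXES`) for the dictionary step, and a toy word-frequency model (`WORD_COUNTS`, built from a made-up corpus) for the statistical step, which picks the most frequent known word within one edit of an unknown word. The normalization rules (lowercasing, dropping `>>` speaker markers and punctuation) are likewise assumptions about what differs between sources.

```python
import re
from collections import Counter

# Hypothetical lookup table of common misspellings (the "basic dictionary" step).
COMMON_FIXES = {"teh": "the", "recieve": "receive"}

# Hypothetical toy corpus standing in for the data a real statistical model would use.
CORPUS = "the president said the senate will receive the bill today and the house will vote"
WORD_COUNTS = Counter(CORPUS.split())

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def edits1(word):
    """Return every string one delete, swap, replace, or insert away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [l + r[1:] for l, r in splits if r]
    swaps = [l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1]
    replaces = [l + c + r[1:] for l, r in splits if r for c in ALPHABET]
    inserts = [l + c + r for l, r in splits for c in ALPHABET]
    return set(deletes + swaps + replaces + inserts)


def correct_word(word):
    """Fix one word: dictionary lookup first, then the frequency model."""
    if word in COMMON_FIXES:
        return COMMON_FIXES[word]
    if word in WORD_COUNTS:
        return word  # already a known word
    candidates = [w for w in edits1(word) if w in WORD_COUNTS]
    if not candidates:
        return word  # nothing plausible found, leave the word unchanged
    return max(candidates, key=WORD_COUNTS.__getitem__)


def normalize(line):
    """Bring lines from different sources into one shape (assumed rules)."""
    line = line.lower()
    line = line.replace(">>", " ")             # drop caption speaker markers
    line = re.sub(r"[^a-z0-9' ]+", " ", line)  # drop punctuation and other symbols
    return " ".join(line.split())              # collapse repeated whitespace


def clean_subtitle(line):
    """Normalize a raw subtitle line, then correct it word by word."""
    return " ".join(correct_word(w) for w in normalize(line).split())


if __name__ == "__main__":
    print(clean_subtitle(">> TEH  PRESIDNT said the Senate will recieve the bill."))
```

Checking the lookup table first keeps frequent, well-understood errors cheap and predictable, and leaves the frequency model to handle only the words the table does not cover.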