Hacker News
by vitovito 2743 days ago
Do you plan to offer something around one-shot machine transcription with offline/on-prem search?

I have ~200k hours of legacy audio I'd love to be able to run a fuzzy (phonetic?) search on, so we can pull content from it and get real (human-edited) transcriptions of the important stuff to resurface it. But there's not a lot of incentive to push it through a service for a quarter million dollars and then also pay to store and search it, since we're currently doing without it. Doing it at extremely low priority, delivered over a long span of time, for an order of magnitude cheaper, with our IT standing up some stock fuzzy search engine, is a pretty easy sell, though.
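
For scale, here is the back-of-envelope arithmetic behind those figures. It uses only the round numbers from this comment (roughly 200k hours, a quarter million dollars, an order of magnitude cheaper); it is not an actual quote from any service.

```python
# Back-of-envelope from the round numbers in the comment (not a real price quote).
hours = 200_000          # ~200k hours of legacy audio
quoted_total = 250_000   # "a quarter million dollars"

per_hour = quoted_total / hours        # implied price per audio hour
target_total = quoted_total / 10       # "an order of magnitude cheaper"
target_per_hour = target_total / hours

print(f"implied: ${per_hour:.2f} per hour")
print(f"target:  ${target_total:,.0f} total, ${target_per_hour:.3f} per hour")
```

So the ask works out to roughly $1.25 per audio hour today versus something near $0.125 per hour for the slow, low-priority version.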

1 comment

We do custom models (training the full DNN, not just tacking on a new text language model) using transfer learning, and it works for small numbers of examples too.

Glad to hear you asking about fuzzy search. That's something we do (it's actually what Deepgram started on!). It's not in the docs at the moment: it tends to confuse people who are looking for transcription, and we're working on how to present it in a better way. You can submit audio along with your queries and get back confidences and timestamps.
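
To make "confidences and timestamps" concrete, here is a minimal sketch of how a client might filter search hits once it has them. The `Hit` shape, the field names, the 0 to 1 confidence scale, and the sample values are all assumptions made up for illustration; the search API is undocumented, so none of this reflects the real response format.

```python
from dataclasses import dataclass

@dataclass
class Hit:
    # Hypothetical result shape, not the service's actual response format.
    query: str         # the word or phrase that was searched for
    confidence: float  # assumed to lie in [0, 1]
    start: float       # seconds into the audio
    end: float

def confident_hits(hits, min_confidence=0.8):
    """Keep hits at or above the threshold, highest confidence first."""
    kept = [h for h in hits if h.confidence >= min_confidence]
    return sorted(kept, key=lambda h: h.confidence, reverse=True)

# Made-up example values.
hits = [
    Hit("board meeting", 0.93, 812.4, 813.6),
    Hit("board meeting", 0.41, 95.0, 96.1),
    Hit("budget", 0.85, 1290.2, 1290.8),
]

for h in confident_hits(hits):
    print(f"{h.query!r} at {h.start:.1f}s (confidence {h.confidence:.2f})")
```

The threshold is the knob to tune: a lower value surfaces more candidates for a human to review, a higher one keeps only the hits worth sending for transcription.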

Many times the model doesn't need any training, but training does increase accuracy, and it can get really good if it's focused. (It's a lot like wake word detection -- we don't offer WWD as a real product yet either, I'm just saying the challenges are similar.) The best thing to do is search for phrases if you can; that really helps signal-to-noise.
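
One rough way to see why phrases help, as a toy model only: assume each word in a query has some independent chance of matching spuriously. Both the independence assumption and the 5% rate are illustrative simplifications, not measured figures, and real speech is not this tidy.

```python
def phrase_false_alarm_rate(per_word_rate, n_words):
    """Chance that every word in the phrase matches spuriously,
    under the (unrealistic) assumption that words are independent."""
    return per_word_rate ** n_words

# Illustrative per-word false-alarm rate of 5%.
for n in (1, 2, 3):
    print(n, phrase_false_alarm_rate(0.05, n))
```

Under that assumption every extra word multiplies the false-alarm rate down, which is the intuition behind searching for "board meeting" rather than just "board".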