We haven't found a good provider yet to do this properly for our use case, but SpeakerText, Koemei and VoiceBase are examples of companies that offer these functionalities.

Unfortunately, SpeakerText doesn't offer pricing for non-post-processed transcripts, Koemei integrated it into their own product, and VoiceBase didn't offer post-processing on request, which we would need for integration into our product.

Which format becomes mainstream probably depends on HTML5 adoption, which is detailed here: http://www.3playmedia.com/how-it-works/how-to-guides/html5-v... Currently WebVTT seems to be in the lead.

Those formats don't accommodate per-word timestamps though, which machine transcription could provide and which I would pay a premium for.
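
A rough sketch of one possible workaround, assuming the transcription engine hands back (word, start, end) tuples in seconds: write one WebVTT cue per word. The function names and the input shape are made up for illustration and are not any provider's API.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def words_to_webvtt(words):
    """Build a WebVTT document with one cue per spoken word.

    `words` is a hypothetical list of (text, start_seconds, end_seconds)
    tuples, i.e. what a machine transcription service might return.
    """
    lines = ["WEBVTT", ""]  # required header, then a blank line
    for text, start, end in words:
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text)
        lines.append("")  # a blank line ends each cue
    return "\n".join(lines)


if __name__ == "__main__":
    print(words_to_webvtt([("hello", 0.0, 0.4), ("world", 0.5, 0.9)]))
```

This gives a valid file, but a player will show one word at a time, so it is more useful as a data carrier for search or highlighting than as a caption track.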