


	by manrajsingh 2241 days ago
	At our lab, we work extensively on problems that involve speech data. This includes tasks like speech recognition, speech scoring, emotion recognition, topic detection and speaker diarisation. Some of these tasks have public data available, while for tasks like speech scoring and low-resource speech recognition, the data available for supervised learning is fairly limited. Hence, we developed this annotation tool to generate the corpora we need.

2 comments

mkagenius 2241 days ago

In case it's still not clear: it does not do the transcription, it does not. Oh hi Mark. It asks you to annotate the audio manually (in case you want to prepare a training data set for your algorithm); it's not an AI algorithm.


jtbayly 2241 days ago

This is the most helpful comment here. I still don’t understand what the tool is for though. Up until now I assumed it would allow me to get automatic transcriptions, including breaking them down by speaker.


fluential 2239 days ago

I was looking into that space recently and have used otter.ai for transcription; it gives you 6000 minutes/month for 8 USD, which is insanely cheap in that space. Their British language model is quite good as well.

I’ve bulk-exported the generated srt/vtt files for my fav podcasts, and I'm using tinysearch (posted here recently) together with ableplayer to provide full-text audio search over my Jekyll-published podcast posts, with clickable timestamps that start playback at the matching phrase.

Whenever I want to know what a podcaster has to say on a specific subject, a quick search makes such a difference!
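
For anyone curious how the timestamped search part can work, here is a rough sketch of the idea in plain Python. It is not how tinysearch or ableplayer do it internally; the function names `parse_cues` and `search` are made up for illustration, and it assumes well-formed srt/vtt input. It only pulls (start time, text) pairs out of a subtitle file and returns the cues that contain a phrase, so a player could be seeked to the returned start time.

```python
import re

# Matches the start of a timing line such as "00:01:02.500 --> 00:01:05.000".
# SRT uses a comma before the milliseconds, WebVTT a dot; the hours part is
# optional in WebVTT.
TIMING = re.compile(r"(?:(\d+):)?(\d{2}):(\d{2})[.,](\d{3})\s*-->\s*")


def parse_cues(text):
    """Return a list of (start_seconds, cue_text) from SRT or WebVTT content."""
    cues = []
    # Cues are separated by blank lines in both formats.
    for block in re.split(r"\n\s*\n", text.replace("\r\n", "\n").strip()):
        lines = block.split("\n")
        for i, line in enumerate(lines):
            m = TIMING.match(line.strip())
            if m:
                hours, minutes, seconds, millis = m.groups()
                start = (int(hours or 0) * 3600 + int(minutes) * 60
                         + int(seconds) + int(millis) / 1000)
                # Everything after the timing line is the cue text.
                body = " ".join(l.strip() for l in lines[i + 1:] if l.strip())
                if body:
                    cues.append((start, body))
                break  # at most one timing line per block
    return cues


def search(cues, phrase):
    """Case-insensitive substring search; returns matching (start, text) pairs."""
    phrase = phrase.lower()
    return [(start, text) for start, text in cues if phrase in text.lower()]


if __name__ == "__main__":
    sample = (
        "WEBVTT\n\n"
        "00:00:01.000 --> 00:00:04.000\n"
        "Hello and welcome to the show\n\n"
        "00:01:02.500 --> 00:01:05.000\n"
        "Today we talk about speech recognition\n"
    )
    for start, text in search(parse_cues(sample), "speech"):
        print(f"{start:.1f}s: {text}")
```

The start time in seconds is the piece that matters: it is what a clickable timestamp would hand to the audio player to jump to the phrase.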


jtbayly 2239 days ago

Awesome. Thanks for the info. I look forward to trying out your suggestion.


rock_artist 2241 days ago

So this tool is mostly a way to store your dataset?

E.g., things like forced alignment should be done in other tools, and then you use the API to put the results into the dataset?
