| HN Mirror



	by cocoa19 141 days ago
	Have you tried the Azure Speech Studio? I wonder how your custom model compares to this solution. I played around with Python scripts for the same purpose. The AI gives feedback that can be converted into a percentage of correctness. One annoyance is that for Mandarin, the percentage is calculated at the character level, whereas with English, it gives you a more granular score at the phoneme level.

1 comment

dirteater_ 141 days ago

IMO the SotA for this is https://www.speechsuper.com/. Amazon suffers for similar

> One annoyance is that for Mandarin, the percentage is calculated at the character level, whereas with English, it gives you a more granular score at the phoneme level.

This is the case for most solutions you'd find for this task. Probably because of the 1 character -> 1 syllable property. It's pretty straightforward to split the detected pinyin into initial+final and build a score from that though.
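
A rough sketch of what that split-and-score could look like, in Python. Everything here is an assumption for illustration: the function names are made up, the input is toneless pinyin that has already been aligned syllable by syllable, "y" and "w" are treated as initials for simplicity, and the 50/50 weighting between initial and final is arbitrary.

```python
# Pinyin initials, with "zh", "ch", "sh" listed first so they match before "z", "c", "s".
# Treating "y" and "w" as initials is a simplification (they are spelling conventions).
INITIALS = ("zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w")


def split_pinyin(syllable):
    """Split a toneless pinyin syllable into (initial, final)."""
    s = syllable.lower()
    for initial in INITIALS:
        if s.startswith(initial):
            return initial, s[len(initial):]
    return "", s  # zero-initial syllable such as "an" or "er"


def syllable_score(expected, detected):
    """Half credit for a matching initial, half for a matching final (arbitrary weights)."""
    e_ini, e_fin = split_pinyin(expected)
    d_ini, d_fin = split_pinyin(detected)
    return 0.5 * (e_ini == d_ini) + 0.5 * (e_fin == d_fin)


def utterance_score(expected, detected):
    """Average per-syllable score as a percentage.

    Assumes both lists are already aligned; syllables missing from
    `detected` count as zero.
    """
    if not expected:
        return 0.0
    total = sum(syllable_score(e, d) for e, d in zip(expected, detected))
    return 100.0 * total / len(expected)
```

With this, `utterance_score(["ni", "hao"], ["li", "hao"])` gives partial credit for "li" (right final, wrong initial) instead of marking the whole character wrong. Tones are ignored here; they would need a third component.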
