| HN Mirror


	by jturpin 1819 days ago
	Wow you're right. This is conflicting as many of the words are not pronounced properly at all. Maybe it doesn't matter to the accuracy of the speech-to-text system, but it feels like training it with bad data.

2 comments

humanistbot 1819 days ago

That's the point! When the postal service has to OCR mailing addresses, it needs to handle the messy scribbles more than the professionally printed labels.

jturpin 1819 days ago

That's fair, I'll have to think about that.

ohgodplsno 1819 days ago

Different accents aren't bad data. Your vision of the world of "English is only spoken with an American accent" is what leads to horrendous speech recognition APIs, like Google's.

If your ML model can't handle multiple accents, it is worthless.

jturpin 1819 days ago

There's a difference between an accent and pronouncing words wrong. I would expect an English speech recognition system to handle the various accents there are in the world (the US has several accents of course), but it shouldn't handle incorrect pronunciation of syllables if it comes at the expense of recognizing clean data. If it doesn't come at its expense then I guess it's fine.

jpetso 1819 days ago

Unfortunately, there's always a trade-off. You want quality data for your use case, but you also want lots of data so the model generalizes well. Those are conflicting goals.

Fortunately, splitting the model into separate accent-specialized variants, and backing them up with language model training, will often help when a single model doesn't cope well enough with the cognitive dissonance.
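
A rough sketch of what the accent-specialized split could look like, just to make the idea concrete. Everything here is an assumption for illustration: `make_router`, the stand-in recognizers, the accent labels, and the 0.7 confidence threshold are all made up, and the accent label is assumed to come from some separate classifier that isn't shown.

```python
from typing import Callable, Dict

# Hypothetical type: a recognizer maps raw audio bytes to a transcript.
Recognizer = Callable[[bytes], str]


def make_router(
    specialists: Dict[str, Recognizer],
    fallback: Recognizer,
    min_confidence: float = 0.7,  # assumed threshold, not a tuned value
) -> Callable[[bytes, str, float], str]:
    """Build a function that picks an accent-specialized model when possible."""

    def transcribe(audio: bytes, accent: str, confidence: float) -> str:
        # Only trust the accent label when the (assumed) classifier is confident.
        model = specialists.get(accent) if confidence >= min_confidence else None
        # Unknown accent or low confidence: use the general-purpose model.
        return (model if model is not None else fallback)(audio)

    return transcribe


# Stand-in recognizers so the sketch runs; real ones would be trained models.
def scottish_model(audio: bytes) -> str:
    return "scottish"


def general_model(audio: bytes) -> str:
    return "general"


route = make_router({"scottish": scottish_model}, general_model)
```

Falling back to the general model whenever the accent guess is shaky is one way to keep the specialized variants from hurting recognition on clean data.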

topspin 1819 days ago

"English is only spoken with an American accent"

Which American accent?
