by sirlantis 3255 days ago
According to their FAQ, they actually want those poor conditions to be present in the corpus.

> We want the audio quality to reflect the audio quality a speech-to-text engine will see in the wild. Thus, we want variety. This teaches the speech-to-text engine to handle various situations—background talking, car noise, fan noise—without errors.

https://voice.mozilla.org/faq