by szupie 3535 days ago
Section 9 of the paper[1] is all about comparing these mistakes between the system and humans. The most common mistakes for humans and the system are in Tables 9–11.

We find that the artificial errors are substantially the same as human ones, with one large exception: confusions between backchannel words [acknowledgment words like “uh-huh”] and hesitations.

That is the difference they found, though they suspect it might be a result of the different transcription guidelines of the training corpus: we see that by far the most common error in the ASR system is the confusion of a hesitation in the reference for a backchannel in the hypothesis. People do not seem to have this problem.
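
For anyone wondering how a "hesitation in the reference for a backchannel in the hypothesis" tally comes about, here is a rough sketch of my own (not the paper's scoring code): align each reference transcript with the system hypothesis by minimum edit distance, then count the substituted word pairs. The function name and the toy transcripts are made up for illustration, and treating "uh" as the hesitation is my assumption. Real scoring tools handle text normalization and alignment ties more carefully.

```python
from collections import Counter

def substitution_pairs(ref, hyp):
    """Return (ref_word, hyp_word) substitutions from one minimum-edit-distance alignment."""
    n, m = len(ref), len(hyp)
    # dist[i][j] = edit distance between ref[:i] and hyp[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j - 1] + cost,  # match or substitution
                             dist[i - 1][j] + 1,         # deletion
                             dist[i][j - 1] + 1)         # insertion
    # Walk back from the corner, collecting substitutions along the way.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        cost = 0 if ref[i - 1] == hyp[j - 1] else 1
        if dist[i][j] == dist[i - 1][j - 1] + cost:
            if cost:
                pairs.append((ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif dist[i][j] == dist[i - 1][j] + 1:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return pairs

# Toy (reference, hypothesis) pairs, invented for this example.
utterances = [
    ("uh i think so", "uh-huh i think so"),
    ("yeah uh well", "yeah uh-huh well"),
    ("it was fine", "it is fine"),
]

confusions = Counter()
for ref_text, hyp_text in utterances:
    confusions.update(substitution_pairs(ref_text.split(), hyp_text.split()))

for (ref_word, hyp_word), count in confusions.most_common():
    print(f"{ref_word} -> {hyp_word}: {count}")
```

On this toy data the hesitation-to-backchannel pair ("uh" transcribed as "uh-huh") comes out on top, which is the kind of pattern the paper's confusion tables report for the ASR system.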

[1]: https://arxiv.org/abs/1610.05256

1 comment

Very cool that they investigated this! Thanks, I hadn't read the paper (obviously)