by rbanffy 1623 days ago
A speech recognition model can give you a reading on how understandable the speech is, and you can use that reading to guide the channel volume in the mix.
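Roughly what I have in mind, as a minimal sketch rather than a real implementation. Everything here is hypothetical: `score` stands in for whatever your recognizer exposes as a confidence or intelligibility reading (assumed to be 0..1), samples are plain lists of floats, and the target, step size, and gain cap are arbitrary numbers.

```python
from typing import Callable, List, Sequence


def mix(dialogue: Sequence[float], background: Sequence[float], gain_db: float) -> List[float]:
    """Sum the two channels after applying gain_db (in decibels) to the dialogue."""
    gain = 10 ** (gain_db / 20)  # convert dB to a linear amplitude factor
    return [gain * d + b for d, b in zip(dialogue, background)]


def find_dialogue_gain(
    dialogue: Sequence[float],
    background: Sequence[float],
    score: Callable[[Sequence[float]], float],  # hypothetical: recognizer's 0..1 reading
    target: float = 0.8,        # assumed "understandable enough" threshold
    step_db: float = 1.0,       # how much to raise the dialogue per attempt
    max_gain_db: float = 12.0,  # cap so the dialogue is never boosted without limit
) -> float:
    """Raise the dialogue gain step by step until the scorer is satisfied or the cap is hit."""
    gain_db = 0.0
    while gain_db < max_gain_db and score(mix(dialogue, background, gain_db)) < target:
        gain_db += step_db
    return min(gain_db, max_gain_db)
```

In practice you would run this per scene or per window of audio, not over the whole track, and smooth the resulting gain so the level doesn't jump around.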

OTOH, a lot of these models end up trained on features that are very different from what humans hear, so what the model finds clear may not match what a listener finds clear.