by mrfox321 1740 days ago
So I was looking at SotA loss functions from a few years ago that weighted the CTC loss by the WER of the decoded phrase.

Could we generalize the WER weighting to optimize for the domain?

Something like

weight = w1 * WER + w2 * phonetic similarity + ...

which also requires a hyperparameter search... But we are already dumping so many GPU hours here.
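
To make the idea concrete, here is a rough sketch of what such a weighting could look like. It is only an illustration under assumptions: the function names (`word_error_rate`, `combined_weight`, `weighted_loss`) are made up for this example, the CTC loss is assumed to arrive as a plain per-utterance float from whatever framework is in use, and the phonetic term is passed in as a number because no particular phonetic metric is specified above. Only the two terms that are spelled out (WER and the phonetic term) appear here; further terms would just be extra entries in the lists.

```python
from typing import Sequence


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        # Convention chosen for this sketch: empty reference gives 0 or 1.
        return 1.0 if hyp else 0.0
    # Standard Levenshtein distance over words, one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def combined_weight(metrics: Sequence[float], coeffs: Sequence[float]) -> float:
    """Linear combination: w1 * metric1 + w2 * metric2 + (more terms if given)."""
    if len(metrics) != len(coeffs):
        raise ValueError("need exactly one coefficient per metric")
    return sum(w * m for w, m in zip(coeffs, metrics))


def weighted_loss(ctc_loss: float, reference: str, hypothesis: str,
                  phonetic_term: float, w1: float, w2: float) -> float:
    """Scale a per-utterance CTC loss by the combined metric weight.

    phonetic_term is a placeholder: the caller supplies whatever phonetic
    measure they settle on (assumption, not defined in the post).
    """
    wer = word_error_rate(reference, hypothesis)
    return combined_weight([wer, phonetic_term], [w1, w2]) * ctc_loss
```

Here `w1` and `w2` are exactly the coefficients the hyperparameter search would have to cover. Note that an utterance decoded perfectly gets a WER of zero, so with `w2 = 0` its loss would be scaled to zero; whether that is desirable is a design choice this sketch does not settle.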

I assume this is already being investigated by Google, though?

1 comment

There are some similar techniques where you use evaluation metrics to decide which data to train on each epoch.
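
As a rough illustration of that kind of metric-driven selection, here is a minimal sketch. It assumes one particular policy, keeping the examples with the worst metric score for the next epoch, which is only one of several ways this could be done. The name `select_hardest` and the `keep_fraction` parameter are hypothetical, and the scores are assumed to be per-example values such as WER computed during the previous epoch's evaluation.

```python
from typing import List, Sequence


def select_hardest(scores: Sequence[float], keep_fraction: float) -> List[int]:
    """Return indices of the examples with the highest (worst) scores.

    scores[i] is an error-like metric for example i, higher meaning worse.
    keep_fraction is the share of the dataset to train on next epoch.
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    n_keep = max(1, round(len(scores) * keep_fraction))
    # Rank example indices from worst score to best score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # Return the kept indices in dataset order for easier batching.
    return sorted(ranked[:n_keep])


# Example: four utterances with per-utterance WER from the last evaluation.
next_epoch_indices = select_hardest([0.1, 0.9, 0.4, 0.7], keep_fraction=0.5)
```

Re-scoring after every epoch lets the selected subset shift as the model improves, at the cost of an extra decoding pass over the data.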

I wonder if you could make that parameter trainable instead of using a hyperparameter search for it.

For phonetic similarity I've been playing with a dual objective system that could be promising.