| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by vjerancrnjak 637 days ago

This is equivalent to the problem of maximum entropy Markov models and their application to sequence output.

After some point you’re conditioning your next decision on tokens that are severely out of the learned path and you don’t even see it’s that bad.

Usually this was fixed with cost sensitive learning or increased sampling of weird distributions during learning and then making the model learn to correct the mistake.

Another approach was to have an inference algorithm that maximize the output probability, but these algorithms are expensive (viterbi and other dynamic programming methods).

Feature modeling in NNs somewhat allowed us to ignore these issues and get good performance but they will show up again.