by aDyslecticCrow 622 days ago
LSTM and GRU did not quite solve the issue, but they made it less bad. Overall, recurrent units are notoriously prone to vanishing and exploding gradients.
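
To make that concrete, here is a toy sketch of my own (not from any particular paper or library). It assumes the simplest possible recurrent unit, a scalar linear recurrence h_t = w * h_{t-1}, so the gradient flowing back through T steps gets multiplied by w exactly T times. The function name `backprop_scale` and the weights are made up for illustration; real LSTM/GRU cells have gates and nonlinearities that this ignores.

```python
def backprop_scale(w: float, steps: int) -> float:
    """Factor applied to a gradient after it flows back through `steps`
    time steps of the scalar recurrence h_t = w * h_{t-1} (toy assumption)."""
    scale = 1.0
    for _ in range(steps):
        # Each step back in time multiplies by dh_t/dh_{t-1} = w.
        scale *= w
    return scale


if __name__ == "__main__":
    # Weights slightly below, at, and slightly above 1 over 100 time steps.
    for w in (0.9, 1.0, 1.1):
        print(f"w={w}: gradient scaled by {backprop_scale(w, 100):.3e}")
```

With w a bit below 1 the factor collapses toward zero (vanishing), and with w a bit above 1 it blows up (exploding). Gating, as I understand it, mostly helps by giving the gradient a path whose per-step factor can stay close to 1, which is why it makes the problem less bad rather than gone.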

I don't want to downplay the value of these models. Some people seem to be under the impression that transformers replaced them or made them obsolete, which is far from the truth.