by frisco 3983 days ago

My problem isn't that the feature engineering is expensive or tedious; it's that it privileges a lot of information that NNs learn from the data on their own. Yeah, OK, Markov models (n-grams) are simple and fast, and they produce good results for generating representative text.

Deep RNNs are simple and produce good results for a huge, diverse range of problems with no new domain information. As Andrej Karpathy wrote:

> Sometimes the ratio of how simple your model is to the quality of the results you get out of it blows past your expectations, and this was one of those times.

N-grams don't have nearly the power (e.g., they can't capture structure that spans more than N tokens, like grammar) and don't generalize nearly as well, which makes their results a lot less surprising.
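To make the context limit concrete, here is a minimal sketch of a character-level n-gram generator. It is my own illustration, not code from Karpathy's post, and the names `train_ngram` and `generate` are made up for this example.

```python
import random
from collections import defaultdict


def train_ngram(text, n):
    """Map each (n-1)-character context to the characters seen right after it."""
    model = defaultdict(list)
    for i in range(len(text) - n + 1):
        context = text[i:i + n - 1]
        model[context].append(text[i + n - 1])
    return model


def generate(model, seed, n, length, rng):
    """Extend `seed` by sampling one character at a time from the model."""
    out = seed
    for _ in range(length):
        # Only the last n-1 characters are consulted; everything earlier is ignored.
        context = out[-(n - 1):] if n > 1 else ""
        choices = model.get(context)
        if not choices:
            break  # context never seen in training, so the model has nothing to say
        out += rng.choice(choices)
    return out


if __name__ == "__main__":
    corpus = "the cat sat on the mat. the dog sat on the log. "
    model = train_ngram(corpus, n=4)
    print(generate(model, "the", n=4, length=60, rng=random.Random(0)))
```

The lookup key is always just the last n-1 characters, so anything further back (an unclosed quote or bracket, the subject of the sentence) is invisible when the next character is picked. The `break` shows the generalization problem too: a context that never appeared in the training text gives the model nothing to sample from.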