by rich_sasha
1730 days ago
It depends. It really doesn’t take that much data to train a pretty stunning (if simple) character-level RNN “language model” that beats any n-gram model. Same goes for MNIST. ANNs really are a useful tool for a vast class of problems, many of which can be solved with comparatively little data. Maybe your point stands and it’s just that some domains need less data; just saying.
For sure, it all depends on how robust the model needs to be, how strongly it needs to generalize. If your dataset covers the entire domain, you don't need a robust model. If you need strong generalization, then you need to build in stronger priors.
Take f(x) = x^2. If your model only needs to work on a finite interval, you just need a decent sample that covers that interval. But if it needs to generalize outside that interval, no number of parameters will give you good performance. Far enough outside the interval, the NN will be either constant (with sigmoid activations, which saturate) or linear (with ReLU-type activations, which are piecewise linear).
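To make the last point concrete, here is a minimal sketch in NumPy. It assumes a one-hidden-layer network with a linear output and random, untrained weights; the width `H`, the weight ranges, and the names `relu_net` and `sigmoid_net` are just illustrative choices. The claim is structural, so training changes the numbers but not the shape of the tails: past the last ReLU kink the output is exactly affine in x, and far from the origin the sigmoid units saturate, so the output is flat.

```python
import numpy as np

rng = np.random.default_rng(0)
H = 16  # hidden width (arbitrary, for illustration)

# Random weights for a 1 -> H -> 1 network. The input weights are kept
# away from zero so every unit has a finite kink / saturation point.
w1 = rng.choice([-1.0, 1.0], size=H) * rng.uniform(0.5, 2.0, size=H)
b1 = rng.normal(size=H)
w2 = rng.normal(size=H)
b2 = rng.normal()

def relu_net(x):
    """One hidden ReLU layer with a linear output; x is a 1-D array."""
    h = np.maximum(0.0, np.outer(x, w1) + b1)
    return h @ w2 + b2

def sigmoid_net(x):
    """Same architecture with sigmoid hidden units."""
    z = np.outer(x, w1) + b1
    h = 0.5 * (1.0 + np.tanh(0.5 * z))  # numerically stable sigmoid
    return h @ w2 + b2

# Each ReLU unit changes slope only at x = -b1 / w1. Beyond the
# outermost kink no unit switches on or off, so the net is a*x + c.
kinks = -b1 / w1
x_far = np.max(np.abs(kinks)) + np.array([1.0, 2.0, 3.0, 4.0])

# Second differences on an evenly spaced grid vanish for an affine function.
relu_second_diff = np.diff(relu_net(x_far), n=2)

# Far from the origin every sigmoid unit sits at 0 or 1: the output is constant.
sigmoid_far = sigmoid_net(np.array([1e3, 1e4]))

print("ReLU second differences past the last kink:", relu_second_diff)
print("Sigmoid net at x = 1e3 and x = 1e4:", sigmoid_far)
print("Gap to x^2 on the same points (ReLU net):", x_far**2 - relu_net(x_far))
```

Neither tail can follow x^2, whose second difference on a unit grid is the constant 2, so the gap keeps growing however many hidden units you add.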