by YeGoblynQueenne
251 days ago
>> Now we don't know if we're generalising or memorising. "Now" starts around 1980 I'd say. Everyone in the field tweaks their models until they perform well on the "held-out" test set, so any ability to estimate generalisation from test-set performance goes out the window. The standard 80/20 train/test split makes it even worse. I personally find it kind of scandalous that nobody wants to admit this in the field and yet many people are happy to make big claims about generalisation, like e.g. the "mystery" of generalising overparameterised neural nets.

You can still game those benchmarks (by tuning your hyperparameters after looking at test results), but that setting measures generalisation on the test set _given_ the specified training set. Using any additional data should be against the benchmark rules, and such results should not be compared on equal terms.
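To make the "tune after looking at test results" problem concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: the labels are coin flips, so there is nothing to learn, and each "model" simply predicts the label from one random binary feature. The names `make_data`, `accuracy` and `tune_on_test` are hypothetical, not taken from any benchmark or library.

```python
import random


def make_data(rng, n, n_features):
    """Random binary features with coin-flip labels: nothing to learn."""
    xs = [[rng.getrandbits(1) for _ in range(n_features)] for _ in range(n)]
    ys = [rng.getrandbits(1) for _ in range(n)]
    return xs, ys


def accuracy(feature, xs, ys):
    """Accuracy of the 'model' that predicts the label as x[feature]."""
    return sum(x[feature] == y for x, y in zip(xs, ys)) / len(ys)


def tune_on_test(seed=0, n_features=200, n_test=100, n_fresh=2000):
    """Pick the best of n_features 'models' by test accuracy, then re-check it."""
    rng = random.Random(seed)
    test_xs, test_ys = make_data(rng, n_test, n_features)
    fresh_xs, fresh_ys = make_data(rng, n_fresh, n_features)
    # The "hyperparameter search": choose whatever scores best on the test set.
    best = max(range(n_features), key=lambda j: accuracy(j, test_xs, test_ys))
    return accuracy(best, test_xs, test_ys), accuracy(best, fresh_xs, fresh_ys)


if __name__ == "__main__":
    test_acc, fresh_acc = tune_on_test()
    print(f"test accuracy after tuning: {test_acc:.2f}")
    print(f"accuracy on fresh data:     {fresh_acc:.2f}")
```

The selected "model" should score well above chance on the test set it was picked on and roughly at chance on fresh data, which is the sense in which a test score stops estimating generalisation once it has been used for selection.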