by refreshingdrink
732 days ago
Also worth noting that Ryan mentions

> In addition to iterating on the training set, I also did a small amount of iteration on a 100 problem subset of the public test set

and

> it's unfortunate that these sets aren’t IID: it makes iteration harder and more confusing

It’s not unfortunate: generalizing beyond the training distribution is a crucial part of intelligence that ARC is trying to measure! Among other reasons, developing with test-set data is a bad practice in ML because it hides the difficulty of this challenge. Even worse, writing about a bunch of tricks that help results on this subset is extending the test-set leakage to the blog post's readers. This is why I'm glad the ARC Prize has a truly hidden test set.
|
Because the thing we have now is data-hungry. Your brain is pre-trained on other similar challenges as well. What's the point of requiring it to "generalize beyond the training distribution" with so few samples?
Really, I thought LLMs ended this "can we pretrain on in-house prepared private data for ILSVRC" flame war already.