|
|
|
|
|
by fchollet
585 days ago
|
|
Right, one rather uninteresting line of approaches to ARC consists of trying to anticipate what might be in the test set, by generating millions of synthetic tasks. This can only work on relatively simple tasks, since the chance of task collision (between the test set and what you generate) is very low for any sophisticated task. ARC 2 will improve on ARC 1 by making tasks less brute-forceable (both in the sense of making in harder to find the solution program by generating random programs built on a DSL, and in the sense of making it harder to guess the test tasks via brute force task generation). We'll keep the human facing difficulty roughly constant, which will be controlled via human testing. |
|
Any plan for more validation data to match the new harder testset?