Hacker News new | ask | show | jobs
by padswo1 566 days ago
I don’t think ARC has particularly advanced the research. The approaches that are successful were developed elsewhere and then applied to ARC. Happy to be shown somewhere this is not the case.

In the case of TTT, I wouldn’t really describe that as a ‘new AGI reasoning approach’. People have been fine tuning deep learning models on specific tasks for a long time.

The fundamental instinct driving the creation of ARC - that ‘deep learning cannot do system 2 thinking’, is under threat of being proven wrong very soon. Attempts to define the approaches that are working as somehow not ‘traditional deep learning’ really seem like shifting the goal posts.

1 comments

Correct, fine-tuning is not new. It's long been used to augment foundational LLMs with private data. Eg. private enterprise data. We do this at Zapier, for instance.

The new and surprising thing about test-time training (TTT) is how effective it is an approach to deal with novel abstract reasoning problems like ARC-AGI.

TTT was pioneered by Jack Cole last year and popularized this year by several teams, including this winning paper: https://ekinakyurek.github.io/papers/ttt.pdf

How is TTT anything other than a deep learning algorithm? We have a deep learning model, we generate training data based on an example and use a stochastic gradient descent to update the model weights to improve its predictions according to the training data. This is a classic DL paradigm. I just don’t see why would you consider this an advancement if you your goal is to move “beyond” deep learning.