

by jvanderbot 373 days ago

"The apple study" is being overblown too, but here it is: https://machinelearning.apple.com/research/illusion-of-think...

The crux is that beyond a bit of complexity the whole house of cards comes tumbling down. This is trivially obvious to any user of LLMs who has trained themselves to use LLMs (or LRMs in this case) to get better results ... the usual "But you're prompting it wrong" answer to any LLM skepticism. Well, that's definitely true! But it's also true that these aren't magical, intelligent, subservient, omniscient creatures, because that would imply that they would learn how to work with you. And before you say "moving the goalposts", remember: this is essentially what the world thinks it is being sold.

It can be both breathless hysteria and an amazing piece of revolutionary and useful technology at the same time.

The training set argument is just a fundamental misunderstanding, yes, but you should think about the flip side: can an LLM do well on things that are _inside_ its training set? This paper does use examples that are present all over the internet, including solutions. Things children can learn to do well. Figure 5 is a good figure to show the collapse in the face of complexity. We've all seen that when tearing through a codebase or trying to "remember" old information.


tough 373 days ago

I think Apple published that study right before WWDC to have an excuse not to offer local foundation models bigger than 3B, and to force you to go through their cloud for harder reasoning tasks.

These are beta APIs, so things are still in flux, but that's my impression after playing with them; the paper makes much more sense in that context.
