by trimethylpurine 145 days ago
That doesn't appear to be what happened. But the marketing sure has a lot of people quick to presume so.

I would guess it's only a matter of days before that proof, or one very similar, is found in the training data, if that hasn't happened already, just as has been the case every time.

No fundamental change in how the LLM functions has been made that would lead us to expect otherwise.

Similar "discoveries" occurred all the time with the dawn of the internet connecting the dots on a lot of existing knowledge. Many people found that someone had already solved many problems they were working on. We used to be able to search the web, if you can believe that.

The LLMs are bringing that back in a different way. It's functional internet search with an uncanny language model that, sadly, obfuscates the underlying data while guessing at how to summarize it (which makes it harder to tell which of its findings are valuable and which are not).

It's useful for some things, but that's not remotely what intelligence is. It doesn't literally understand.

> *if you bring a GPT5-class LLM, you can walk away with a gold medal without having any idea what you're doing.*

My money won't be betting on your GPT5-class business advice unless you have a really good idea what you're doing.

It requires some (a lot of) intelligence and experience to usefully operate an LLM in virtually every real world scenario. Think about what that implies. (It implies that it's not by itself intelligent.)


You need to read the IMO papers, seriously. Your outlook on what happened there is grossly misinformed. No searching or tool use was involved.

You cannot bluff, trick, or "market" your way through a test like that.

I didn't say anything about cheating. In fact, if it did cheat, that would make for a much stronger argument in your favor.

If scoring highly on an exam implies intelligence, then certainly I'm not intelligent, and the Super Nintendo from the 90s is more sentient than I am, given that I'm terrible at chess.

I personally don't agree with that definition, nor does any dictionary I'm familiar with, nor do any software engineers with whom I'm familiar, nor any LLM specialists, including the forefront developers at OpenAI, xAI, Google, etc. as far as I'm aware.

But for some reason (a very obvious one: $$$), marketers, against the engineers' protests, appear to be claiming otherwise.

This is what you're up against and what you'll find the courts, and lawyers, will go by when this comparison comes to a head.

Personally, I can't wait for this to happen.

I'd be thrilled to know if I shouldn't wait for that. If you're directly involved with some credible research to the contrary, I would love to hear more.

But IMO, in this case at least, has nothing to do with intelligence. It's performing a search against its own training data, and piecing together a response in line with that data, while including the context of the search term (aka the question). This is run through a series of linear regressions, and a response is produced. There is nothing really groundbreaking here, as best I can tell.

These arguments usually seem to come down to disagreements about definitions, as you suggest. You've talked about what you don't consider evidence of intelligence, but you haven't said anything about the criteria you would apply. What evidence of intelligent reasoning would change your mind?

It is unsupportable to claim that ML researchers at leading labs share your opinion. Since roughly 2022, they have understood that they are working with systems capable of reasoning: https://arxiv.org/abs/2205.11916

Based on an English dictionary definition, I would expect that an intelligence exhibits understanding, wouldn't you? I would hope people are reading the dictionary before they market a multibillion dollar product set to reach the masses. It seems irresponsible not to.

The article you linked discussed reasoning. That's really cool. But consider that we can say that a chess game computer opponent is reasoning. It's using a preprogrammed set of instructions to look some number of possible moves ahead and choose the most reasonable one. Essentially a calculator, it is in fact reasoning. But that doesn't have much to do with intelligence. As we read in the dictionary, intelligence implies understanding, and we certainly can't say that the Chess Masters opponent from the Super Nintendo literally understands me, right?
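For anyone who hasn't seen what "predict some number of moves ahead and pick the most reasonable" looks like in code, here is a minimal sketch of fixed-depth minimax. To keep it short it plays a toy take-away game (remove 1 to 3 stones, whoever takes the last stone wins) rather than chess; the game, the function names, and the scoring scheme are all assumptions made for illustration, not how any particular chess program works.

```python
def legal_moves(stones: int) -> list[int]:
    """Toy game (an assumption for illustration): take 1, 2, or 3 stones."""
    return [take for take in (1, 2, 3) if take <= stones]


def minimax(stones: int, depth: int, maximizing: bool) -> int:
    """Score a position from the maximizing player's point of view.

    +1 means the maximizer wins, -1 means the maximizer loses,
    0 means the lookahead horizon was reached with no result known.
    """
    if stones == 0:
        # The player who just moved took the last stone and won.
        return -1 if maximizing else 1
    if depth == 0:
        return 0  # out of lookahead; a real engine would use a heuristic here
    scores = [
        minimax(stones - take, depth - 1, not maximizing)
        for take in legal_moves(stones)
    ]
    return max(scores) if maximizing else min(scores)


def best_move(stones: int, depth: int) -> int:
    """Pick the move whose resulting position scores best for the mover."""
    return max(
        legal_moves(stones),
        key=lambda take: minimax(stones - take, depth - 1, False),
    )


if __name__ == "__main__":
    print(best_move(5, 4))
```

The whole procedure is an exhaustive, rule-driven search over future positions up to `depth` plies, which is the sense of "reasoning" being discussed here.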

More to the point, I don't see that any LLM has thus far exhibited even a remote inkling of understanding, nor can it. It's a linear regression calculator, much like a lot of TI-84 graphing calculators running linear algebraic functions on a grand scale. It's impressive that basic math can achieve results across word archives that sound like a person, but it's still not understanding what it outputs, or really what it inputs either, beyond graphing it algebraically.

It doesn't literally understand. So, it is not literally intelligent, and it will require some huge breakthroughs to change that. I very much doubt that such a discovery will happen in our lifetime.

It might be more likely that the marketers will succeed in revising the dictionary. We've often seen that if you use a word wrongly enough, it becomes right. But so far at least, that hasn't happened with this word.

OK, now let's talk about what it means to "understand" something.

Let's say a kid who's not unusually gifted/talented at math somehow ends up at the International Math Olympiad. Smart-enough kid, regularly gets 4.0+ grades in normal high school classes, but today Timmy got on the wrong bus. He does have a great calculator in his backpack -- heck, we'll give him a laptop with Mathematica installed -- so he figures, why not, I'll take the test and see how it goes. Spoiler: he doesn't do so well. He has the tools, but he lacks understanding of how and when to apply them.

At the same time, the kid at the next desk also doesn't understand what's going on. She's a bright kid from a talented family -- in fact Alice's old man works for OpenAI -- but she's a bit absent-minded. Alice not only took the wrong bus this morning, but she grabbed the wrong laptop on the way out the door. She shrugs, types in the problems, and copies down what she sees on the screen. She finishes up, turns in the paper, and they give her a gold medal.

My point: any definition of "understanding" you can provide is worthless unless it can somehow account for the two kids' different experiences. One of them has a calculator that does math, the other has a calculator that understands math.

> *I very much doubt that such a discovery will happen in our lifetime.*

So did I, and then AlphaGo happened, and IMO a few years later. At that point I realized I wasn't very good at predicting what was and was not going to be possible, so I stopped trying.

Calculators do not understand math, while both kids understand each other and the world around them. The calculator relies on an external intelligence.

Don't stop trying. Predictability is an indicator of how well a theory describes the universe. That's what science is.

The engineers have long predicted this stuff. LLM tech isn't really new. The size and speed of the machines are new. The more you understand about a topic, the better your predictions.