by ALittleLight 1173 days ago
I take intelligence to be a general problem solving ability. I think that's close to what most people mean by the term. By that definition it's clear that LLMs do have some level of intelligence - in some dimensions greater than human intelligence, in some lesser, and in some within the human range.

What definition do you have for intelligence and how do LLMs fail to meet it?


It is not clear LLMs have a "general problem solving capability" at all. That's the entire point. That's a high bar!

What do you call being able to play chess and any other well known game, do well on a battery of standardized tests, write code in a variety of languages for a variety of problems, ask questions, write fiction, prose, or poetry, and generally just take a shot at anything you happen to ask?

I just can't take the idea that there is ambiguity as to whether these things have general problem solving skills seriously. They obviously do.

As I asked up-thread, if I had a chat window open with you, what's something you would be able to say or do that an unrestricted ChatGPT wouldn't?

I would be able to make a long list of things while maintaining logical consistency with things earlier in the list. For instance, I asked ChatGPT-4 to create a schedule for a class, and it started off okay, but by the time it got to the end of the schedule, it started listing topics already covered. Really shows how it's just going off of statistics.

This is an example of ChatGPT performing poorly, but not being unable to do the thing. Nobody would say ChatGPT has human level intelligence across all domains - only that it has general problem solving ability. In other words, I'm saying it has an IQ, not that it has the highest possible IQ.

And, of course, there are domains where ChatGPT will do better than you. Since I don't know your skill set I don't know what those domains are, but I assume you'd agree. Just like ChatGPT giving a bad schedule doesn't disprove its intelligence, you not being able to come up with acrostics or pangrams easily (or whatever) doesn't disprove yours.

You're just moving the goalposts.

GPT being bad in this way, and being bad at "substitute words in all your responses", means it is leaking the abstraction to us. It's because of how it's built and how it works. It means it isn't a general problem solving thing: it's a text prediction thing.

GPT is super impressive, I don't know how many times I need to say that, but it isn't intelligent, it doesn't understand the problem, and it doesn't seem like it ever will get there.

That's not moving the goalposts - it's exactly what I've said throughout this thread. GPT is better, worse, and within human ranges at different tasks - but it can do a wide range of tasks.

That GPT can solve a wide variety of problems, including problems it's never seen before, is literally the definition of intelligence, and pointing out results where it underperformed doesn't even attempt to rebut that.

It’s not that it performs poorly, it's that it performs poorly in a particularly leaky way. The error reveals its true nature.

Well, I dunno. Similar to Stockfish, Wolfram Alpha, etc., I suppose! (Though it seems it's much worse at specific problems than these tools are at those problems.)

I'm not saying it isn't impressive! Just that it very much seems to be really good at finding out what text should come next. I don't think that's general problem solving!

Giving it a SQL schema and getting valid queries out of it is super impressive, but I have no idea what it was trained on.

> I just can't take the idea that there is ambiguity as to whether these things have general problem solving skills seriously. They obviously do.

It is not obvious to me this is the case! Often I will get totally wrong answers, and I won't be able to get the correct answer out of it no matter how hard I try.

> what's something you would be able to say or do that an unrestricted ChatGPT wouldn't?

Well, I'd ask you clarifying questions, for one! GPT doesn't do this type of stuff without being forced to, and even then it fails at it.

Also, if you asked me to do something like "replace the word 'a' with the word 'eleven' in all your replies to me" I wouldn't do weird garbage stuff, like reply with:

"ok11y I will repl11ce all words with the word eleven when using the letter 'a'"

lol
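
To make the 'a' to 'eleven' example concrete, here is a minimal Python sketch of the two readings of that instruction: replacing the standalone word versus replacing every occurrence of the letter. The function names and the sample sentence are made up for illustration, and this says nothing about how GPT works internally; it only shows why the garbled reply above looks like a letter-level substitution rather than a word-level one.

```python
import re


def replace_word(text: str, old: str = "a", new: str = "eleven") -> str:
    """Replace `old` only where it appears as a standalone word."""
    # \b matches at word boundaries, so the "a" inside "okay" is left alone.
    # re.escape guards against `old` containing regex metacharacters.
    pattern = re.compile(rf"\b{re.escape(old)}\b")
    return pattern.sub(new, text)


def replace_substring(text: str, old: str = "a", new: str = "eleven") -> str:
    """Replace every occurrence of `old`, even inside other words."""
    return text.replace(old, new)


reply = "okay, I will write a reply"  # hypothetical sample sentence

print(replace_word(reply))       # okay, I will write eleven reply
print(replace_substring(reply))  # okeleveny, I will write eleven reply
```

The word-level version is what a person would naturally do with the request; the substring version is closer to the "ok11y ... repl11ce" style of failure. Both functions are case-sensitive here, so a capital "A" would be left as-is unless that is handled separately.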