Hacker News
by Closi 1180 days ago
GPT4 can clearly already reason IMO (I mean, it can play chess fairly well without ever being taught, and if you create a puzzle from scratch and give it to it, it can try to work it out and describe the logical approach it took). It’s definitely surprising that a next-word generator has developed the ability to reason, but I guess that’s where we are!

What is your definition of reasoning that you do not think GPT-4 would demonstrate signs of?

> What is your definition of reasoning that you do not think GPT-4 would demonstrate signs of?

Heh, there have been many attempts to define reasoning. I haven't seen a good one yet.

However, I'm going to throw my hat into the ring, so be on the lookout for a blog post with that. I've got a draft and a lot of ideas. I'm spending the time to make it good.

Well GPT4 certainly fulfils the existing definitions of reasoning, so maybe you should call your thing something else instead of redefining ‘reasoning’ to mean something different?

Otherwise it’s just moving the goalposts.

GPT4 is certainly not fulfilling the definition of reasoning. It's borrowing the intelligence of every human who wrote something that went into its model.

To demonstrate this, ask it to prove something that most or all people believe. Say some "intuitive" math thing. Perhaps the fact that factorial grows faster than exponential functions.

And no, don't just have it explain it, have it prove it, as in a full mathematical proof. Give it a minimal set of axioms to start with.

Merriam-Webster's definition of "reasoning" [1] says that reasoning is:

> the drawing of inferences or conclusions through the use of reason

So starting GPT4 off with some axioms would give it a starting point to base its inferences on.

Then, if it does prove it, take away one axiom. Since you started with a minimal set, it should now be impossible for GPT4 to prove that fact, and it should tell you this.

Having GPT4 prove something with as few axioms as possible, and also admit that it cannot prove something with too few axioms, is a great test of whether it is truly reasoning.

[1]: https://www.merriam-webster.com/dictionary/reasoning
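For reference, the factorial-versus-exponential claim has a short standard argument if ordinary real arithmetic is taken for granted. This is only a sketch under that assumption, not the minimal-axiom proof the test asks for. The symbols a, N, and n are introduced here for illustration: fix any base a > 0 and let N = ⌈2a⌉. Every factor k/a with k > N is then larger than 2, so for n > N:

```latex
\[
\frac{n!}{a^{n}}
  = \frac{N!}{a^{N}} \prod_{k=N+1}^{n} \frac{k}{a}
  \;\ge\; \frac{N!}{a^{N}} \cdot 2^{\,n-N}
  \;\xrightarrow[n \to \infty]{}\; \infty .
\]
```

So n! eventually exceeds any constant multiple of a^n, whatever the base a.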

For an AI to count as reasoning, it doesn’t have to be able to reason about everything at every level. Most humans can’t rediscover fundamental mathematical theorems from basic axioms, particularly if you keep removing axioms until they fail, but I don’t think that means most humans are unable to reason.

Take this problem instead, which certainly requires some reasoning to answer:

“Consider a theoretical world where people who are shorter always have bigger feet. Ben is taller than Paul, and Paul is taller than Andrew. Steve is shorter than Andrew. Everyone walks the same number of steps each day. All other things being equal, who would step on the most bugs and why?”

I think it’s a logical error to say “AI can’t reason about this, so that proves that it can’t reason about anything at all” (particularly if that example is something most humans can’t do!). The LLM’s reasoning is limited compared to human reasoning right now, although it is still definitely demonstrating reasoning.

> "Consider a theoretical world where people who are shorter always have bigger feet. Ben is taller than Paul, and Paul is taller than Andrew. Steve is shorter than Andrew. Everyone walks the same number of steps each day. All other things being equal, who would step on the most bugs and why?"

Because Ben is the tallest, his feet are the biggest, and because he takes the same amount of steps as the others, the amount of area he steps on is larger than the area that the others step on.

Therefore Ben is most likely to be the one to step on the most bugs.

Easy. And I'm not brilliant.

The problem with testing these tools is that you need to ask them a question that is not in their training sets. Most things have been proven, so if a proof is in its training set, the LLM just regurgitates it.

But I also disagree: if the "AI" can't reason about that, it can't reason, because that one is so simple my pre-kindergarten nieces and nephews can do it.

But even if not, the LLMs should have "knowledge" about exponential functions and factorial because the humans who wrote the material in their training sets did. So it's not a lack of knowledge.

And I claim that most humans could rediscover theorems from basic axioms; you've just never asked them to.

“In this theoretical world, shorter people have bigger feet. Given the information provided, we can deduce the following height order:

Ben (tallest)
Paul
Andrew
Steve (shortest)

Since shorter people have bigger feet in this world, we can also deduce the following order for foot size:

Steve (biggest feet)
Andrew
Paul
Ben (smallest feet)

Assuming that everyone walks the same number of steps each day and all other things being equal, the person with the biggest feet would be more likely to step on the most bugs simply because their larger foot size would cover a greater surface area, increasing the likelihood of coming into contact with bugs on the ground.

Therefore, Steve, who is the shortest and has the biggest feet, would step on the most bugs.”

GPT4 solved it correctly. You didn’t.
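
The deduction in the quoted answer can also be checked mechanically. Below is a minimal sketch, assuming the "taller than" statements form a single chain as they do in the puzzle. The names `taller_than`, `height_order`, and `foot_size_order` are made up for this illustration.

```python
# Hypothetical encoding of the puzzle: each pair means (taller, shorter).
# "Steve is shorter than Andrew" becomes ("Andrew", "Steve").
taller_than = [("Ben", "Paul"), ("Paul", "Andrew"), ("Andrew", "Steve")]


def height_order(pairs):
    """Return names from tallest to shortest.

    Assumes the pairs form one unbroken chain (a strict total order),
    which holds for this puzzle but not for arbitrary input.
    """
    next_shorter = dict(pairs)  # person -> the next shorter person
    # The tallest person is the only one never listed as the shorter one.
    tallest = (set(next_shorter) - set(next_shorter.values())).pop()
    order = [tallest]
    while order[-1] in next_shorter:
        order.append(next_shorter[order[-1]])
    return order


heights = height_order(taller_than)

# Puzzle premise: shorter people always have bigger feet,
# so the foot-size ranking is the height ranking reversed.
foot_size_order = list(reversed(heights))

# Same number of steps for everyone, so biggest feet cover the most ground.
most_bugs = foot_size_order[0]

print("Tallest to shortest:", heights)
print("Steps on the most bugs:", most_bugs)
```

The only step that matters is the reversal: the puzzle inverts the usual height-to-foot-size relation, so the answer follows the shortest person rather than the tallest.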