

	by xyzzy123 1120 days ago
	If someone can show GPT-4 is "reasoning" (for some meaningful definition of that) in specific scenarios, surely counter-examples do not disprove this.

2 comments

chaxor 1120 days ago

There is already substantial work showing reasoning capabilities in GPT-4: these models reason extremely well, with near-human performance on many causal reasoning tasks. (1) Additionally, there is a mathematical proof that these systems align with dynamic programming, and therefore can do algorithmic reasoning. (2,3)

1) https://arxiv.org/abs/2305.00050.pdf
2) https://arxiv.org/pdf/1905.13211.pdf
3) https://arxiv.org/pdf/2203.15544.pdf

pas 1119 days ago

is GPT4 a graph neural network? also, isn't it training time and data dependent how big (how many tokens) a problem it can tackle?

so it's great that it can reason better than humans on small-medium problems already well trained for, but so far Transformers are not reasoning (not doing causal graph analysis, or not even doing zeroth-order logic); they are eerily good at writing text that has the right keywords. and of course it's very powerful and probably will be useful for many applications.

chaxor 1119 days ago

They are GNNs with attention as the message passing function and additional concatenated positional embeddings. As for reasoning, these are not quite 'problems well-trained for', in the sense that they're not in the training data. But they are likely problems that have some abstract algorithmic similarity, which is the point.
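To make the "attention as message passing" view concrete, here is a minimal sketch. It assumes a single attention head on a fully connected token graph, with no learned projections and no positional embeddings. These are simplifications for illustration, not a description of how GPT-4 is actually implemented, and the function name `attention_message_passing` is made up for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attention_message_passing(queries, keys, values):
    """One round of message passing on a fully connected graph.

    Each token is a node. Node j sends its value vector to node i,
    weighted by softmax_j(q_i . k_j / sqrt(d)). Node i's new state is
    the weighted sum of all incoming messages.
    (Hypothetical helper: no learned weights, single head.)
    """
    d = len(keys[0])
    width = len(values[0])
    updated = []
    for q in queries:
        # Edge weights from every node j to the current node.
        weights = softmax([dot(q, k) / math.sqrt(d) for k in keys])
        # Aggregate the incoming messages component by component.
        updated.append(
            [sum(w * v[c] for w, v in zip(weights, values)) for c in range(width)]
        )
    return updated


# Toy usage: two tokens with 2-dimensional states.
states = attention_message_passing(
    queries=[[1.0, 0.0], [0.0, 1.0]],
    keys=[[1.0, 0.0], [0.0, 1.0]],
    values=[[1.0, 2.0], [3.0, 4.0]],
)
```

The only difference from a typical GNN layer in this sketch is that the edge weights are computed from the node states themselves rather than taken from a fixed adjacency matrix.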

I'm not quite sure what you mean by saying they cannot do causal graph analysis, since that was one of the many tasks covered by the reasoning studies in the paper I mentioned. In fact it may have been the best performing task. Perhaps try checking the paper again - it's quite a lot of experiments and text, so it's understandable to not ingest all of it quickly.

In addition, if you're interested in seeing further evidence of algorithmic reasoning capabilities occurring in transformers, Hattie Zhou has a good paper on that as well. https://arxiv.org/pdf/2211.09066.pdf

The story is really not shaping up to be 'stochastic parrots' if any real deep analysis is performed. The only way that I see someone could reach such a conclusion is if they are not an expert in the field, and simply glance at the mechanics for a few seconds and try to ham-handedly describe the system (hence the phrase: "it just predicts next token"). Of course, this is a bit harsh, and I don't mean to suggest that these systems are somehow performing similar brain-like reasoning mechanisms (whatever that may mean) etc, but stating that they cannot reason (when there is literature on the subject) because 'it's just statistics' is definitely not accurate.

pas 1119 days ago

> they cannot do causal graph analysis

I mean the ANN in the inference stage when run does not draw up a nice graph, doesn't calculate weights, doesn't write down pretty little Bayesian formulas, it does whatever is encoded in the matrices-innerproduct-context.

And it's accurate in a lot of cases (because there's sufficient abstract similarity in the training data), and that's what I meant by "of course it'll likely be useful in many cases".

At least this is my current "understanding", I haven't had time to dig into the papers unfortunately. Thanks for the further recommendation!

What seems very much missing is characterizing the reasoning that is going on. Its limitations, functional dependencies, etc.

jxf 1120 days ago

If a counterexample to a specific claim doesn't disprove the claim, that sometimes suggests the claim is unfalsifiable and therefore suspect.

fasterik 1120 days ago

The claim is that GPT-4 can reason sometimes. Evidence that GPT-4 fails to reason sometimes isn't a counterexample.

krainboltgreene 1120 days ago

The people who made GPT-4 have said it does not reason, please for the love of god drop this nonsense.

fasterik 1120 days ago

I never said that it does. I was pointing out a logical flaw in that person's argument. Also, why are the creators of GPT-4 authorities on what does or doesn't count as reasoning?

travisjungroth 1120 days ago

It’s suspect until it’s demonstrated. Once someone has demonstrated it, counterexamples are meaningless.

I claim I can juggle. I pick up three tennis balls and juggle them. You hand me three basketballs. I try and fail. My original claim, that I can juggle, still stands.

physPop 1120 days ago

I disagree -- that would disprove your claim, as your claim was too broad. Same if they handed you chainsaws or elephants, or seventy-two tennis balls.

The more correct claim is you can juggle [some small number of items with particular properties].

travisjungroth 1120 days ago

Normal English implies that you can do something, not everything. It’s an any versus all distinction, and all is totally unreasonable except for the most formal circumstances.

“Can you ride a bike?”

“Yeah.”

“Prove it. Here I have the world’s smallest bicycle.” <- this person is not worth your time and attention.
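The any-versus-all distinction can be sketched in a couple of lines, reusing the juggling example from upthread. The `attempts` dictionary is illustrative data, not a measurement of anything.

```python
# Outcomes of individual juggling attempts (illustrative data from the
# tennis ball / basketball example in this thread).
attempts = {
    "three tennis balls": True,
    "three basketballs": False,
}

# "I can juggle" in normal English: at least one attempt succeeds.
can_juggle_something = any(attempts.values())

# The much stronger claim: every attempt succeeds.
can_juggle_everything = all(attempts.values())

print(can_juggle_something)   # True
print(can_juggle_everything)  # False
```

A single failed attempt flips the `all` result but leaves the `any` result untouched, which is why the basketballs only count against the stronger claim.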

alienicecream 1120 days ago

It can't even count reliably. And this is a computer, not a human. That is one of the simplest things a computer should be able to do. It can't count because it doesn't know what counting is, not because it's unreliable in the way a human would be when counting. You cannot reason if you do not understand the concepts you are working with. The result is not the measure of success here, because it is good at mimicking, but when it fails at such a basic computing task, you can reasonably conclude it has no idea what it's doing.

travisjungroth 1120 days ago

Think about it step by step. There are people not able to count. We still say they can reason. A low ability to count does not disprove reasoning.

pxc 1120 days ago

That's because

> I can juggle

is here shorthand for

> I can juggle at all; I can juggle at least some things

and the basketball case is only a counterexample to the much stronger claim

> I can juggle anything

But the argument about AIs reasoning has little to do with such examples, because juggling is about the ability to complete the task alone. When it comes to reasoning there are questions about authenticity that don't have analogs in determining whether a person can juggle.

travisjungroth 1120 days ago

This thread is exactly such examples.

What would “not alone” mean? Do you think someone is passing it the answers? Of course it was trained, but that’s like cheating on a test by reading the material so you can keep a cheat sheet in your brain.