| HN Mirror

by Dzugaru 1217 days ago

It definitely can "understand" things in some way; however, I'm pretty sure ReAct or similar will give it just a nudge forward, and the underlying problem of it hallucinating and not being "lucid enough" is not so easily solved.

In the original ReAct paper it falls apart almost immediately in ALFWorld (this is a classical test for AI systems - to be able to reason logically - and it still isn't generally solvable due to combinatorial explosion).

For now it requires human correction, looped or not, or else it "diverges" (I like Yann LeCun's explanation [0]).

In my own experiments (I haven't played with LangChain or ReAct yet) it diverges irrecoverably pretty quickly. I was trying to explain elementary combinator theory to it, in the style of Raymond Smullyan and his birds [1], and it can't even prove the first theorem (despite being familiar with the book). A human can prove it knowing almost nothing about math whatsoever; maybe it will take a couple of days of thinking, but the correct proof is not that hard - just two steps.
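To make "just two steps" concrete, here is a sketch in Lean 4, assuming the theorem meant is the usual opening one from the book: if birds compose and a mockingbird exists, then every bird is fond of at least one bird. The names `app`, `comp` and `mock` are my own labels for illustration, not the book's notation.

```lean
-- `app A x` is the bird A's response on hearing the bird x.
theorem every_bird_is_fond_of_some_bird
    {Bird : Type} (app : Bird → Bird → Bird)
    -- Assumed condition 1: any two birds A, B have a composition C.
    (comp : ∀ A B : Bird, ∃ C : Bird, ∀ x : Bird, app C x = app A (app B x))
    -- Assumed condition 2: a mockingbird M exists, with M x = x x.
    (mock : ∃ M : Bird, ∀ x : Bird, app M x = app x x) :
    -- "A is fond of x" means A x = x.
    ∀ A : Bird, ∃ x : Bird, app A x = x := by
  intro A
  cases mock with
  | intro M hM =>
    -- Step 1: let C compose A with the mockingbird M.
    cases comp A M with
    | intro C hC =>
      -- Step 2: feed C to itself; C C = A (M C) = A (C C).
      refine ⟨app C C, ?_⟩
      have h : app C C = app A (app M C) := hC C
      rw [hM C] at h
      exact h.symm
```

So the whole proof is: compose A with the mockingbird, then apply the result to itself.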

[0] https://www.linkedin.com/posts/yann-lecun_i-have-claimed-tha...

[1] https://en.wikipedia.org/wiki/To_Mock_a_Mockingbird

2 comments

swid 1217 days ago

The exponential divergence formula is true but says less than you think - the same math would be true for human output; we are likely to find a few falsehoods in any nonfiction book for the same reason, right?
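For concreteness, a minimal sketch, assuming the formula in question is P(correct) = (1 - e)^n for an n-token answer where each token independently goes wrong with probability e. The function name and the 1% error rate are made up for illustration, and the independence assumption is exactly the part that gets criticized.

```python
def p_all_correct(per_token_error: float, n_tokens: int) -> float:
    """Probability that all n tokens are correct, assuming independent errors."""
    if not 0.0 <= per_token_error <= 1.0:
        raise ValueError("per_token_error must be in [0, 1]")
    if n_tokens < 0:
        raise ValueError("n_tokens must be non-negative")
    return (1.0 - per_token_error) ** n_tokens


if __name__ == "__main__":
    # Hypothetical 1% per-token error rate, at a few answer lengths.
    for n in (10, 100, 1000):
        print(n, p_all_correct(0.01, n))
```

Nothing here is specific to a language model: plug in a per-sentence error rate for a human author and the curve has the same shape.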


Dzugaru 1216 days ago

Yeah, I'm aware of the criticism of that tweet. It may not completely make sense mathematically, but I just liked it because it's how I feel about GPT-4.

It "diverges", while my human mind seemingly is different in some way - I can keep going at the math problem forever (or at least for much longer?) and I won't hallucinate incorrect proofs (or at least it's very unlikely, and I can keep re-checking them).

Of course this is all in the area of feelings and faith - we just don't know much about cognition, I guess.


macrolime 1216 days ago

It performs much better when you add reflection to ReAct.

https://arxiv.org/pdf/2303.11366.pdf
