| HN Mirror

	by js8 1189 days ago
	I don't accept that something is AGI unless it can solve general instances of SAT (satisfiability problem, not the school test). Also recognizing (formulating from the task) an instance in the first place would help too. To me, these are hallmarks of reason, and not available in LLMs, in fact probably impossible just with pattern recognition.
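For readers unfamiliar with SAT, here is a minimal sketch of what "solving an instance" means, assuming a DIMACS-style encoding where the integer k stands for "variable k is true" and -k for "variable k is false". The function name `brute_force_sat` and the two example formulas are illustrative and not from the thread. It tries every assignment, so it is only practical for a handful of variables.

```python
from itertools import product

def brute_force_sat(num_vars, clauses):
    """Return a satisfying assignment as {var: bool}, or None if none exists.

    clauses: list of clauses, each a list of non-zero ints (DIMACS-style):
    k means variable k is true, -k means variable k is false.
    """
    # Enumerate all 2**num_vars truth assignments.
    for values in product([False, True], repeat=num_vars):
        assignment = {i + 1: v for i, v in enumerate(values)}
        # A formula is satisfied when every clause has at least one true literal.
        if all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses):
            return assignment
    return None

# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
satisfiable = [[1, 2], [-1, 3], [-2, -3]]
# x1 and (not x1)
unsatisfiable = [[1], [-1]]

print(brute_force_sat(3, satisfiable))
print(brute_force_sat(1, unsatisfiable))
```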

3 comments

pillefitz 1189 days ago

Can humans solve more than the most trivial SAT problems? Keep in mind, AGI does not imply superhuman intelligence.

sterlind 1189 days ago

can you solve general instances of SAT?

can the average person?

js8 1189 days ago

With enough patience, yes.

For example: You have a goat, a wolf, a cabbage and you want to cross a river...

pillefitz 1189 days ago

How would you do it, tree search? If so, I tend to agree with your initial statement: one should be able to teach LLMs to apply simple heuristics before considering them AGI.

js8 1189 days ago

I don't know the answer to your question, "how to build AGI". If I had to guess, though, AGI will probably have a supervisor algorithm (trained by RL) that issues internal commands to pattern matchers (like GPT-4) to drive them to solve the problem. The supervisor algorithm will have very little knowledge about any specific domain (like language or world facts), only tacit knowledge about learning and reasoning, and how to do both economically.

So the supervisor algorithm will do the tree search if needed.

vanviegen 1189 days ago

I would be surprised if an LLM weren't able to do this the same way humans would: brute force with a couple of early backtracking conditions.

It would have to think out loud though.
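As a rough illustration of "brute force with a couple of early backtracking conditions", here is a small backtracking SAT search. It is a sketch under assumptions, not anyone's stated method: clauses are non-empty lists of non-zero ints (k for "variable k is true", -k for "false"), and the names `simplify` and `backtrack_sat` are hypothetical. The two early conditions are forced single-literal clauses and abandoning a branch as soon as some clause can no longer be satisfied.

```python
def simplify(clauses, lit):
    """Assume literal `lit` is true: drop satisfied clauses, shrink the rest.

    Returns None if some clause becomes empty, i.e. the branch is a dead end.
    """
    out = []
    for clause in clauses:
        if lit in clause:
            continue  # clause already satisfied
        reduced = [l for l in clause if l != -lit]
        if not reduced:
            return None  # every literal in this clause is now false
        out.append(reduced)
    return out

def backtrack_sat(clauses, assignment=None):
    """Return a (possibly partial) satisfying assignment, or None."""
    assignment = dict(assignment or {})
    if not clauses:
        return assignment  # nothing left to satisfy
    # Early condition 1: a one-literal clause leaves no choice.
    unit = next((c[0] for c in clauses if len(c) == 1), None)
    if unit is not None:
        choices = [unit]
    else:
        lit = clauses[0][0]
        choices = [lit, -lit]  # otherwise try both values
    for choice in choices:
        reduced = simplify(clauses, choice)
        if reduced is None:
            continue  # Early condition 2: conflict, back up immediately
        result = backtrack_sat(reduced, {**assignment, abs(choice): choice > 0})
        if result is not None:
            return result
    return None

print(backtrack_sat([[1, 2], [-1, 3], [-2, -3]]))
print(backtrack_sat([[1, 2], [-1, 2], [1, -2], [-1, -2]]))
```

Variables missing from the returned dictionary can take either value. Written out step by step, each recursive call is the kind of "thinking out loud" the comment describes.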

blueorange8 1189 days ago

GPT-4 can already solve that goat, wolf, and cabbage problem.

js8 1189 days ago

Yes, but does it solve it because it read the solution somewhere? Can it adapt the existing solution to a new variation? Can it solve a similar problem with different things? This is what humans do all the time.

ofrzeta 1189 days ago

It can solve the variation "You have a rabbit, a wolf, a haypile and you want to cross a chasm". What kinds of variations do you have in mind?

js8 1189 days ago

If I add a small condition that makes the solution impossible, will it recognize that? Will it recognize for your example that it's a variation? Will it still be able to solve it when it is just a subtask of a bigger input?

If I ask it a leading question that intentionally relies on a wrong solution, will it recognize that?
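To make the "impossible variation" test concrete, here is a sketch of a breadth-first search over river-crossing states, assuming the usual rules (the boat carries the farmer plus at most one item, and conflicting items may not be left together unattended). The function `solve_crossing` and its parameters are hypothetical names for illustration. An exhaustive search like this returns `None` when an added condition makes the puzzle unsolvable, and renaming the items (rabbit, wolf, haypile) changes nothing about the search.

```python
from collections import deque

def solve_crossing(items, conflicts):
    """Shortest list of boat trips (cargo name or None), or None if impossible.

    items: names of the things to ferry across.
    conflicts: pairs that must not share a bank while the farmer is away.
    """
    items = frozenset(items)
    conflicts = [frozenset(pair) for pair in conflicts]

    def safe(bank):
        return not any(pair <= bank for pair in conflicts)

    # State: (items still on the starting bank, farmer on the starting bank?)
    start, goal = (items, True), (frozenset(), False)
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            path = []
            while parent[state] is not None:
                state, cargo = parent[state]
                path.append(cargo)
            return path[::-1]
        left, farmer_left = state
        here = left if farmer_left else items - left
        for cargo in [None, *sorted(here)]:  # cross alone or with one item
            moved = frozenset() if cargo is None else frozenset([cargo])
            new_left = left - moved if farmer_left else left | moved
            # The bank the farmer just left is the unattended one.
            unattended = new_left if farmer_left else items - new_left
            new_state = (new_left, not farmer_left)
            if safe(unattended) and new_state not in parent:
                parent[new_state] = (state, cargo)
                queue.append(new_state)
    return None  # every reachable state explored without reaching the goal

classic = solve_crossing(["goat", "wolf", "cabbage"],
                         [("wolf", "goat"), ("goat", "cabbage")])
impossible = solve_crossing(["goat", "wolf", "cabbage"],
                            [("wolf", "goat"), ("goat", "cabbage"),
                             ("wolf", "cabbage")])
print(classic)
print(impossible)
```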

antibasilisk 1189 days ago

Del Spooner getting offended panel

siva7 1189 days ago

That's fine. It may not be the IQ-180 AGI your demands call for, but if we're honest, it's pretty close.

js8 1189 days ago

Feynman had a really nice story about how he was into puzzles when he was at Princeton. It took him a while to solve the new ones, but eventually he learned all the well-known instances so he could answer instantly. It made him a genius in other people's eyes.

All I want from AGI is for it to demonstrate that it can solve straightforward logic problems (puzzles, if you will) that it provably hasn't seen before. Or at least recognize that it is being indirectly given such a task. So far, the evidence suggests it is not capable of that.

cjbprime 1189 days ago

There's a 150-page paper solely to describe instances of it doing that. It's the article attached to this comment thread.

js8 1189 days ago

Well, it's about the standard of proof. When I say "demonstrate", I don't mean just experimentally; I mean theoretically, showing that the algorithm is capable of reasoning about potentially arbitrarily large instances of puzzles.

That's what the experiments have shown: once the unseen instance gets large enough, the LLM's reasoning breaks down. This is not the case with humans, who can, as noted elsewhere, do a tree search, form hypotheses, etc.

antibasilisk 1189 days ago

The paper in question demonstrates it doing exactly this with varying success.
