


	by freejazz 982 days ago
	AI can reason! Just not reasonably!

1 comments

jiggawatts 982 days ago

It can reason better than most humans put into the same situation.

This problem doesn't result in a constant value, it results in a 3D probability distribution! Very, very few humans could work that out without tools. (I'm including pencil and paper in "tools" here.)

With only a tiny bit of coaxing, GPT 4 produced an animated video of the solution!

Try to guess what fraction of the general population could do that at all. Also try to estimate what fraction of general software developers could solve it in under an hour.


Jensson 982 days ago

A human could reach a valid end state most of the time; GPT-4 seems to get it wrong more often than it gets it right, based on the examples posted here. So to me it seems like GPT-4 is worse than humans.

GPT-4 with help from a competent human will of course do better than most humans, but that isn't what we are discussing.


jiggawatts 982 days ago

> valid end state most of the time

I disagree. Don't assume "most humans" are anything like Silicon Valley startup developers. Most developers out there in the wild would definitely struggle to solve problems like this.

For example, a common criticism of AI-generated code is the risk of introducing vulnerabilities.

I just sat in a meeting for an hour, literally begging several developers to stop writing code vulnerable to SQL injection! They just couldn't understand what I was even talking about. They kept trying to use various ineffective hacky workarounds ("silver bullets") because they just didn't grok the problem.
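For anyone unsure what the problem looks like, here is a minimal sketch in Python using the standard-library `sqlite3` module. The table, the function names, and the payload are all made up for illustration; they are not from the meeting described above. The point is only that concatenating user input into the SQL string lets the input change the query, while a bound parameter is always treated as data.

```python
import sqlite3


def make_demo_db():
    """Build a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def find_user_unsafe(conn, name):
    # Vulnerable: the input is pasted into the SQL text, so a quote
    # character in `name` can rewrite the WHERE clause.
    query = "SELECT name, secret FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Parameterized: the driver binds `name` as a value, never as SQL.
    return conn.execute(
        "SELECT name, secret FROM users WHERE name = ?", (name,)
    ).fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    payload = "x' OR '1'='1"  # illustrative injection string
    print("unsafe:", find_user_unsafe(conn, payload))
    print("safe:  ", find_user_safe(conn, payload))
```

Escaping or blacklisting characters by hand is the kind of workaround that tends to miss cases; binding parameters sidesteps the question because the input never gets parsed as SQL.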

I've found GPT 4 outperforms median humans.


freejazz 982 days ago

>It can reason better than most humans put into the same situation.

On what basis do you allege this? People say the most unhinged stuff here about AI, and it so often goes completely unchallenged. This is a huge assertion that you are making.


jiggawatts 982 days ago

The equivalent of what current-gen LLMs do is an oral examination. Picture standing in the middle of a room surrounded by subject matter experts grilling you for your knowledge of various random topics. You have no tools, no calculator, no pencil and paper.

You’re asked a question and you just have to spit out the answer. No option to backtrack, experiment, or self-correct.

“Translate this to Hebrew”.

“Is this a valid criticism of this passage from a Platonic perspective?”

“Explain counterfactual determinism in Quantum Mechanics.”

“What is the cube root of 74732?”

You would fail all of these. The AI gets 3 of 4 correct.

Tell me who’s smarter?

Is it you because of your preconceptions, or because of real superiority?

Your model for human intelligence is probably more like this scene: https://youtu.be/KvMxLpce3Xw?si=Suy0Cj_pL0vru5Uj

The reality is the opposite. The AI could answer questions in this scenario but no nonfictional human could.


freejazz 982 days ago

It's just a completely baseless comparison the way you are going about it, and you are mistaking the recitation of facts for intelligence.

>“Is this a valid criticism of this passage from a Platonic perspective?”

I haven't seen AI answering questions like this correctly at all
