	by jjeaff 466 days ago
	I suspect AGI will be one of those things that you can't describe it exactly, but you'll know it when you see it.

7 comments

NitpickLawyer 466 days ago

> but you'll know it when you see it.

I agree, but with the caveat that it's getting harder and harder with all the hype / doom cycles and all the goalpost moving that's happening in this space.

IMO if you took gemini2.5 / claude / o3 and showed it to people from ten / twenty years ago, they'd say that it is unmistakably AGI.

bayarearefugee 466 days ago

There's no way to be sure in either case, but I suspect their impressions of the technology ten or twenty years ago would be not so different from my experience of first using LLMs a few years ago...

Which is to say complete amazement followed quickly by seeing all the many ways in which it absolutely falls flat on its face revealing the lack of actual thinking, which is a situation that hasn't fundamentally changed since then.

HdS84 466 days ago

Yes, that is the same feeling I have. Giving it some JSON and describing how a website should look? Super fast results and amazing capabilities. Trying to get it to translate my unit tests from Xunit to Tunit, where the latter is new and does not have a ton of blog posts? Forget about it. The process is purely mechanical and it is easy after RTFM, but it falls flat on its face.

Closi 465 days ago

Although I think if you asked people 20 years ago to describe a test for something AGI would do, they would be more likely to say “writing a poem” or “making art” than “turning Xunit code to Tunit”

IMO, if you said to someone in the 90s “well we invented something that can tell jokes, make unique art, write stories and hold engaging conversations, although we haven’t yet reached AGI because it can’t transpile code accurately - I mean it can write full applications if you give it some vague requirements, but they have to be reasonably basic, like only the sort of thing a junior dev could write in a day it can write in 20 seconds, so not AGI” they would say “of course you have invented AGI, are you insane!!!”.

LLMs to me are still a technology of pure science fiction come to life before our eyes!

Jensson 465 days ago

Tell them humans need to babysit it and double-check its answers to do anything, since it isn't as reliable as a human, and no, they wouldn't call it an AGI back then either.

The whole point about AGI is that it is general like a human. If it has such glaring weaknesses as the current AI has, it isn't AGI; it was the same back then. That an AGI can write a poem doesn't mean being able to write a poem makes it an AGI, it's just an example of something the AI couldn't do 20 years ago.

Closi 465 days ago

Why do human programmers need code review then if they are intelligent?

And why can’t expert programmers deploy code without testing it? Surely they should just be able to write it perfectly first time without errors if they were actually intelligent.

Jensson 466 days ago

> IMO if you took gemini2.5 / claude / o3 and showed it to people from ten / twenty years ago, they'd say that it is unmistakably AGI.

No they wouldn't, since those still can't replace human white collar workers even at many very basic tasks.

Once AGI is here most white collar jobs are gone, you'd only need to hire geniuses at most.

zaptrem 466 days ago

Which part of "General Intelligence" requires replacing white collar workers? A middle schooler has general intelligence (they know about and can do a lot of things across a lot of different areas) but they likely can't replace white collar workers either. IMO GPT-3 was AGI, just a pretty crappy one.

Jensson 466 days ago

> A middle schooler has general intelligence (they know about and can do a lot of things across a lot of different areas) but they likely can't replace white collar workers either.

Middle schoolers replace white collar workers all the time; it takes 10 years for them to do it, but they can do it.

No current model can do the same since they aren't able to learn over time like a middle schooler.

sebastiennight 466 days ago

Compared to someone who graduated middle school on November 30th, 2022 (2.5 years ago), would you say that today's gemini 2.5 pro has NOT gained intelligence faster?

I mean, if you're a CEO or middle manager and you have the choice of hiring this middle schooler for general office work, or today's gemini-2.5-pro, are you 100% saying the ex-middle-schooler is definitely going to give you the best bang for your buck?

Assuming you can either pay them $100k a year, or spend the $100k on gemini inference.

Jensson 466 days ago

> would you say that today's gemini 2.5 pro has NOT gained intelligence faster?

Gemini 2.5 pro the model has not gained any intelligence since it is a static model.

New models are not the models learning, it is humans creating new models. The trained models have access to all the same material and knowledge a middle schooler has as they go on to learn how to do a job, yet they fail to learn the job while the kid succeeds.

sebastiennight 466 days ago

I don't think so, and here's my simple proof:

You and I could sit behind a keyboard, role-playing as the AI in a reverse Turing test, typing away furiously at the top of our game, and if you told someone that their job is to assess our performance (thinking they're interacting with a computer), they would still conclude that we are definitely not AGI.

This is a battle that can't be won at any point because it's a matter of faith for the forever-skeptic, not facts.

Jensson 466 days ago

> I don't think so, and here's my simple proof:

That isn't a proof since you haven't run that test; it is just a thought experiment.

ben_w 466 days ago

I've been accused a few times of being an AI, even here.

(Have you not experienced being on the receiving end of such accusations? Or do I just write weird?)

I think this demonstrates the same point.

Jensson 466 days ago

> Have you not experienced being on the receiving end of such accusations?

No, I have not been accused of being an AI. I have seen people who format their texts get accused due to the formatting sometimes, and thought people could accuse me for the same reason, but that doesn't count.

> I think this demonstrates the same point.

You can't detect general intelligence from a single message, so it doesn't really. People accuse you of being an AI based on the structure and word usage of your message, not the content of it.

ben_w 466 days ago

> People accuse you of being an AI based on the structure and word usage of your message, not the content of it.

If that's the real cause, it is not the reason they give when making the accusation. Sometimes they object to the citations, sometimes the absence of them.

But it's fairly irrelevant, as they are, in fact, saying that real flesh-and-blood me doesn't pass their purity test for thinking.

Is that because they're not thinking? Doesn't matter — as @sebastiennight said: "This is a battle that can't be won at any point because it's a matter of faith for the forever-skeptic, not facts."

mac-mc 466 days ago

When it can replace a polite, diligent, experienced 120 IQ human in all tasks. So it has a consistent long-term narrative memory, doesn't "lose the plot" as you interact longer and longer with it, can pilot robots to do physical labor without much instruction (the current state of the art is not that; a trained human will still do much better), can drive cars, can generate images without goofy non-human style errors, etc.

NitpickLawyer 466 days ago

> experienced 120 IQ human in all tasks.

Well, that's the 91st percentile already. I know the terms are hazy, but that seems closer to ASI than AGI from that perspective, no?

I think I do agree with you on the other points.
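
As a rough check of that percentile figure, here is a minimal sketch. It assumes IQ scores follow a normal distribution with mean 100 and standard deviation 15, which is a common convention but not something stated in this thread; the helper name `iq_percentile` is made up for illustration.

```python
from statistics import NormalDist

# Assumption: IQ is modeled as Normal(mean=100, sd=15).
# This scaling is a common convention, not something given in the thread.
IQ_DISTRIBUTION = NormalDist(mu=100, sigma=15)


def iq_percentile(iq: float) -> float:
    """Percentage of the modeled population scoring at or below `iq`."""
    return 100 * IQ_DISTRIBUTION.cdf(iq)


if __name__ == "__main__":
    for score in (85, 100, 120):
        print(f"IQ {score}: about {iq_percentile(score):.1f}% score at or below")
```

Under that assumption, IQ 120 lands at roughly the 91st percentile, so the figure above holds up as a ballpark.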

ben_w 466 days ago

Indeed, on both. Even IQ 85 would make a painful dent in the economy via unemployment statistics. But the AI we have now is spiky, in ways that make it trip up over mistakes even slightly below average humans would not make, even though it can also do Maths Olympiad puzzles, the bar exam, leetcode, etc.

mac-mc 466 days ago

The emotional way that humans think when buying products is similarly unfair. Only the 90th percentile is truly 'satisfactory'. The implied question is when would Joe Average and everyone else stop moving the goal posts to the question, "Do we have AI yet"?

ASI is, by definition, Superintelligence, which means it is beyond practical human IQ capacity. So something like 200 IQ.

Again, you might call it 'unfair', but that's when it will also stop having goal posts being moved; otherwise, Joe Midwit will call it 'it's only as smart as some smart dudes I know'.

torginus 466 days ago

I still can't have an earnest conversation or bounce ideas off of any LLM - all of them seem to be a cross between a sentient encyclopedia and a constraint solver.

They might get more powerful but I feel like they're still missing something.

itchyjunk 466 days ago

Why are you not able to have an earnest conversation with an LLM? What kind of ideas are you not able to bounce off LLMs? These seem to be the type of use cases where LLMs have generally shined for me.

9dev 466 days ago

Eh, I am torn on this. I had some great conversations on random questions or conceptual ideas, but also some where the model's instructions shone through way too clearly. Like, when you ask something like "I’m working on the architecture of this system, can you let me know what you think and if there’s anything obvious to improve on?": the model will always a) flatter me for my amazing concept, b) point out the especially laudable parts of it, and c) name a few obvious but not-really-relevant parts (e.g. "always be careful with secrets and passwords"). However, it will not actually point out higher level design improvements, or alternative solutions. It’s always just regurgitating what I’ve told it about. That is semi-useful, most of the time.

john_minsk 466 days ago

Because it spits out the most probable answer, which is based on endless copycat articles online written by marketers for C-level decision makers to sell their software.

AI doesn't go and read a book on best practices, come back saying "Now I know Kung Fu of Software Implementation", and then think critically, looking at your plan step by step, and provide an answer. These systems, for now, don't work like that.

Would you disagree?

9dev 466 days ago

How come we’re discussing if they’re artificial general intelligence then?

Jensson 466 days ago

Because some believe that to be intelligence while others believe it requires more than that.

int_19h 465 days ago

The "meaningless praise" part is basically American cultural norms trained into the model via RLHF. It can be largely negated with careful prompting, though.

HDThoreaun 466 days ago

I felt this way until I tried gemini 2.5. Imo it fully passes the Turing test unless you're specifically utilizing tricks that LLMs are known to fall for.

ninetyninenine 466 days ago

I suspect everyone will call it a stochastic parrot because it got this one thing wrong. And this will continue into the far, far future; even when it becomes sentient, we will completely miss it.

AstralStorm 466 days ago

It's more than that but less than intelligence.

Its generalization capabilities are a bit on the low side, and memory is relatively bad. But it is much more than just a parrot now: it can handle some basic logic, but cannot follow given patterns correctly for novel problems.

I'd liken it to something like a bird, extremely good at specialized tasks but failing a lot of common ones unless repeatedly shown the solution. It's not a corvid or a parrot yet. Fails rather badly at detour tests.

It might be sentient already though. Someone needs to run a test if it can discern itself and another instance of itself in its own work.

Jensson 466 days ago

> It might be sentient already though. Someone needs to run a test if it can discern itself and another instance of itself in its own work.

It doesn't have any memory, how could it tell itself from a clone of itself?

ben_w 465 days ago

People already share viral clips of AI recognising other AI, but I've not seen any real scientific study of whether this is due to a literary form of passing a mirror test, or if it's related to the way most models openly tell everyone they talk to that they're an AI.

As for "how", note that memory isn't one single thing even in humans: https://en.wikipedia.org/wiki/Memory

I don't want to say any of these are exactly equivalent to any given aspect of human memory, but I would suggest that LLMs behave kinda like they have:

(1) Sensory memory in the form of a context window — and in this sense are wildly superhuman because for a human that's about one second, whereas an AI's context window is about as much text as a human goes through in a week (actually less because we don't only read, other sensory modalities do matter; but for scale: equivalent to what you read in a week)

(2) Short term memory in the form of attention heads — and in this sense are wildly superhuman, because humans pay attention to only 4–5 items whereas DeepSeek v3 defaults to 128.

(3) The training and fine-tuning process itself that allows these models to learn how to communicate with us. Not sure what that would count as. Learned skill? Operant conditioning? Long term memory? It can clearly pick up different writing styles, because it can be made to controllably output in different styles — but that's an "in principle" answer. None of Claude 3.7, o4-mini, DeepSeek r1, could actually identify the authorship of a (n=1) test passage I asked 4o to generate for me.

AstralStorm 466 days ago

Similarity match. For that you need to understand reflexively how you think and write.

It's a fun test to give a person something they have written but do not remember. Most people can still spot it.

It's easier with images though. Especially a mirror. For DallE, the test would be if it can discern its own work from a human-generated image. Especially if you give it an imaginative task like drawing a representation of itself.

butlike 465 days ago

It doesn't have any memory _you're aware of_. A semiconductor can hold state, so it has memory.

ninetyninenine 466 days ago

An LLM is arguably more "intelligent" than people with an IQ of less than 80.

If we call people with an IQ of less than 80 an intelligent life form, why can't we call an LLM that?

Jensson 466 days ago

Once it has pushed most humans out of white collar labor, so that the remaining humans work in blue collar jobs, people won't say it's just a stochastic parrot.

myk9001 466 days ago

Maybe, maybe not. The power loom pushed a lot of humans out of textile factory jobs, yet no one claims the power loom is AGI.

Jensson 466 days ago

Not a lot, I mean basically everyone, to the point where most companies don't need to pay humans to think anymore.

myk9001 466 days ago

Well, I'm too lazy to look up how many weavers were displaced back then and that's why I said a lot. Maybe all of them, since they weren't trained to operate the new machines.

Anyway, sorry for a digression, my point is LLM replacing white collar workers doesn't necessarily imply it's generally intelligent -- it may but doesn't have to be.

Although if it gets to a point where companies are running dark office buildings (by analogy with dark factories) -- yes, it's AGI by then.

jimbokun 466 days ago

Or become shocked to realize humans are basically statistical parrots too.

butlike 465 days ago

The blue collar jobs are more entertaining anyways, provided you take the monetary inequality away.

Tastes differ.

This is actually how a Supreme Court justice defined the test for obscenity.

> The phrase "I know it when I see it" was used in 1964 by United States Supreme Court Justice Potter Stewart to describe his threshold test for obscenity in Jacobellis v. Ohio

sweetjuly 466 days ago

The reason why it's so famous though (and why some people tend to use it in a tongue in cheek manner) is because "you know it when you see it" is a hilariously unhelpful and capricious threshold, especially when coming from the Supreme Court. For rights which are so vital to the fabric of the country, the Supreme Court recommending we hinge free speech on—essentially—unquantifiable vibes is equal parts bizarre and out of character.

DesiLurker 466 days ago

my 2c on this is that if you interact with any current llm enough you can mentally 'place' its behavior and responses. when we truly have AGI+/ASI my guess is that it will be like that old adage of blind men feeling & describing an elephant for the first time. we just won't be able to fully understand its responses. there would always be something left hanging and then eventually we'll just stop trying. that would be the time when the exponential improvement really kicks in.

it should suffice to say we are nowhere near that and I don't even believe LLMs are the right architecture for that.

afro88 466 days ago

This is part of what the article is about

jimbokun 466 days ago

We have all seen it and are now just in severe denial.
