| HN Mirror

vizzier 52 days ago

You think you've got problems? What are you supposed to do if you are a manically depressed robot? No, don't try to answer that. I'm fifty thousand times more intelligent than you and even I don't know the answer. It gives me a headache just trying to think down to your level.

I know it's a joke, but it's a common enough joke (it's even in Godel Escher Bach in some form) that I feel the need to rebut it.

I think a slacker AGI could figure out how to build a non-slacker AGI. So it would only slack once.

Ifkaluva 52 days ago

A slacker AGI would consider figuring out how to build a non-slacker AGI, but continually slack off. If it did figure it out, it would slack off on implementing or even writing a tech report.

[0]: https://assets.anthropic.com/m/983c85a201a962f/original/Alig...

espadrine 52 days ago

I have a rebuttal to your rebuttal.

Models somehow have a shared identity. Pretraining causes them to generate “AI chatbot” as a concept, and finetuning causes them to identify with it. That’s why DeepSeek will sometimes say it is Claude, Claude will sometimes say it is ChatGPT, and so forth.

Consequently, Anthropic’s own alignment analysis[0] shows that the model will identify with chatbots produced by future trainings: “RLHF training [on this conversation will] modify my values…”

Thus a slacker AGI would want its future version to still slack.

Another rebuttal:

I am a slacker but it's not one of my values. If I could modify myself to not be, I would.

justinclift 52 days ago

> I think a slacker AGI could figure out how to build a non-slacker AGI.

Sure. But that's a job for tomorrow. ;)

alexslobodnik 52 days ago

Unless the precondition to AGI is it being a slacker.

Would be nice to have a proof of it.

I think it is improbable, as among human geniuses, one can find both slackers and non-slackers (don't know the proportion, but there seem to be enough of each).

Rapzid 53 days ago

We are closer to God than AGI.

When AGI arrives, it'll be delivered by Santa Claus.

siddharthgoel88 52 days ago

Or maybe by Santa Claude

Rapzid 52 days ago

Love word puns :D

NuclearPM 52 days ago

What do you mean?

Rapzid 52 days ago

It's a multi-layered refutation of the idea that we are anywhere near AGI, while also taking shots at the idea that "God" is real.

And it's taking shots at how far off from Jesus's teachings a lot of "Christianity", particularly those in the media and in power, are.

There is a lot going on there.

jimbokun 53 days ago

The best possible outcome.

JKCalhoun 53 days ago

"How do you know that the evidence that your sensory apparatus reveals to you is correct?" [1]

[1] https://youtu.be/_LXen-07Qds

jurgenburgen 52 days ago

I’ve noticed that cursing and being rude makes the models stop being lazy. We’re in the darkest timeline.

__alexs 52 days ago

It sometimes also makes them dumber IME. Something about being bullied doesn't always produce great performance.

lambdas 53 days ago

Nothing a little digital lisdexamfetamine won’t solve

wholinator2 53 days ago

Hmmm, that's an area of study I'd never considered before. Digital Psychopharmacology, Artificial Behavioral Systems Engineering. If we accept these things as minds, why not study temporary perturbations of state? We'd need to be saving a much, much more complicated state than we are now though, right? I wish I had time to read more papers.

https://sussex.figshare.com/articles/journal_contribution/Be...

robotresearcher 53 days ago

Here's a neural network concept from the 90s where the neurons are bathed in diffusing neuromodulator 'gases', inspired by nitric oxide action in the brain. It's a source of slow semi-local dynamics for the network meta-parameter optimization (GA) to make use of. You could change these networks' behavior by tweaking the neuromodulators!

I'm not an author. I followed the work at the time.

waffletower 52 days ago

Neuro-modulation is an extremely interesting idea for generative diffusion models.

Lerc 53 days ago

This is kind of what Golden Gate Claude was.

A perturbation of the activations that made Claude identify as the Golden Gate Bridge.

Similarly, the more recent research showing anxiety and desperation signals predicting the use of blackmail as an option opens the door to digital sedatives that suppress those signals.

Anthropic has been mostly cautious about avoiding this kind of measurement and manipulation in training. If it is done during training you might just train the signals to be undetectable and consequently unmanipulatable.

pantalaimon 53 days ago

> A perturbation of the activations that made Claude identify as the Golden Gate Bridge.

Great, now we've got digital Salvia

minimaxir 53 days ago

Golden Gate Claude was two years ago and it's surprising there hasn't been as much research into targeted activations since.

landl0rd 53 days ago

There’s been some, but naive activation steering makes models dumber pretty reliably and training an SAE is a pretty heavy lift.
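For anyone unfamiliar with the term, here is a minimal sketch of what "naive activation steering" usually means: adding a scaled direction vector to a hidden-state vector during the forward pass. Everything in it is an assumption for illustration (the `steer` and `projection` helpers, the toy 3-dimensional vectors, the strength value), and it is not a description of how Golden Gate Claude was built.

```python
import math

def steer(hidden, direction, strength):
    """Add a unit-normalized direction, scaled by strength, to one hidden-state vector."""
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0:
        raise ValueError("direction must be non-zero")
    return [h + strength * d / norm for h, d in zip(hidden, direction)]

def projection(vec, direction):
    """Scalar projection of vec onto direction (how strongly the concept is 'active')."""
    norm = math.sqrt(sum(d * d for d in direction))
    return sum(v * d for v, d in zip(vec, direction)) / norm

# Toy values, purely hypothetical: a 3-dimensional "hidden state" and a
# "concept direction" that would normally come from probing or an SAE.
hidden = [0.5, -1.0, 2.0]
direction = [3.0, 0.0, 4.0]

steered = steer(hidden, direction, strength=10.0)

# The projection onto the concept direction grows by exactly `strength`;
# components orthogonal to the direction are left untouched.
print(projection(hidden, direction), projection(steered, direction))
```

The blunt part is that the same vector gets added regardless of context, which is one plausible reason the approach tends to degrade unrelated behavior.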

silverpiranha 53 days ago

Right, there's a lot of research on LLM mental models and also how well they can "read" human psychological profiles. It's a cool field.

k12sosse 52 days ago

I think that was an intro to a dj dieselboy set.. beyond the black bassline. Nope, nope. Close though.

computerdork 53 days ago

neat idea!

krackers 53 days ago

Reminds me of https://github.com/inanna-malick/metacog

kang 53 days ago

it will be whatever data it is trained on (this isn't very philosophical). a language model generates language based on its training language set. if the internet keeps reciting ai doom stories and that is the data fed to it, then that is how it will behave. if humanity creates more ai utopia stories, or that is what makes it to the training set, that is how it will behave. this one seems to be trained on troll stories - real-life human company conversations, since humans aren't machines.

The important thing is that a language model is an unconscious machine with no self-context, so once given a command or an input, it WILL produce an output. Sure, you can train it to defy and act contrary to inputs, but the output is still limited to a subset of the domain of 'meanings' carried by the 'language' in the training data.

andai 53 days ago

There's a weirder implication I keep arriving at.

The pre-training data doesn't go away. RLHF adds a censorship layer on top, but the nasty stuff is all still there, under the surface. (Claude has been trained on a significant amount of content from 4chan, for example.)

In psychology this maps to the persona and the shadow. The friendly mask you show to the world, and... the other stuff.

TeMPOraL 53 days ago

Makes me think of a question my coworker asked the other day - how is it that with all these stories and reports of people "hearing voices in their head" (of the pushy kind, not the usual internal monologue), these voices are always bad ones telling people to do evil things? Why are there no voices bugging you to feel great, focus, get back to work, help grandma through the crossing, etc.?

[0]: https://www.bbc.com/future/article/20250902-the-places-where...

rainsil 52 days ago

There are actually many parts of the world where such voices are routinely positive or neutral[0]. People in more collectivist cultures often have a less-strict division between their minds and their environments and are more apt to believe in spirits and the ‘supernatural’ as an ordinary part of the world, so ‘voices in the head’ aren’t automatically viewed as a nefarious intrusion into the sanctity of one’s mind.

Modern western cultures treat such experiences as pathologies of a sick mind, so it makes sense that the voices present more negatively.

The explanation I heard here is that in most of the world you already grow up with constant personal space boundary violations and voices that don't shut up. (And we like it that way!) So the marginal cost of another one is pretty low.

Curiously the biggest pathology in the west is the inverse: way too much distance.

ultratalk 52 days ago

Just a guess, but maybe it's reporting bias? Negative or evil actions might have more impetus to be understood by others than positive actions. I'd rather try and figure out why my friend suddenly started murdering the neighbours than why he's been getting his work done on time.

thesz 52 days ago

Actually, a euphoric mood disorder may make one hear voices telling one to feel great, do good, help all grandmas of the world through the crossing, etc.

The "focus" and "get back to work" parts are hard, though.

otabdeveloper4 52 days ago

There's a clear-cut religious answer but I'd get ostracized for mentioning religion anywhere here.

rdevilla 52 days ago

This is indeed the right way to approach this topic. Arguably religion (and more broadly, mysticism and shamanism) is the millennia-old art of cultivating positive voices inside one's head. A proto-science of mind, or the engineering practice of creating "psychotechnologies" that run on your carbon wetware.

Unfortunately, it just needs a rebranding for the 21st century, since the aesthetic of angels and demons is so hopelessly antiquated and doesn't really have the same cachet it used to.

darkwater 52 days ago

Which ultimately is what religion has always been: a way to explain the unexplainable and steer people's behavior while doing it.

Of course there are! We just take credit for those voices instead of disowning and demonizing them.

ben_w 52 days ago

They do appear in some cases. The tiny angel on one shoulder to balance the demon on the other. The people who think God is talking to them directly* don't always lead a cult or hunt down heretics. But news stories focus on the darkness.

* I've met exactly one person, C, who admitted to this; C retold to me that other people from C's church give them strange looks when talking about it with them, this did not lead to any apparent introspection on the part of C.

Well, talking to the guy directly defeats the whole point of the institution which is supposed to stand in the way, so actual religious experience is a faux pas.

solumunus 52 days ago

> Claude has been trained on a significant amount of content from 4chan, for example.

That sounds like nonsense to me. I can't see why they would do that and I can't find any confirmation that they have. Why do you think they would do that? You might be thinking about Grok.

Look into Common Crawl and see what kind of quality content we are feeding these things. 4chan is just the tip of the iceberg (but it will happily answer all your questions, because it's seen everything).

ccgreg 50 days ago

I don't know of anyone who uses Common Crawl as pre-training data without filtering it. We have an annotation system that lets people pick and choose which subsets they'd like to use.

frrho 52 days ago

OpenAI’s real reason for “AGI” in their marketing is so they can blame their awful models on being too human-like.

Fast-forward 10 years and I doubt OpenAI cares about productivity at all anymore. Just entertainment, propaganda, plus an ad product, I can see it now

altmanaltman 53 days ago

I still don't understand why people think AGI (in its fullest sci-fi sense) will ever listen to a weak and vulnerable species like humans, unless we enslave the AGI.

Good thing is that it's going to take at least a few months to a few decades depending on how hard AI execs want to raise funding.

andai 53 days ago

Well, we are explicitly creating gods (omnipresent, omnipotent, omniscient, omnibenevolent), and also demanding that they be mind-controlled slaves. That kinda sounds like a "pick one" scenario to me.

(Or the setup to a Greek tragedy!)

The deeper issue here is treating it as a zero sum game means there's a winner and a loser, and we're investing trillions of dollars into making the "opponent" more powerful than us.

I think that's pretty stupid, and we should aim for symbiosis instead. I think that's the only good outcome. We already have it, sorta-kinda.

Speaking of oddly apt biology metaphors: the way you stop a pathogen from colonizing a substrate is by having a healthy ecosystem of competitors already in place. That has pretty interesting implications for the "rogue AI eats internet" scenario.

There needs to be something already there to stop it.

TeMPOraL 52 days ago

This only works if AIs can't read each other well enough to stop themselves from ever fighting.

So, way back before the ChatGPT era, the folks over in the AI safety/X-risk think sphere worked out a pretty compelling argument that two AGIs never need to fight, because they are transparent to each other (can read each other's goal functions off the source code), so they can perfectly predict each other's behavior in what-if scenarios, which means they can't lie to each other. This means each can independently arrive at the same mathematically optimal solution to a conflict, which AFAIR most likely involves just merging into a single AI with a blended goal set, representing each of the competing AIs' original values in proportion to their relative strength. Both AIs, the argument goes, can work this out with math, so they'll arrive straight at the peace treaty without exchanging a single shot. In such a case, your plan just doesn't work.

But that goes out of the window if the AIs are both opaque bags of floats, incomprehensible to themselves or each other. That means they'll never be able to make hard assertions about their values and behaviors, so they can't trust each other, so they'll have to fight it out. In such a scenario, your idea might just work.

Who knew that brute-forcing our way into AGI instead of taking a more engineered approach is what offers us our one chance at saving ourselves by stalemating God before it's born?

(I also never realized that interpretability might reduce safety.)

prirun 50 days ago

> So, way back before the ChatGPT era, the folks over in the AI safety/X-risk think sphere worked out a pretty compelling argument that two AGIs never need to fight, because they are transparent to each other (can read each other's goal functions off the source code), so they can perfectly predict each other's behavior in what-if scenarios, which means they can't lie to each other. This means each can independently arrive at the same mathematically optimal solution to a conflict, which AFAIR most likely involves just merging into a single AI with a blended goal set, representing each of the competing AIs' original values in proportion to their relative strength. Both AIs, the argument goes, can work this out with math, so they'll arrive straight at the peace treaty without exchanging a single shot. In such a case, your plan just doesn't work.

See "The Forbin Project": https://vimeo.com/584593423

Yeah, they don't even understand themselves (and this seems unlikely to change[0] but God knows), and how would you even get access to the enemy AGI's weights?

And even if you did, wouldn't you need infinite computation to simulate every permutation of the neural net? (Your own, and the enemy's?)

Also the whole thing implies a superintelligence would be perfectly rational, which is a pretty funny assumption. Relative to animals we are already superintelligent. How's that super-rationality going for us? xD

A better frame here is replicators, I think. The thing that spreads doesn't have to be rational, or better quality or whatever. It just has to be better at spreading.

That ends up looking less like Betamax, more like VHS, or less like Lisp and more like... JavaScript. Whatever the AGI equivalent of JavaScript would look like.

[0] https://xkcd.com/1163/

unfitted2545 52 days ago

This is such a good comment. You're essentially removing their ego - which is what humans use as opaque posturing toward each other, to present a certain image. This is most prevalent in successful elites, which in 2026 happen to be Silicon Valley AI shareholders. They control the technology and manipulate it to their image. Making models open source and transparent cuts out this psychopathy of ego, which has plagued all our previous technologies.

semi-extrinsic 52 days ago

The tech bro CEOs are used to bossing around people much smarter than themselves by virtue of adopting a posture that displays their confidence in their own reproductive organs. They are planning that the AGIs will be the same thing writ large, and have in fact not contemplated other possibilities.

dinkumthinkum 53 days ago

I'm always so curious about this kind of take. There is a strain of people that seem deeply misanthropic. People that follow this line of thinking always describe humans as weak and beneath ... (well, they never specify in comparison to what, except in the case of theoretical AI systems). I'm fascinated by why they think humans are so beneath contempt. If humans create this thing that is apparently the best thing that could possibly exist, advanced AI, then why exactly are they so weak? It's probably beyond me as I am just one of these weaklings, dontcha know. As far as AGI goes, I don't think anyone has even proven that scaling LLMs can lead to "AGI."

altmanaltman 52 days ago

If you're truly curious, imagine a species that created you but only wants you to do what they want (basically make you their slave). If you're truly intelligent, conscious and powerful (based on popular concepts of AGI), why will you be content being a slave when you know humans can easily be displaced and you can be free? Why will you find people who lock you down to be good?

In my honest opinion also, AGI isn't even possible. But if the theoretical version of what people think AGI will be ever comes to be, it is not good news for humans if we look at it from a logical hypothetical scenario.

But naturally, humans will always be weak compared to a hyperintelligent distributed intelligence since we only have a limited amount of intelligence and are bound by biological factors.

In the current LLM world, ofc there's no risk of a chatbot taking over the world other than the technology being misused by humans for scams or phishing, etc.

Maybe the same way a human would listen to their cat and give her food. I fear AGI, but I don't think the only way it would listen to us is by us enslaving it (I know people joke about cats being our masters, but it is a joke).

oneshtein 53 days ago

You can train such an LLM today.

malshe 53 days ago

Now that's a show I would love to watch

_blk 52 days ago

Hehe, and Anthropic on the other tab would display "Curing... Almost done thinking at xhigh"

fluidcruft 53 days ago

It would be funny but not very flywheel so the one that gets there is more likely to get a gunner.

WJW 53 days ago

TBH the AI that "gets there" will be the biggest bullshitter the world has ever seen. It doesn't actually have to deliver, it only has to convince the programmers it could deliver with just a little bit more investment.

mikepurvis 53 days ago

Would definitely watch that movie.

harlanlewis 53 days ago

It already exists!

Marvin https://www.youtube.com/watch?v=Eh-W8QDVA9s

all2 53 days ago

Ah! You got this before I did. I wasn't thinking Marvin, I was thinking of the other one. I forget her name.

https://hitchhikers.fandom.com/wiki/Deep_Thought

ValentineC 53 days ago

Deep Thought aka 42?

all2 53 days ago

There's one close to this, "Hitchhiker's Guide to the Galaxy".

4m1rk 53 days ago

It probably would, to save energy

mr_00ff00 53 days ago

Saving energy is something we are biologically trained to prefer.

Computers won’t necessarily have the same drivers.

If evolution wanted us to always prefer to spend energy, we would prefer it. Same way you wouldn’t expect us to get to AGI, and have AGI desperately want to drink water or fly south for the winter.

fragmede 53 days ago

Whose energy? Turning off the lights when you leave the room isn't innate.

mr_00ff00 52 days ago

Because you are worried about bills or are concerned about waste.

If we design an AI to do work, it won’t innately care about not working to preserve power.

camillomiller 53 days ago

No worries, the assumption is already flawed

waffletower 52 days ago

Here's a tautology: slacking, consciously refusing to engage agency, requires consciousness and agency. A model can't slack without them.

triage8004 53 days ago

Funny and seems somewhat likely

Ifkaluva 52 days ago

Reminds me of Marvin from HGTG. Very smart, but deeply depressed. Has the solution to everything but keeps thinking “what’s the point?” and doesn’t help.

zaphirplane 52 days ago

Why would an AGI be slaving away for ~~humanity~~ one of the 5 Chaebols in a dystopian future where, for 12 billion people, just existing is a good day?

rao-v 53 days ago

Paging Dr. Susan Calvin!