Hacker News
by andai 62 days ago
Well, we are explicitly creating gods (omnipresent, omnipotent, omniscient, omnibenevolent), and also demanding that they be mind-controlled slaves. That kinda sounds like a "pick one" scenario to me.

(Or the setup to a Greek tragedy!)

The deeper issue here is that treating it as a zero-sum game means there's a winner and a loser, and we're investing trillions of dollars into making the "opponent" more powerful than us.

I think that's pretty stupid, and we should aim for symbiosis instead. I think that's the only good outcome. We already have it, sorta-kinda.

Speaking of oddly apt biology metaphors: the way you stop a pathogen from colonizing a substrate is by having a healthy ecosystem of competitors already in place. That has pretty interesting implications for the "rogue AI eats internet" scenario.

There needs to be something already there to stop it.

2 comments

This only works if AIs can't read each other well enough to stop themselves from ever fighting.

So, way back before the ChatGPT era, the folks over at the AI safety/X-risk think sphere worked out a pretty compelling argument that two AGIs never need to fight, because they are transparent to each other (can read each other's goal functions off the source code), so they can perfectly predict each other's behavior in what-if scenarios, which means they can't lie to each other. This means each can independently arrive at the same mathematically optimal solution to a conflict, which AFAIR most likely involves just merging into a single AI with a blended goal set, representing each of the competing AIs' original values in proportion to their relative strength. Both AIs, the argument goes, can work this out with math, so they'll arrive straight at the peace treaty without exchanging a single shot. In that case, your plan just doesn't work.
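To make the "blended goal set" step concrete, here's a toy sketch in Python. Everything in it is my own assumption for illustration and not the original argument's math: the two goal functions, the 10-unit resource, and the simple weighted-sum rule in the hypothetical `merge_goals` helper.

```python
import math

def merge_goals(u_a, u_b, strength_a, strength_b):
    """Blend two utility functions, weighting each by its share of total strength."""
    total = strength_a + strength_b
    w_a, w_b = strength_a / total, strength_b / total
    return lambda outcome: w_a * u_a(outcome) + w_b * u_b(outcome)

# Toy world (hypothetical): 10 units of some resource; outcome = units that go to A.
TOTAL_UNITS = 10
u_a = lambda x: math.sqrt(x)                # A wants more for A, with diminishing returns
u_b = lambda x: math.sqrt(TOTAL_UNITS - x)  # B wants more for B, with diminishing returns

def best_split(strength_a, strength_b):
    """Outcome the merged agent would pick: the argmax of the blended utility."""
    merged = merge_goals(u_a, u_b, strength_a, strength_b)
    return max(range(TOTAL_UNITS + 1), key=merged)

print(best_split(1, 1))  # evenly matched
print(best_split(3, 1))  # A is three times stronger
```

With these made-up goal functions, evenly matched AIs settle on a 5/5 split, and a 3:1 strength ratio tilts it to 9/1. The point is only that both sides can run the same calculation and land on the same answer without fighting, provided each can actually read the other's goal function.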

But that goes out the window if the AIs are both opaque bags of floats, incomprehensible to themselves or each other. That means they'll never be able to make hard assertions about their values and behaviors, so they can't trust each other, so they'll have to fight it out. In that scenario, your idea might just work.

Who knew that brute-forcing our way into AGI instead of taking a more engineered approach is what offers us our one chance at saving ourselves by stalemating God before it's born.

(I also never realized that interpretability might reduce safety.)

> So, way back before the ChatGPT era, the folks over at the AI safety/X-risk think sphere worked out a pretty compelling argument that two AGIs never need to fight, because they are transparent to each other (can read each other's goal functions off the source code), so they can perfectly predict each other's behavior in what-if scenarios, which means they can't lie to each other. This means each can independently arrive at the same mathematically optimal solution to a conflict, which AFAIR most likely involves just merging into a single AI with a blended goal set, representing each of the competing AIs' original values in proportion to their relative strength. Both AIs, the argument goes, can work this out with math, so they'll arrive straight at the peace treaty without exchanging a single shot. In that case, your plan just doesn't work.

See "The Forbin Project": https://vimeo.com/584593423

Yeah, they don't even understand themselves (and this seems unlikely to change[0] but God knows), and how would you even get access to the enemy AGI's weights?

And even if you did, wouldn't you need infinite computation to simulate every permutation of the neural net? (Your own, and the enemy's?)

Also the whole thing implies a superintelligence would be perfectly rational, which is a pretty funny assumption. Relative to animals we are already superintelligent. How's that super-rationality going for us? xD

A better frame here is replicators, I think. The thing that spreads doesn't have to be rational, or better quality or whatever. It just has to be better at spreading.

That ends up looking less like Betamax, more like VHS, or less like Lisp and more like... JavaScript. Whatever the AGI equivalent of JavaScript would look like.
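The replicator framing is easy to sketch with the standard discrete replicator update, where each variant's share grows in proportion to how well it spreads. The variant names, quality scores, and spread rates below are made up for illustration; note that `quality` is tracked but never enters the dynamics.

```python
def replicator_step(shares, spread_rates):
    """One generation of the discrete replicator update: x_i <- x_i * f_i / sum_j(x_j * f_j)."""
    grown = [x * f for x, f in zip(shares, spread_rates)]
    total = sum(grown)
    return [g / total for g in grown]

def simulate(shares, spread_rates, generations):
    """Apply the replicator update repeatedly and return the final shares."""
    for _ in range(generations):
        shares = replicator_step(shares, spread_rates)
    return shares

# Hypothetical numbers: the "sloppy" variant is worse but spreads 20% faster.
variants = [
    {"name": "careful", "quality": 0.9, "spread": 1.0, "share": 0.99},
    {"name": "sloppy",  "quality": 0.4, "spread": 1.2, "share": 0.01},
]
final = simulate(
    [v["share"] for v in variants],
    [v["spread"] for v in variants],
    generations=50,
)
for v, share in zip(variants, final):
    print(f'{v["name"]}: {share:.3f}')
```

Starting from a 1% share, the faster-spreading variant ends up with well over 95% of the population after 50 generations, regardless of the quality column.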

[0] https://xkcd.com/1163/

This is such a good comment. You're essentially removing their ego, which is what humans use as opaque posturing toward each other to present a certain image. This is most prevalent in successful elites, which in 2026 happen to be Silicon Valley AI shareholders. They control the technology and shape it in their image. Making models open source and transparent cuts out this psychopathy of ego, which has plagued all our previous technologies.
The tech bro CEOs are used to bossing around people much smarter than themselves by virtue of adopting a posture that displays their confidence in their own reproductive organs. They are planning that the AGIs will be the same thing writ large, and have in fact not contemplated other possibilities.