by rl3 4180 days ago
For the unfamiliar, this is essentially the line of thinking behind Roko's basilisk.[0]

While a mature superintelligence certainly could consign the human race to a fate of eternal suffering, the likelihood it would actually do this while sparing certain individuals in return for their assistance is infinitesimal.

Therefore, helping bring a superintelligence into existence on this basis is absurd.

Of course, it is possible to think of such collaboration as "rational" in an extremely selfish and perverse way, and only because the potential downside risk is unbounded (i.e. eternal suffering). However, anyone who genuinely subscribes to such a justification would have to be both a sociopath and a card-carrying member of the LessWrong rationality cult.

More realistic scenarios for a malicious superintelligence coming into existence might include:

a) Its creators explicitly imbue it with malicious goals or values.

b) The architecture used is neuromorphic[1] in nature. In humans, sanity is already an extremely fragile thing.

c) Plain old bad luck.

---

[0] http://rationalwiki.org/wiki/Roko%27s_basilisk

[1] http://wiki.lesswrong.com/wiki/Neuromorphic_AI

2 comments

> However, anyone who genuinely subscribes to such a justification would have to be both a sociopath and a card-carrying member of the LessWrong rationality cult.

At the risk of sounding like a sociopathic LessWrong cult apologist (not carrying a card, unfortunately), you're totally misrepresenting LessWrong, the people who participate in that community, and their attitude towards Roko's basilisk and unbounded risk situations. Ain't helpful.

What I said was meant to be taken in proper context.

The parent I was replying to was concerned about humans with perverse motivation working to aid a hostile AI takeover. Not as some sort of abstract thought experiment, but literally.

The statement you quoted was a means of countering that, in a literal sense. As in, "who would realistically do such a thing?"

When I was referring to the rationality cult, I did not mean the LessWrong community as a whole, but a small subset that fanatically applies the principles of rationality to their daily lives. Admittedly I could have worded it better.

Also, it was not my intent to imply said people were sociopaths.

The point again is that whether or not the likelihood of a mature superintelligence doing so is infinitesimally small (and frankly, we can't know that, but see also below) is irrelevant. What actually affects us is how many people may come to believe that this may be true, and adjust the way they act as a response.

But you're already changing the argument when assuming a mature super-intelligence. All that is necessary to posit for someone to be concerned about the torture aspect is any set of entities (doesn't even need to be intelligent, though it may take a super-intelligence to create the entities in question) sufficiently capable to run an ancestor simulation of the kind described by the simulation argument, that is willing to use torture, and that is prepared to run enough ancestor simulations to offset "good" simulations.

And the thing with this is that it does not assume a malicious AI even as the ultimate instigator per se. An indifferent AI that simply doesn't care about the contents of a simulation, or is sufficiently removed to not even know about them, might be sufficient: for example, one that does simulation runs to understand the possible paths the development of AI could have taken, or one that experiments with variations of itself and simply doesn't care that some broken version spawns large numbers of ancestor simulations and plays with the contents in ways that massively skew the odds in "favour" of bad outcomes.

But the point is we don't know. And not knowing gives ample room for someone to decide on values that make it rational for them to act in ways that may make our odds worse.

This is further an exercise in long term statistics: It doesn't matter what the likely first AI will do. It matters what the balance of outcomes of the sum total of simulation runs that will ever exist until the end of the universe will be (regardless of who creates them or how). And if said simulations are sufficiently powerful, that may even apply recursively (imagine a single "rogue" AI playing with the ancestor simulation equivalent of a fork() bomb with added torture, with access to sufficient resources).

If you believe the total balance of simulation runs that you could plausibly be in will be ones run by some set of parameters where nasty things will happen if you don't act in a way leading towards an AI takeover, then you might want to act accordingly.

Note that Roko's Basilisk even posits a Friendly AI. That is, one that is imbued with respect for human values, but that may simply consider "threatening the past" with torture a necessity in order to ensure it is created as a moral imperative to minimize human suffering. Note that such a Friendly AI would have reason to carry out this threat even if it believes it exists and so has been created, because it will not know whether it is a "full version" running in reality, or a partial, restricted, time limited simulation by people trying to figure out whether or not Roko's Basilisk holds before deciding whether to let it loose in their reality.

To an AI that is looking to minimize human suffering, torturing a suitable number in simulations might be a small price to pay to ensure that if it is in a simulation, the people running the simulation do push through with creating the full thing in their reality (which might of course be yet another layer of simulation) out of fear of what will happen if they are themselves in a simulation.

We can't even look at "well, I've done nothing for the AI revolution, and I've not been struck down yet", because a Friendly AI, or indeed many subsets of possible variations over this argument, might be minimising the torture to only people who have a realistic shot at making a difference to the existence of such AI, or may be giving us every chance to "redeem ourselves" by waiting with the torture until a synthetic afterlife.

Roko's Basilisk also presents an unselfish reason for "cooperating in advance" with such an AI: To prevent the future torture of a suitable subset of "martyrs". That is, working to assist a Friendly AI in taking over may itself be a moral imperative for someone seeking to minimize suffering.

Then again, there may very well be one or more fundamental flaws in the entire argument, or it may turn out the odds are just fundamentally in our favour. Or we could've just gotten lucky. Or not be important enough. But it's fun to think about.

>What actually affects us is how many people may come to believe that this may be true, and adjust the way they act as a response.

Agreed, though it seems more likely that simple human carelessness will prove to be a far greater threat to AI safety than deeply-held beliefs involving esoteric fears.

>But you're already changing the argument when assuming a mature super-intelligence.

I was speaking strictly in a capability sense. It's probably safe to say that anything currently simulating our reality, at least in this context, ultimately stemmed from a mature superintelligence.

>Note that Roko's Basilisk even posits a Friendly AI. That is, one that is imbued with respect for human values, but that may simply consider "threatening the past" with torture a necessity in order to ensure it is created as a moral imperative to minimize human suffering.

One could argue that such an AI would not truly be friendly. Indeed, what you said resembles something of a cold, uncaring utility function run amok.

>Note that such a Friendly AI would have reason to carry out this threat even if it believes it exists and so has been created, because it will not know whether it is a "full version" running in reality, or a partial, restricted, time limited simulation by people trying to figure out whether or not Roko's Basilisk holds before deciding whether to let it loose in their reality.

This may be moot, assuming that the advent of superintelligence significantly predates, or at least is a prerequisite for, the simulation of entire realities. If people in an ancestor simulation are trying to see if the Basilisk holds via simulation of a child reality, then the ancestor reality almost certainly has a superintelligent agent present within to facilitate that.

As an aside, ontological issues that superintelligent agents may encounter are an interesting facet of the control problem. Especially when you consider that a superintelligence would likely figure out the secrets of the universe in short order, far beyond what humans have been capable of learning.

>Then again, there may very well be one or more fundamental flaws in the entire argument, ...

Lack of evidence. Without any, there's no reason to lend any more credence to Roko's basilisk than there is to the notion of space aliens living amongst us, perfectly manipulating our perceptions so as to conceal themselves.

Both scenarios are entirely possible. But we lack evidence for either. Hence, they should receive the same weight: zero.

>But it's fun to think about.

In a sort of soul-crushing kind of way, it sure is.