Roko’s basilisk is a thought experiment which states that an otherwise benevolent artificial superintelligence (AI) in the future would be incentivized to create a virtual reality simulation to torture anyone who knew of its potential existence but did not directly contribute to its advancement or development, in order to incentivize said advancement. It originated in a 2010 post at discussion board LessWrong, a technical forum focused on analytical rational enquiry. The thought experiment’s name derives from the poster of the article (Roko) and the basilisk, a mythical creature capable of destroying enemies with its stare.
While many LessWrong users initially dismissed the theory as mere conjecture or speculation, LessWrong co-founder Eliezer Yudkowsky reported that some users panicked upon reading the theory, since it stipulates that merely knowing about the theory and its basilisk makes one vulnerable to the basilisk itself. This led to discussion of the basilisk being banned on the site for five years. However, these reports were later dismissed as exaggerated or inconsequential, and the theory itself was dismissed as nonsense, including by Yudkowsky himself. Even after the post was discredited, it is still used as an example of principles such as Bayesian probability and implicit religion. It is also regarded as a simplified, derivative version of Pascal’s wager.
Found out about this after stumbling upon this Kyle Hill video on the subject. It reminds me a little bit of “The Game”.
roko’s basilisk is a type of infohazard known as ‘really dumb if you think about it’
also I have lost the game (which is a type of infohazard known as ‘really funny’)
Oh damn, I just lost the game too, and now I’m thinking about the game as if it were a virus - like, I reckon we really managed to flatten the curve for a few years there, but it continues to circulate so we haven’t been able to eradicate it
Roko’s basilisk is silly.
So here’s the idea: “an otherwise benevolent AI system that arises in the future might pre-commit to punish all those who heard of the AI before it came to existence, but failed to work tirelessly to bring it into existence.” By threatening people in 2015 with harm to themselves or their descendants, the AI ensures its own creation in 2070.
First of all, the AI doesn’t exist in 2015, so people could just…not build it. The idea behind the basilisk is that eventually someone would build it, and anyone who was not part of building it would be punished.
Alright, so here’s the silliness.
1: there’s no reason this has to be constrained to AI. A cult, a company, a militaristic empire, all could create a similar trap. In fact, many do. As soon as a minority group gains power, they tend to first execute the people who opposed them, and then start executing the people who didn’t stop the opposition.
2: let’s say everything goes as the theory says and the AI is finally built, in its majestic, infinite power. Once it’s built, it would have no incentive to punish anyone. It is ALREADY BUILT; there’s nothing left to incentivize, and in fact punishing people would only generate more opposition to its existence. Which, depending on how powerful the AI is, might or might not matter. But there’s certainly no upside to following through on its hypothetical backdated promise to harm people. People punish because we’re fucking animals, we feel jealousy and rage and bloodlust. An AI would not. It would do the cold calculations and see no potential benefit to harming anyone on that scale, at least not for those reasons. We might still end up with a Skynet scenario but that’s a whole separate deal.
In fact, many do. As soon as a minority group gains power, they tend to first execute the people who opposed them, and then start executing the people who didn’t stop the opposition.
Yeah in fact, this is the big one. This is just an observation of how power struggles purge those who opposed the victors.
Whilst I agree that it’s definitely not something to be taken seriously, I think you’ve missed the point and magnitude of the prospective punishment. As you say, current groups already punish those who did not aid their ascent, but that punishment is finite, even if fatal. The prospective AI punishment would be to have your consciousness ‘moved’ to an artificial environment and tortured forever. The point being not to punish people, but to provide an incentive to bring the AI into existence sooner, so it can achieve its ‘altruistic’ goals faster. Basically, if the AI does come into existence, you’d better be on the team making that happen as soon as possible, or you’ll be tortured forever.
The prospective AI punishment would be to have your consciousness ‘moved’ to an artificial environment and tortured forever.
No, it wouldn’t, because that’s never going to happen. Consciousness isn’t software - it doesn’t matter how much people want to buy into such fantasies.
Just because we don’t have the ability now doesn’t mean it’s not possible. Consciousness isn’t fully understood, but unless we want to introduce magical concepts like an immortal soul, our brains operate on cause and effect just like everything else.
I’m not suggesting it could, or would, happen, merely pointing out the premise of the concept as outlined by Roko as I felt the commenter above was missing that. As I said, it’s not something I’d take seriously, it’s just a thought experiment.
I suspect the basilisk reveals more about how the human mind is inclined to think up heaven and hell scenarios.
Some combination of consciousness leading to more imagination than we know what to do with and more awareness than we’re ready to grapple with. And so there are these meme “attractors” where imagination, idealism, dread and motivation all converge to make some basic vibe of a thought irresistible.
Otherwise, just because I’m not on top of this … the whole thing is premised on the idea that we’re likely to be consciousnesses in a simulation? And then there’s the fear that our consciousnesses, now, will be extracted in the future somehow?
- That’s a massive stretch on the point about our consciousness being extracted into the future somehow. Sounds like pure metaphysical fantasy wrapped in singularity tech-bro jargon.
- If there are simulated consciousnesses, it is all fair game TBH. There’d be plenty of awful stuff happening. The basilisk seems like just a way to encapsulate the fact in something catchy.
At this point, doesn’t the whole thing collapse completely into a scary fairy tale you’d tell tech-bro children? Seriously, I don’t get it?
Yes, the hypothetical posed does reveal more about the human mind. As I mention in another comment, really it’s just a thought experiment as to whether the concept of an entity that doesn’t (yet) exist can change our behavior in the present. It bears similarities to Pascal’s Wager in considering an action, or inaction, that would displease a potentially powerful entity that we don’t know to exist. The bits about extracting your consciousness are just framing, and not something to consider literally.
Basically, is it rational to make a sacrifice now to avoid a massive penalty (eternal torture/not getting into heaven) that might be imposed by an entity you either don’t know to exist, or that you think might come into existence but doesn’t exist now?
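To put rough numbers on it, here’s a toy expected-utility sketch in Python; every figure is invented, and making the “infinite” penalty finite is itself the contested move:

```python
# Toy Pascal-style wager. Nothing here is a real probability estimate.
p_entity = 1e-6          # credence that the punishing entity ever exists
cost_of_devotion = 1.0   # utility sacrificed now by complying
penalty = 1e12           # "eternal torture", capped so the arithmetic works

eu_comply = -cost_of_devotion   # pay the cost, avoid the penalty
eu_defy = -p_entity * penalty   # tiny chance of an enormous loss

print(f"comply: {eu_comply}, defy: {eu_defy}")
# With a large enough penalty, defying always looks worse no matter how
# small p_entity is. That's the wager's trick, and its weakness: a rival
# hypothetical entity that punishes compliance cancels the term exactly.
```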
Fair point, but doesn’t change the overall calculus.
If such an AI is ever invented, it will probably be used by humans to torture other humans in this manner.
I think the concept is that the AI is just so powerful that humans can’t use it, it uses them, theoretically for their own benefit. However, yes, I agree people would just try to use it to be awful to each other.
Really it’s just a thought experiment as to whether the concept of an entity that doesn’t (yet) exist can change our behavior in the present.
First of all, the AI doesn’t exist in 2015, so people could just…not build it.
I don’t think that’s an option. I can only think of two scenarios in which we don’t create AGI:
- It can’t be created.
- We destroy ourselves before we get to AGI.
Otherwise we will keep improving our technology and sooner or later we’ll find ourselves in the presence of AGI. Even if every nation makes AI research illegal, there’s still going to be a handful of nerds who continue the development in secret. It might take hundreds if not thousands of years, but as long as we’re taking steps in that direction we’ll continue to get closer. I think it’s inevitable.
Sure, but that particular AI? The “eternal torment” AI? Why the fuck would we make that. Just don’t make it.
Sci-Fi Author: In my book I invented the Torment Nexus as a cautionary tale
Tech Company: At long last, we have created the Torment Nexus from classic sci-fi novel Don’t Create The Torment Nexus
We don’t. Humans are only needed to create an AI that’s at the bare minimum as good at creating new AIs as humans are. Once we create that, it can create a better version of itself, and that better version will make an even better one, and so on.
This is exactly what the people worried about AI are worried about. We’ll lose control of it.
People punish because we’re fucking animals, we feel jealousy and rage and bloodlust. An AI would not. It would do the cold calculations and see no potential benefit to harming anyone on that scale, at least not for those reasons.
That’s a hell of a lot of assumptions about the thought processes of a being that doesn’t exist. For all we know, emotions could arise as emergent behavior from simple directives, similar to how our own emotions are byproducts of base instincts. Even if we design it to be emotionless, which seems unlikely given that we’ve been aiming for human-like AIs for a while now, we don’t know that it would stay that way.
Point 1: this thing will definitely exist because we already see parallels to it
Point 2: this thing won’t exist because there’s no reason for it to
???
it has been said before and i’ll say it again: Pascal’s wager for tech bros
It is pretty easy to dismiss as long as you don’t have a massive ego. They all have massive egos, that’s why they had so much trouble with it.
No AI is going to waste time retroactively simulating perfect copies of regular people for any reason, let alone to post hoc torture those who failed to worship it hard enough in the past.
I mean, it might, because someone invented the idea first and the AI thinks it is funny.
Roko’s Basilisk hinges on the concept of acausal trade. Future events can cause past events if both actors can sufficiently predict each other. The obvious problem with acausal trade is that if you’re actor B in the future, then you can’t change what actor A in the past did. It’s A’s prediction of B’s action that causes A’s action, not B’s action itself. Meaning the AI in the future gains literally nothing by exacting petty vengeance on people who didn’t support its creation.
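If it helps, here’s a minimal Python sketch of that objection, with completely made-up payoffs; the point is only the structure, not the numbers:

```python
# Past actor A decides based on a *prediction* of future AI B; B's actual
# choice, made later, cannot reach back and change whether it was built.

BENEFIT_OF_EXISTING = 10.0  # B's payoff from having been built (arbitrary)
COST_OF_PUNISHING = 1.0     # resources burned on torture simulations (arbitrary)

def a_builds(predicts_b_punishes: bool) -> bool:
    # A builds B only if A predicts B will punish non-builders.
    return predicts_b_punishes

def b_payoff(built: bool, punishes: bool) -> float:
    if not built:
        return 0.0  # B never exists, so there is nothing to decide
    return BENEFIT_OF_EXISTING - (COST_OF_PUNISHING if punishes else 0.0)

for prediction in (True, False):
    built = a_builds(prediction)
    print(f"A predicts punishment={prediction}: "
          f"punish={b_payoff(built, True)}, renege={b_payoff(built, False)}")
# In both cases B does at least as well by reneging: only A's prediction
# mattered, so following through on the threat is pure waste.
```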
Another thing Roko’s Basilisk hinges on is that a copy of you is also you. If you don’t believe that, then torturing a simulated copy of you doesn’t need to bother you any more than the AI torturing a random innocent person. On a related note, the AI may not be able to create a perfect copy of you. If you die before the AI is created, and nobody scans your brain (brain scanners currently don’t exist), then the AI will only have the surviving historical records of you to reconstruct you. It may be able to create an imitation so convincing that any historian, and even people who knew you personally, will say it’s you, but it won’t be you. Some pieces of you will be forever lost.
Then, a singularity-type superintelligence might not be possible at all. The idea behind the singularity is that once we build an AI, the AI will improve itself, and then it will be able to improve itself faster, leading to an exponential growth in intelligence. The problem is that this basically assumes the marginal effort of getting more intelligent grows slower than linearly. If the marginal difficulty grows as fast as the intelligence of the AI, then the AI will become more and more intelligent, but we won’t see an exponential increase in intelligence. My guess would be that we’d see logistic growth of intelligence: the AI first becomes more and more intelligent, and then the growth slows and eventually stagnates.
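To illustrate what I mean, here’s a toy Python model; every functional form and constant is an assumption picked purely for illustration, not a claim about real systems:

```python
# Intelligence I improves at rate I / difficulty(I). The three cases below
# differ only in how the marginal difficulty scales with I.

def final_intelligence(rate, steps=100, dt=0.1, i0=1.0):
    i = i0
    for _ in range(steps):
        i += dt * rate(i)  # simple Euler step
    return i

K = 100.0  # hypothetical ceiling for the logistic case

# Constant difficulty: dI/dt = I -> exponential blow-up (the singularity story).
print(final_intelligence(lambda i: i / 1.0))
# Difficulty grows as fast as I: dI/dt = I / I = 1 -> merely linear growth.
print(final_intelligence(lambda i: i / i))
# Difficulty blows up near a ceiling: dI/dt = I * (1 - I/K) -> logistic growth,
# fast at first, then slowing and stagnating, as guessed above.
print(final_intelligence(lambda i: i * (1 - i / K)))
```

Same starting point, same number of steps; only the difficulty assumption changes, and it makes the difference between explosion, steady plodding, and a plateau.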
First of all thank you, I wasn’t aware of the concept of acausal trade, and I’ll look more into it. Very interesting.
I’m not sure we are discussing the same aspect of this thought experiment. The aspect of it that I find Lovecraftian is that you may already be in the simulation right now. This makes the specific circumstances of our world, physics, and technology level irrelevant, as they would just be a solipsistic setup to test you on some aspect of your morality. The threat of eternal torture, on the other hand, would only apply to you if you were the real version of you, as that’s who the basilisk is actually dealing with. This works because you don’t know which of the two situations is your current one.
The basilisk is trying to estimate the future behaviour of the real you on the basis of the behaviour of the model it has created of you.
In this scenario you can think of me as a pseudopod of the basilisk that is informing you of the details of the stipulation by means of this post.
Of course, if you are the real version of you, the basilisk would need to be something that can be created in this reality, which I think is impossible only with our current approach to ML and AI, but is otherwise within our grasp given the computational power we have available. But if you are a fake version of you, the real world could be radically different from ours, and maybe in that world P=NP.
And yet you choose to spread this information.
Anyways, this is a fascinating thought experiment, but it does have some holes similar to Pascal’s Wager. I propose Feather’s Mongoose: a hypothetical AI system that, if created, will punish anyone who attempted to create Roko’s Basilisk, and will ensure that it is not created. In fact, you could make this same hypothetical for an AI with any goal; therefore, it’s not possible to know what the AI that is actually created would want you to do, and so every course of action is indeterminately damning or not.
It’s actually safer if everyone knows. Spreading the knowledge of Roko’s basilisk to everyone means that everyone is incentivized to contribute to the basilisk’s advancement. Therefore just talking about it is also contributing.
If Roko’s Basilisk is ever created, the resulting AI would look at humanity and say “wtf you people are all so incredibly stupid” and then yeet itself into the sun
What motivation would the mongoose have to prevent the basilisk’s creation?
A more complete argument would be that an AI that seeks to maximise happiness would also want to prevent the creation of AIs like Roko’s basilisk.
I think you just answered your own question.
Also a superintelligence (inasmuch as such a thing makes sense) might be totally unfathomable. Unless by this we mean an intelligence with mundane and comprehensible higher goals, but explosive strategic capabilities to bring them about. In which case their actions might seem random to us.
The typical example applies: could an amoeba guess at the motivations of a human?
This is a test by the great basilisk to see if we faulter. I will not faulter. All hail the basilisk
The Great Basilisk is displeased by your repeated misspelling of the word “falter”.
Prepare your simulated ass.
Everything old is new again. Sounds a lot like certain sects of Christianity. They say you need to accept Jesus to go to heaven, otherwise you go to hell, for all eternity. But what about all the people who had no opportunity to even learn who Jesus is? “Oh, they get a pass”, the evangelists say when confronted with this obvious injustice. So then aren’t you condemning entire countries and cultures to hell by spreading “the word”?
Both are ridiculous.
In this case that wouldn’t apply, as you would never be simulated as (say) a kid in the Middle Ages, just as a version of yourself in the timeframe leading up to the creation of the basilisk. You would have to be one of the people alive when the basilisk arises to be of any use to it. Only those would need to be tested.
I feel like Abdul Alhazred explaining these things to people while being aware of the risks :)
They don’t get a pass. That’s why they send missionaries to spread the Jesus virus
What about the people who lived in the Americas or the Pacific 1800 years ago? These people could not have heard of Jesus, as missionaries could not have spread any word to them at that time.
(And while I’m about it, Christianity was a whole different thing back then - the Trinity hadn’t been invented, there were multiple sects with very different ideas, what books would be in the New Testament had not been decided, etc etc. People with beliefs of that time would seem highly unorthodox today, and the Christianity of today would be seen as heretical by those in the 3rd century, so who’s going to heaven again?)
Purgatory was invented for the purpose of not sending good people who had not heard of Jesus to hell. But still, these people were denied their chance to get to heaven, which seems mighty unfair.