Roko’s Basilisk

Roko’s basilisk (RB) is a thought experiment proposed on the Less Wrong (LW) discussion forum. The thought experiment draws on LW’s attempts to improve decision theory and on the idea of superintelligent AIs.

Decision theory concerns rational agents, that is, physical systems that rank different physical states of affairs on a single scale of value. These agents may be considered as playing games in which each possible consequence of a particular action has some payoff associated with it. The value of a game is the payoff such that the agent is indifferent between receiving that payoff outright and playing the game. In the simplest case, this value is computed as the probability-weighted sum of the payoffs of the game’s possible outcomes.
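
To make this picture concrete, here is a minimal sketch in Python. The agent, the actions, the probabilities and the payoffs are all invented for illustration; they are not taken from LW or from the thought experiment itself.

```python
# A hypothetical decision-theoretic agent: rank actions on a single scale
# of value and pick the one whose "game" has the highest value. All names
# and numbers below are made up for illustration.

def game_value(outcomes):
    """Value of a game: the probability-weighted sum of its payoffs.

    `outcomes` maps each possible consequence to a (probability, payoff)
    pair. The agent is indifferent between receiving this value outright
    and playing the game.
    """
    return sum(p * payoff for p, payoff in outcomes.values())

def choose(actions):
    """Pick the action whose associated game has the highest value."""
    return max(actions, key=lambda action: game_value(actions[action]))

# Made-up example: donate to asteroid defence or keep the money.
actions = {
    "donate":     {"rock_deflected": (0.9, 100), "rock_hits": (0.1, -1000)},
    "keep_money": {"rock_deflected": (0.5, 120), "rock_hits": (0.5, -1000)},
}

print(choose(actions))  # -> 'donate' under these made-up numbers
```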

A superintelligent AI is an artificial intelligence so intelligent that human beings can’t understand what it will do or why.

In the thought experiment, RB is a superintelligent AI that will, after it is created, torture anyone who knew enough about RB and who didn’t contribute to causes that help bring about its existence. Knowing enough means understanding RB well enough to know that it will follow through on its threats of torture. For example, if a sufficiently large chunk of space rock hit the Earth fast enough, the human species might be destroyed and so we would never create a superintelligent AI. So if RB found out that I didn’t donate to the cause of destroying space rocks, or of moving some humans to Mars so they would survive such an event, then it might torture me. RB would act like this in order to blackmail people into creating it.

Some people found this post very upsetting. Eliezer Yudkowsky (EY) initially responded to the idea by banning Roko and all discussion of RB, both to avoid contributing to the creation of RB and to spare people distress. He also said that it wasn’t clear whether anyone could imagine RB in enough detail to give it an incentive to torture people, but that he didn’t want to take the chance.

I have previously discussed superintelligences and there are some points that are relevant here. First, there is only one process by which knowledge is generated: variation on existing ideas and selection among those variations. Since any intelligence has to create knowledge by that same process, there is no intelligence that could operate in a way that people couldn’t understand. So superintelligences as imagined by fantasists such as Nick Bostrom and EY are impossible. Second, if an AI wanted to do something immoral, like torturing people, we have institutions for dealing with people who use force. These institutions are flawed and worth improving, but this is hardly an unheard-of problem, so it’s weird to act as if nobody had ever thought about it before LW.

Decision theory is also not a good model for making decisions that involve any non-trivial amount of knowledge creation. Knowledge creation will produce possibilities that you can’t know about in advance. You can’t attach payoffs to these new situations and so you can’t use any variant of decision theory to make decisions about them. 
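
One way to make this limitation concrete, continuing the kind of hypothetical payoff table sketched earlier: the value calculation only ranges over outcomes that have already been written down, so a genuinely new possibility has no entry and no payoff can be attached to it. The table and the “novel outcome” below are, again, invented for illustration.

```python
# The value calculation only ranges over outcomes that are already listed.
# A possibility produced by new knowledge has no entry, so no probability
# or payoff can be attached to it and the calculation cannot take it
# into account.

known_outcomes = {"rock_deflected": (0.9, 100), "rock_hits": (0.1, -1000)}

novel_outcome = "a solution nobody has thought of yet"
print(novel_outcome in known_outcomes)  # -> False: no payoff to attach
```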

Another problem with RB is that torture and threats divert people’s attention away from creating knowledge and toward appeasing or undermining whoever is making the threats. Force undermines the growth of knowledge rather than promoting it. Since creating new knowledge is required to make an AI, RB would be undermining the conditions required for its own creation. Since LW don’t know this, they don’t understand epistemology well enough to create an AI, and there is no reason to worry about them creating RB or any other AI.

The fact that EY’s first response to RB was to ban discussion of it rather than to refute it thoroughly also says something quite bad about EY and LW in general. LW also banned Elliot Temple for criticising their ideas, so the ban on discussion of RB is an example of a general policy, not a one-off incident.
