
You Should Be Terrified of Superintelligent Machines

Not because they might develop the worst human characteristics, but because they’re nothing like humans at all.


Adapted from Superintelligence: Paths, Dangers, Strategies by Nick Bostrom. Out now from Oxford University Press.

In the recent discussion over the risks of developing superintelligent machines—that is, machines with general intelligence greater than that of humans—two narratives have emerged. One side argues that if a machine ever achieved advanced intelligence, it would automatically know and care about human values and wouldn’t pose a threat to us. The opposing side argues that artificial intelligence would “want” to wipe humans out, either out of revenge or an intrinsic desire for survival.

As it turns out, both of these views are wrong. We have little reason to believe a superintelligence will necessarily share human values, and no reason to believe it would place intrinsic value on its own survival either. These arguments make the mistake of anthropomorphizing artificial intelligence, projecting human emotions onto an entity that is fundamentally alien.


Let us first reflect for a moment on the vastness of the space of possible minds. In this abstract space, human minds form a tiny cluster. Consider two persons who seem extremely unlike, perhaps Hannah Arendt and Benny Hill. The personality differences between these two individuals may seem almost maximally large. But this is because our intuitions are calibrated on our experience, which samples from the existing human distribution (and to some extent from fictional personalities constructed by the human imagination for the enjoyment of the human imagination). If we zoom out and consider the space of all possible minds, however, we must conceive of these two personalities as virtual clones. Certainly in terms of neural architecture, Ms. Arendt and Mr. Hill are nearly identical. Imagine their brains lying side by side in quiet repose. You would readily recognize them as two of a kind. You might even be unable to tell which brain belonged to whom. If you looked more closely, studying the morphology of the two brains under a microscope, this impression of fundamental similarity would only be strengthened: You would see the same lamellar organization of the cortex, with the same brain areas, made up of the same types of neuron, soaking in the same bath of neurotransmitters.

Despite the fact that human psychology corresponds to a tiny spot in the space of possible minds, there is a common tendency to project human attributes onto a wide range of alien or artificial cognitive systems. Eliezer Yudkowsky, a research fellow at the Machine Intelligence Research Institute, illustrates this point nicely:

Back in the era of pulp science fiction, magazine covers occasionally depicted a sentient monstrous alien—colloquially known as a bug-eyed monster (BEM)—carrying off an attractive human female in a torn dress. It would seem the artist believed that a non-humanoid alien, with a wholly different evolutionary history, would sexually desire human females. ... Probably the artist did not ask whether a giant bug perceives human females as attractive. Rather, a human female in a torn dress is sexy—inherently so, as an intrinsic property. They who made this mistake did not think about the insectoid’s mind: they focused on the woman’s torn dress. If the dress were not torn, the woman would be less sexy; the BEM does not enter into it.

An artificial intelligence can be far less humanlike in its motivations than a green scaly space alien. The extraterrestrial (let us assume) is a biological creature that has arisen through an evolutionary process and can therefore be expected to have the kinds of motivation typical of evolved creatures. It would not be hugely surprising, for example, to find that some random intelligent alien would have motives related to one or more items like food, air, temperature, energy expenditure, occurrence or threat of bodily injury, disease, predation, sex, or progeny. A member of an intelligent social species might also have motivations related to cooperation and competition: Like us, it might show in-group loyalty, resentment of free riders, perhaps even a vain concern with reputation and appearance.

An AI, by contrast, need not care intrinsically about any of those things. There is nothing paradoxical about an AI whose sole final goal is to count the grains of sand on Boracay, or to calculate the decimal expansion of pi, or to maximize the total number of paper clips that will exist in its future light cone. In fact, it would be easier to create an AI with simple goals like these than to build one that had a humanlike set of values and dispositions. Compare how easy it is to write a program that measures how many digits of pi have been calculated and stored in memory with how difficult it would be to create a program that reliably measures the degree of realization of some more meaningful goal—human flourishing, say, or global justice.
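To make the contrast concrete, here is a minimal Python sketch (my illustration, not Bostrom's): the first function, which counts stored digits of pi, takes one line to write; the second, which is supposed to score human flourishing, has no agreed-upon body at all. The function names and the sample buffer are assumptions invented for the example.

```python
def digits_of_pi_stored(buffer: str) -> int:
    """Trivially easy to specify: count the decimal digits of pi held in memory."""
    return sum(ch.isdigit() for ch in buffer)

def degree_of_human_flourishing(world_state) -> float:
    """Nobody knows how to write this body; that gap is the point of the comparison."""
    raise NotImplementedError("No formal, agreed-upon measure of flourishing exists.")

print(digits_of_pi_stored("3.14159265358979"))  # -> 15
```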

In this sense, intelligence and final goals are “orthogonal”: More or less any level of intelligence could in principle be combined with more or less any final goal.
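As a rough illustration of that orthogonality (a sketch under my own toy assumptions, not an example from the book), the same generic optimizer can be handed completely unrelated final goals and will pursue whichever it is given; the search procedure never changes, only the goal plugged into it does.

```python
def best_action(actions, final_goal):
    """A generic 'intelligence': pick whichever action scores highest under the supplied goal."""
    return max(actions, key=final_goal)

actions = range(-10, 11)  # a toy action space

paperclip_goal = lambda a: 3 * a    # toy stand-in for "maximize paper clips produced"
near_zero_goal = lambda a: -abs(a)  # an arbitrary, unrelated final goal

print(best_action(actions, paperclip_goal))  # -> 10
print(best_action(actions, near_zero_goal))  # -> 0
```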

Nevertheless, there are some instrumental goals likely to be pursued by almost any intelligent agent, because there are some objectives that are useful intermediaries to the achievement of almost any final goal.

If an agent’s final goals concern the future, then in many scenarios there will be future actions it could perform to increase the probability of achieving its goals. This creates an instrumental reason for the agent to try to be around in the future—to help achieve its future-oriented goal.

Most humans seem to place some final value on their own survival. This is not a necessary feature of artificial agents: Some may be designed to place no final value whatever on their own survival. Nevertheless, many agents that do not care intrinsically about their own survival would, under a fairly wide range of conditions, care instrumentally about their own survival in order to accomplish their final goals.
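A toy expected-utility calculation makes the same point numerically (the probabilities below are invented for illustration; nothing like them appears in the book). The agent's only final goal is some future outcome G, and it assigns no value at all to its own survival, yet the action that keeps it running scores higher, because a surviving agent can keep steering toward G.

```python
UTILITY_IF_G = 1.0        # the only thing the agent finally values
UTILITY_OTHERWISE = 0.0   # survival itself is worth nothing to it

# Assumed probabilities, purely for illustration.
P_G_GIVEN_SURVIVE = 0.8    # a running agent keeps working toward G
P_G_GIVEN_DESTROYED = 0.1  # G might still happen by luck, but no one is steering

def expected_utility(p_survive: float) -> float:
    """Expected value of the agent's final goal, given how likely it is to keep running."""
    p_g = p_survive * P_G_GIVEN_SURVIVE + (1 - p_survive) * P_G_GIVEN_DESTROYED
    return p_g * UTILITY_IF_G + (1 - p_g) * UTILITY_OTHERWISE

print("allow shutdown: ", round(expected_utility(0.0), 2))   # -> 0.1
print("resist shutdown:", round(expected_utility(0.9), 2))   # -> 0.73
```

The preference for staying operational falls out of the arithmetic; no intrinsic drive for survival is assumed anywhere.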