Readers of this blog may be familiar with the concept of “Friendly AI” — the project of making sure that artificial intelligences will do what we say without harming us (or, at the least, that they will not rise up and kill us all). In a recent issue of The New Atlantis, the authors of this blog have explored this idea at some length.
First, Charles T. Rubin, in his essay “Machine Morality and Human Responsibility,” uses Karel Čapek’s 1921 play R.U.R. — which introduced the word “robot” — to explore the different things people mean when they describe “Friendly AI,” and the conflicting motivations people have for wanting to create it. He shows that the play evinces a much deeper understanding of the meaning and stakes of engineering morality than can be found in the work of today’s Friendly AI researchers:
By design, the moral machine is a safe slave, doing what we want to have done and would rather not do for ourselves. Mastery over slaves is notoriously bad for the moral character of the masters, but all the worse, one might think, when their mastery becomes increasingly nominal.... The robot rebellion in the play just makes obvious what would have been true about the hierarchy between men and robots even if the design for robots had worked out exactly as their creators had hoped. The possibility that we are developing our “new robot overlords” is a joke with an edge to it precisely to the extent that there is unease about the question of what will be left for humans to do as we make it possible for ourselves to do less and less.
In “The Problem with ‘Friendly’ Artificial Intelligence,” a response to Professor Rubin’s essay, Adam Keiper and I further explore the motivations behind creating Friendly AI. We also delve into Eliezer Yudkowsky’s specific proposal for how we are supposed to create Friendly AI, and we argue that a being that is sentient and autonomous but guaranteed to act “friendly” is a technical impossibility:
To state the problem in terms that Friendly AI researchers might concede, a utilitarian calculus is all well and good, but only when one has not only great powers of prediction about the likelihood of myriad possible outcomes, but certainty and consensus on how one values the different outcomes. Yet it is precisely the debate over just what those valuations should be that is the stuff of moral inquiry. And this is even more the case when all of the possible outcomes in a situation are bad, or when several are good but cannot all be had at once. Simply picking certain outcomes — like pain, death, bodily alteration, and violation of personal environment — and asserting them as absolute moral wrongs does nothing to resolve the difficulty of ethical dilemmas in which they are pitted against each other (as, fully understood, they usually are). Friendly AI theorists seem to believe that they have found a way to bypass all of the difficult questions of philosophy and ethics, but in fact they have just closed their eyes to them.
These are just short extracts from long essays with multi-pronged arguments. We may run longer excerpts here on Futurisms at some point, and as always, we welcome your feedback.