Abstract
What would it take to teach a machine to behave ethically? While broad
ethical rules may seem straightforward to state ("thou shalt not kill"),
applying such rules to real-world situations is far more complex. For example,
while "helping a friend" is generally a good thing to do, "helping a friend
spread fake news" is not. We identify four underlying challenges towards
machine ethics and norms: (1) an understanding of moral precepts and social
norms; (2) the ability to perceive real-world situations visually or by reading
natural language descriptions; (3) commonsense reasoning to anticipate the
outcome of alternative actions in different contexts; and (4) most importantly, the
ability to make ethical judgments given the interplay between competing values
and their grounding in different contexts (e.g., the right to freedom of
expression vs. preventing the spread of fake news).
Our paper begins to address these challenges within the deep learning
paradigm. Our prototype model, Delphi, demonstrates strong promise of
language-based commonsense moral reasoning, with up to 92.1% accuracy vetted by
humans. This is in stark contrast to GPT-3's zero-shot performance of 52.3%,
which suggests that massive scale alone does not endow pre-trained
neural language models with human values. Thus, we present Commonsense Norm
Bank, a moral textbook customized for machines, which compiles 1.7M examples of
people's ethical judgments on a broad spectrum of everyday situations. In
addition to the new resources and baseline performances for future research,
our study provides new insights that lead to several important open research
questions: differentiating between universal human values and personal values,
modeling different moral frameworks, and explainable, consistent approaches to
machine ethics.
Description
Delphi: Towards Machine Ethics and Norms