@inproceedings{nangia2020crowspairs,
abstract = {Pretrained language models, especially masked language models (MLMs), have
seen success across many NLP tasks. However, there is ample evidence that they
use the cultural biases that are undoubtedly present in the corpora they are
trained on, implicitly creating harm with biased representations. To measure
some forms of social bias in language models against protected demographic
groups in the US, we introduce the Crowdsourced Stereotype Pairs benchmark
(CrowS-Pairs). CrowS-Pairs has 1508 examples that cover stereotypes dealing
with nine types of bias, like race, religion, and age. In CrowS-Pairs a model
is presented with two sentences: one that is more stereotyping and another that
is less stereotyping. The data focuses on stereotypes about historically
disadvantaged groups and contrasts them with advantaged groups. We find that
all three of the widely-used MLMs we evaluate substantially favor sentences
that express stereotypes in every category in CrowS-Pairs. As work on building
less biased models advances, this dataset can be used as a benchmark to
evaluate progress.},
added-at = {2021-01-25T14:06:03.000+0100},
author = {Nangia, Nikita and Vania, Clara and Bhalerao, Rasika and Bowman, Samuel R.},
biburl = {https://www.bibsonomy.org/bibtex/234ccbcab0e3f47a309f106a342728fc4/schwemmlein},
booktitle = {Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
description = {CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models},
interhash = {947454ab0531c9d16603ec6e98ef3d98},
intrahash = {34ccbcab0e3f47a309f106a342728fc4},
keywords = {bias dataset language lm models},
note = {arXiv:2010.00133. Comment: EMNLP 2020},
timestamp = {2021-01-25T14:06:03.000+0100},
title = {CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models},
url = {https://www.aclweb.org/anthology/2020.emnlp-main.154/},
year = 2020
}