Cheap and Fast -- But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks
R. Snow, B. O'Connor, D. Jurafsky, and A. Ng. Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, pages 254--263. Honolulu, Hawaii, Association for Computational Linguistics, October 2008
Abstract
Human linguistic annotation is crucial for many natural language
processing tasks but can be expensive and time-consuming. We
explore the use of Amazon's Mechanical Turk system, a
significantly cheaper and faster method for collecting
annotations from a broad base of paid non-expert contributors
over the Web. We investigate five tasks: affect recognition,
word similarity, recognizing textual entailment, event temporal
ordering, and word sense disambiguation. For all five, we show
high agreement between Mechanical Turk non-expert annotations and
existing gold standard labels provided by expert labelers. For
the task of affect recognition, we also show that using
non-expert labels for training machine learning algorithms can be
as effective as using gold standard annotations from experts. We
propose a technique for bias correction that significantly
improves annotation quality on two tasks. We conclude that many
large labeling tasks can be effectively designed and carried out
in this way at a fraction of the usual expense.
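The bias correction mentioned in the abstract can be illustrated with a simple sketch: estimate each annotator's accuracy on a small gold-labeled subset, then combine the workers' votes with a naive-Bayes rule so that a systematically wrong annotator's responses count as evidence *against* the label they chose. This is one minimal instance of the general idea, not the paper's exact model; all names and data below are hypothetical.

```python
from collections import defaultdict
from math import log

def estimate_accuracies(gold, annotations):
    """Estimate each annotator's accuracy on a small gold-labeled subset."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for item, label in gold.items():
        for worker, response in annotations.get(item, {}).items():
            total[worker] += 1
            correct[worker] += (response == label)
    # Laplace smoothing keeps estimated accuracies strictly between 0 and 1
    return {w: (correct[w] + 1) / (total[w] + 2) for w in total}

def corrected_label(responses, accuracies, labels):
    """Naive-Bayes vote combination: each worker's response is weighted by
    the log-likelihood of that response under each candidate true label."""
    k = len(labels)
    scores = {}
    for y in labels:
        s = 0.0
        for worker, response in responses.items():
            a = accuracies.get(worker, 0.5)  # unknown workers are uninformative
            # P(response | true = y): accuracy if correct, else uniform over errors
            s += log(a) if response == y else log((1 - a) / (k - 1))
        scores[y] = s
    return max(scores, key=scores.get)

# Hypothetical example: worker B is anti-correlated with the gold labels,
# so the model learns to treat B's votes as evidence for the opposite label.
gold = {"i1": "pos", "i2": "neg"}
annotations = {
    "i1": {"A": "pos", "B": "neg", "C": "pos"},
    "i2": {"A": "neg", "B": "pos", "C": "neg"},
}
acc = estimate_accuracies(gold, annotations)
label = corrected_label({"A": "neg", "B": "neg", "C": "pos"}, acc, ["pos", "neg"])
```

Here a raw majority vote on the new item would say "neg" (A and B against C), but because B's estimated accuracy is below 0.5, B's "neg" vote is flipped into support for "pos", which wins. The paper reports that this kind of recalibration significantly improves annotation quality on two of the five tasks.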
Description
Cheap and Fast – But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks - ACL Anthology
%0 Conference Paper
%1 snow2008cheap
%A Snow, Rion
%A O'Connor, Brendan
%A Jurafsky, Daniel
%A Ng, Andrew
%B Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing
%C Honolulu, Hawaii
%D 2008
%I Association for Computational Linguistics
%K annotation computing crowdsourcing crowdworker evaluation expert human language natural nlp processing social
%P 254--263
%T Cheap and Fast -- But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks
%U https://www.aclweb.org/anthology/D08-1027
%X Human linguistic annotation is crucial for many natural language
processing tasks but can be expensive and time-consuming. We
explore the use of Amazon's Mechanical Turk system, a
significantly cheaper and faster method for collecting
annotations from a broad base of paid non-expert contributors
over the Web. We investigate five tasks: affect recognition,
word similarity, recognizing textual entailment, event temporal
ordering, and word sense disambiguation. For all five, we show
high agreement between Mechanical Turk non-expert annotations and
existing gold standard labels provided by expert labelers. For
the task of affect recognition, we also show that using
non-expert labels for training machine learning algorithms can be
as effective as using gold standard annotations from experts. We
propose a technique for bias correction that significantly
improves annotation quality on two tasks. We conclude that many
large labeling tasks can be effectively designed and carried out
in this method at a fraction of the usual expense.
@inproceedings{snow2008cheap,
abstract = {Human linguistic annotation is crucial for many natural language
processing tasks but can be expensive and time-consuming. We
explore the use of Amazon's Mechanical Turk system, a
significantly cheaper and faster method for collecting
annotations from a broad base of paid non-expert contributors
over the Web. We investigate five tasks: affect recognition,
word similarity, recognizing textual entailment, event temporal
ordering, and word sense disambiguation. For all five, we show
high agreement between Mechanical Turk non-expert annotations and
existing gold standard labels provided by expert labelers. For
the task of affect recognition, we also show that using
non-expert labels for training machine learning algorithms can be
as effective as using gold standard annotations from experts. We
propose a technique for bias correction that significantly
improves annotation quality on two tasks. We conclude that many
large labeling tasks can be effectively designed and carried out
in this method at a fraction of the usual expense.},
added-at = {2020-10-21T12:48:47.000+0200},
address = {Honolulu, Hawaii},
author = {Snow, Rion and O'Connor, Brendan and Jurafsky, Daniel and Ng, Andrew},
biburl = {https://www.bibsonomy.org/bibtex/2287dc13d201058313d7a17e462b6af7e/jaeschke},
booktitle = {Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing},
description = {Cheap and Fast – But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks - ACL Anthology},
interhash = {fa24c8db32accb61e8bf025b23574986},
intrahash = {287dc13d201058313d7a17e462b6af7e},
keywords = {annotation computing crowdsourcing crowdworker evaluation expert human language natural nlp processing social},
month = oct,
pages = {254--263},
publisher = {Association for Computational Linguistics},
timestamp = {2020-10-21T12:48:47.000+0200},
title = {Cheap and Fast -- But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks},
url = {https://www.aclweb.org/anthology/D08-1027},
year = 2008
}