Fast Dawid-Skene: A Fast Vote Aggregation Scheme for Sentiment Classification
V. Sinha, S. Rao, and V. Balasubramanian. (2018). arXiv:1803.02781. Comment: 8 pages, 5 tables, 1 figure, KDD Workshop on Issues of Sentiment Discovery and Opinion Mining (WISDOM) 2018.
Abstract
Many real-world problems can now be effectively solved using supervised
machine learning. A major roadblock is often the lack of an adequate quantity
of labeled data for training. A possible solution is to assign the task of
labeling data to a crowd, and then infer the true label using aggregation
methods. A well-known approach for aggregation is the Dawid-Skene (DS)
algorithm, which is based on the principle of Expectation-Maximization (EM). We
propose a new simple, yet effective, EM-based algorithm, which can be
interpreted as a `hard' version of DS, that allows much faster convergence
while maintaining similar accuracy in aggregation. We show the use of this
algorithm as a quick and effective technique for online, real-time sentiment
annotation. We also prove that our algorithm converges to the estimated labels
at a linear rate. Our experiments on standard datasets show a significant
speedup in time taken for aggregation, up to $\sim$8x over Dawid-Skene and
$\sim$6x over other fast EM methods, at competitive accuracy. The
code for the implementation of the algorithms can be found at
https://github.com/GoodDeeds/Fast-Dawid-Skene
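
The abstract describes the method only as a `hard' EM variant of Dawid-Skene, so the following is a minimal sketch of that idea under stated assumptions, not the authors' implementation (for that, see the repository linked above). It assumes integer class labels in range(num_classes), majority-vote initialisation, and a small additive smoothing constant; the function name hard_dawid_skene and all of its parameters are hypothetical.

```python
import math
from collections import Counter


def hard_dawid_skene(votes, num_classes, max_iter=50, smoothing=0.01):
    """Aggregate crowd votes with a hard-assignment EM loop.

    votes: dict mapping item -> list of (annotator, label) pairs,
           with labels as ints in range(num_classes).
    Returns a dict mapping item -> estimated label.
    """
    # Initialise with majority vote (ties go to the smallest label).
    labels = {}
    for item, pairs in votes.items():
        counts = Counter(label for _, label in pairs)
        labels[item] = min(counts, key=lambda c: (-counts[c], c))

    annotators = sorted({a for pairs in votes.values() for a, _ in pairs})

    for _ in range(max_iter):
        # M-step: estimate class priors and per-annotator confusion
        # matrices by counting against the current hard labels.
        # The smoothing term (an assumption) avoids log(0).
        prior = [smoothing] * num_classes
        confusion = {
            a: [[smoothing] * num_classes for _ in range(num_classes)]
            for a in annotators
        }
        for item, pairs in votes.items():
            true_label = labels[item]
            prior[true_label] += 1.0
            for annotator, label in pairs:
                confusion[annotator][true_label][label] += 1.0

        total = sum(prior)
        log_prior = [math.log(p / total) for p in prior]
        log_conf = {
            a: [[math.log(v / sum(row)) for v in row] for row in matrix]
            for a, matrix in confusion.items()
        }

        # Hard E-step: give each item its single most likely class
        # instead of keeping a full posterior distribution.
        new_labels = {}
        for item, pairs in votes.items():
            scores = [
                log_prior[c] + sum(log_conf[a][c][l] for a, l in pairs)
                for c in range(num_classes)
            ]
            new_labels[item] = max(range(num_classes), key=lambda c: scores[c])

        # Stop once the hard labels no longer change.
        if new_labels == labels:
            break
        labels = new_labels

    return labels


if __name__ == "__main__":
    example_votes = {
        "q1": [("a", 0), ("b", 0), ("c", 1)],
        "q2": [("a", 1), ("b", 1), ("c", 1)],
        "q3": [("a", 0), ("b", 0), ("c", 0)],
        "q4": [("a", 1), ("b", 1), ("c", 0)],
    }
    print(hard_dawid_skene(example_votes, num_classes=2))
```

In this sketch the loop ends as soon as an iteration leaves every label unchanged, which is one plausible reading of how hard assignments could shorten convergence relative to tracking soft posteriors; the paper itself should be consulted for the actual algorithm and its convergence proof.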
Description
[1803.02781] Fast Dawid-Skene: A Fast Vote Aggregation Scheme for Sentiment Classification
%0 Generic
%1 sinha2018dawidskene
%A Sinha, Vaibhav B
%A Rao, Sukrut
%A Balasubramanian, Vineeth N
%D 2018
%K aggregation classification computing crowd crowdsourcing crowdworker expert human label sentiment social vote
%T Fast Dawid-Skene: A Fast Vote Aggregation Scheme for Sentiment Classification
%U http://arxiv.org/abs/1803.02781
%X Many real-world problems can now be effectively solved using supervised
machine learning. A major roadblock is often the lack of an adequate quantity
of labeled data for training. A possible solution is to assign the task of
labeling data to a crowd, and then infer the true label using aggregation
methods. A well-known approach for aggregation is the Dawid-Skene (DS)
algorithm, which is based on the principle of Expectation-Maximization (EM). We
propose a new simple, yet effective, EM-based algorithm, which can be
interpreted as a `hard' version of DS, that allows much faster convergence
while maintaining similar accuracy in aggregation. We show the use of this
algorithm as a quick and effective technique for online, real-time sentiment
annotation. We also prove that our algorithm converges to the estimated labels
at a linear rate. Our experiments on standard datasets show a significant
speedup in time taken for aggregation, up to $\sim$8x over Dawid-Skene and
$\sim$6x over other fast EM methods, at competitive accuracy. The
code for the implementation of the algorithms can be found at
https://github.com/GoodDeeds/Fast-Dawid-Skene
@misc{sinha2018dawidskene,
abstract = {Many real-world problems can now be effectively solved using supervised
machine learning. A major roadblock is often the lack of an adequate quantity
of labeled data for training. A possible solution is to assign the task of
labeling data to a crowd, and then infer the true label using aggregation
methods. A well-known approach for aggregation is the Dawid-Skene (DS)
algorithm, which is based on the principle of Expectation-Maximization (EM). We
propose a new simple, yet effective, EM-based algorithm, which can be
interpreted as a `hard' version of DS, that allows much faster convergence
while maintaining similar accuracy in aggregation. We show the use of this
algorithm as a quick and effective technique for online, real-time sentiment
annotation. We also prove that our algorithm converges to the estimated labels
at a linear rate. Our experiments on standard datasets show a significant
speedup in time taken for aggregation, up to $\sim$8x over Dawid-Skene and
$\sim$6x over other fast EM methods, at competitive accuracy. The
code for the implementation of the algorithms can be found at
https://github.com/GoodDeeds/Fast-Dawid-Skene},
added-at = {2020-10-21T11:50:06.000+0200},
author = {Sinha, Vaibhav B and Rao, Sukrut and Balasubramanian, Vineeth N},
biburl = {https://www.bibsonomy.org/bibtex/28e672879b9c2f710c124699c4156b22e/jaeschke},
description = {[1803.02781] Fast Dawid-Skene: A Fast Vote Aggregation Scheme for Sentiment Classification},
interhash = {9bd6913521b6434d60b4320597da333b},
intrahash = {8e672879b9c2f710c124699c4156b22e},
keywords = {aggregation classification computing crowd crowdsourcing crowdworker expert human label sentiment social vote},
note = {arXiv:1803.02781. Comment: 8 pages, 5 tables, 1 figure, KDD Workshop on Issues of Sentiment Discovery and Opinion Mining (WISDOM) 2018},
timestamp = {2020-10-21T11:50:06.000+0200},
title = {Fast Dawid-Skene: A Fast Vote Aggregation Scheme for Sentiment Classification},
url = {http://arxiv.org/abs/1803.02781},
year = 2018
}