Sentence Subjectivity Detection with Weakly-Supervised Learning
C. Lin, Y. He, and R. Everson. The 5th International Joint Conference on Natural Language Processing (IJCNLP), Chiang Mai, Thailand, (November 2011)
Abstract
This paper presents a hierarchical Bayesian model based on latent Dirichlet allocation (LDA), called subjLDA, for sentence-level subjectivity detection,
which automatically identifies whether a given sentence expresses opinion or states facts. In contrast to most existing methods, which rely on either
labelled corpora for classifier training or linguistic pattern extraction for subjectivity classification, we view the problem as weakly-supervised generative
model learning, where the only input to the model is a small set of domain-independent subjectivity lexical clues. A mechanism is introduced to incorporate
prior information about the subjectivity lexical clues into model learning by modifying the Dirichlet priors of the topic-word distributions. The subjLDA model has been evaluated on the Multi-Perspective Question Answering (MPQA) dataset, with promising results observed in preliminary experiments. We have also explored adding neutral words as prior information for model learning. It was found that while incorporating subjectivity clues bearing positive or negative polarity achieves a significant performance gain, the prior lexical information from neutral words is less effective.
@inproceedings{ijcnlp2011,
abstract = {This paper presents a hierarchical Bayesian model based on latent Dirichlet allocation (LDA), called subjLDA, for sentence-level subjectivity detection,
which automatically identifies whether a given sentence expresses opinion or states facts. In contrast to most existing methods, which rely on either
labelled corpora for classifier training or linguistic pattern extraction for subjectivity classification, we view the problem as weakly-supervised generative
model learning, where the only input to the model is a small set of domain-independent subjectivity lexical clues. A mechanism is introduced to incorporate
prior information about the subjectivity lexical clues into model learning by modifying the Dirichlet priors of the topic-word distributions. The subjLDA model has been evaluated on the Multi-Perspective Question Answering (MPQA) dataset, with promising results observed in preliminary experiments. We have also explored adding neutral words as prior information for model learning. It was found that while incorporating subjectivity clues bearing positive or negative polarity achieves a significant performance gain, the prior lexical information from neutral words is less effective.},
added-at = {2011-09-23T13:06:41.000+0200},
address = {Chiang Mai, Thailand},
author = {Lin, Chenghua and He, Yulan and Everson, Richard},
biburl = {https://www.bibsonomy.org/bibtex/291c68dbc0e820cef480cd5bf699f0a26/yulanhe},
booktitle = {The 5th International Joint Conference on Natural Language Processing (IJCNLP)},
interhash = {4bca588885471e1eec448ea7921ce289},
intrahash = {91c68dbc0e820cef480cd5bf699f0a26},
keywords = {myown robust-project subjectivity},
month = nov,
timestamp = {2012-11-09T11:59:01.000+0100},
title = {Sentence Subjectivity Detection with Weakly-Supervised Learning},
url = {http://aclweb.org/anthology-new/I/I11/I11-1129.pdf},
year = 2011
}