@inproceedings{yang2016using,
abstract = {Word embeddings and convolutional neural networks (CNN) have attracted
extensive attention in various classification tasks for Twitter, e.g. sentiment
classification. However, the effect of the configuration used to train and
generate the word embeddings on the classification performance has not been
studied in the existing literature. In this paper, using a Twitter election
classification task that aims to detect election-related tweets, we investigate
the impact of the background dataset used to train the embedding models, the
context window size and the dimensionality of word embeddings on the
classification performance. By comparing the classification results of two word
embedding models, which are trained using different background corpora (e.g.
Wikipedia articles and Twitter microposts), we show that the background data
type should align with the Twitter classification dataset to achieve better
performance. Moreover, by evaluating the results of word embedding models
trained using various context window sizes and dimensionalities, we find that
larger context window and dimension sizes are preferable for improving
classification performance. Our experimental results also show that using word embeddings and
CNN leads to statistically significant improvements over various baselines such
as random, SVM with TF-IDF and SVM with word embeddings.},
author = {Yang, Xiao and Macdonald, Craig and Ounis, Iadh},
biburl = {https://www.bibsonomy.org/bibtex/2bd1b364287eaddbe1e9e41740ce92b85/schwemmlein},
keywords = {cnn embeddings nlp svm twitter word},
note = {cite arXiv:1606.07006. Comment: NeuIR Workshop 2016},
title = {Using Word Embeddings in Twitter Election Classification},
url = {http://arxiv.org/abs/1606.07006},
year = 2016
}