Stance detection aims to determine the
stance that a tweet or other text expresses toward a target. These
targets can be named entities or free-form sentences (claims). Although the task requires reasoning about the tweet with respect to its target, we
find that it is possible to achieve high accuracy
on several publicly available Twitter stance detection datasets without looking at the target
sentence. Specifically, a simple tweet classification model achieved human-level performance on the WT–WT dataset and more than
two-thirds accuracy on various other datasets.
We investigate the biases in these
datasets and identify potential spurious correlations, namely sentiment-stance relations and lexical choices associated with each stance category.
Furthermore, we propose a new large dataset
free of such biases and demonstrate its suitability using existing stance detection systems.
Our empirical findings show considerable scope for
further research on stance detection and suggest several considerations for creating future stance detection datasets.
Description
tWT–WT: A Dataset to Assert the Role of Target Entities for Detecting Stance of Tweets