Abstract
citation links to improve the scientific paper
classification performance. In this approach, we
develop two refinement functions, a linear label
refinement (LLR) and a probabilistic label refinement
(PLR), that model the citation link structure of
scientific papers in order to refine the class labels
assigned to the documents by the content-based Naive Bayes
classification method. We evaluate the approach with
the two new refinement models against the
content-based Naive Bayes method on a standard paper
classification data set with increasing training set
sizes. The results suggest that both refinement models
significantly improve classification performance over
the content-based method at every training set size
and that PLR outperforms LLR when enough training
examples are available.
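
The abstract does not give the form of either refinement function, so the following is only an illustrative sketch of the general idea of refining content-based labels with citation links. It assumes a simple linear blend: each paper's class probabilities (as a content-based classifier such as Naive Bayes might produce) are mixed with the average probabilities of the papers it is linked to. The function name `refine_labels_linear`, the mixing weight `alpha`, and the toy data are all hypothetical and are not taken from the paper.

```python
def refine_labels_linear(content_probs, citations, alpha=0.5):
    """Blend each paper's content-based class probabilities with the
    average probabilities of its citation neighbors.

    NOTE: assumed illustrative form, not the paper's LLR definition.
    content_probs: {paper_id: {label: probability}}
    citations:     {paper_id: [linked paper_ids]}
    alpha:         weight given to the neighbors (hypothetical parameter)
    """
    refined = {}
    for paper, probs in content_probs.items():
        # Only neighbors that have a content-based prediction can contribute.
        neighbors = [n for n in citations.get(paper, []) if n in content_probs]
        if not neighbors:
            # No usable links: keep the content-based estimate unchanged.
            refined[paper] = dict(probs)
            continue
        refined[paper] = {
            label: (1 - alpha) * p
            + alpha * sum(content_probs[n][label] for n in neighbors) / len(neighbors)
            for label, p in probs.items()
        }
    return refined


def predict(probs):
    """Return the label with the highest probability."""
    return max(probs, key=probs.get)


# Toy data (made up): paper A is weakly classified as DB by content alone,
# but it is linked to two papers that are confidently ML.
content_probs = {
    "A": {"ML": 0.45, "DB": 0.55},
    "B": {"ML": 0.90, "DB": 0.10},
    "C": {"ML": 0.80, "DB": 0.20},
}
citations = {"A": ["B", "C"], "B": ["C"], "C": []}

refined = refine_labels_linear(content_probs, citations, alpha=0.5)
print(predict(content_probs["A"]), "->", predict(refined["A"]))
```

In this toy case the link information overturns a borderline content-based decision for paper A, which is the kind of effect a refinement step is meant to capture. A probabilistic variant would presumably replace the fixed blend with a model estimated from training data, but its details are not stated here.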