Unsupervised Extraction of Conceptual Keyphrases from Abstracts
P. Ludwig, M. Thiel, and A. Nürnberger. Semantic Keyword-Based Search on Structured Data Sources: COST Action IC1302 Second International KEYSTONE Conference (IKC 2016) Revised Selected Papers, pages 37-48. Cham, Springer International Publishing, 2017.
The extraction of meaningful keyphrases is important for a variety of applications, such as recommender systems, literature browsing tools, and automatic document categorization. Since this task is not trivial, a large number of approaches have been introduced in the past, either focusing on single aspects of the process or exploiting the characteristics of a certain type of document. Precise keyphrases are especially helpful when supporting users in grasping the topics of a document (e.g., in the display of search results). However, in such situations usually only the abstract or a short excerpt is available, which most approaches do not take into account. First, methods based on word frequency are not appropriate in this case, since short texts do not provide sufficient word statistics for a frequency analysis. Second, many existing methods are supervised and therefore depend on domain knowledge or manually annotated data, which are not available in many scenarios. We therefore present an unsupervised graph-based approach for extracting meaningful keyphrases from abstracts of scientific articles. We show that even though our method does not rely on manually annotated data or corpora, it works surprisingly well.
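To illustrate what an unsupervised graph-based extractor can look like on a short text, here is a minimal sketch. It is not the method of this paper, whose details the abstract does not give. It is a generic TextRank-style procedure: words become nodes in a co-occurrence graph, a PageRank-like iteration scores them, and runs of adjacent non-stopwords are ranked as candidate phrases. The stopword list, the window size, and all function names are assumptions made for the example.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (an assumption; real systems use a
# fuller list or part-of-speech filtering).
STOPWORDS = {
    "the", "of", "a", "an", "and", "or", "for", "in", "on", "is", "are",
    "to", "from", "we", "that", "this", "with", "as", "by", "be", "not",
    "it", "such",
}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def build_graph(tokens, window=2):
    """Undirected co-occurrence graph over non-stopword tokens."""
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        if word in STOPWORDS:
            continue
        for other in tokens[i + 1 : i + 1 + window]:
            if other not in STOPWORDS and other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Plain power iteration of the PageRank update on the word graph."""
    scores = {node: 1.0 for node in graph}
    for _ in range(iterations):
        scores = {
            node: (1 - damping)
            + damping * sum(scores[nb] / len(graph[nb]) for nb in graph[node])
            for node in graph
        }
    return scores


def extract_keyphrases(text, top_k=5, window=2):
    """Rank maximal runs of non-stopwords by the sum of their word scores."""
    tokens = tokenize(text)
    scores = pagerank(build_graph(tokens, window))
    phrases = {}
    run = []
    for tok in tokens + [None]:  # None is a sentinel that flushes the last run
        if tok is not None and tok not in STOPWORDS:
            run.append(tok)
        else:
            if run:
                phrases[" ".join(run)] = sum(scores.get(w, 0.0) for w in run)
            run = []
    ranked = sorted(phrases.items(), key=lambda kv: (-kv[1], kv[0]))
    return [phrase for phrase, _ in ranked[:top_k]]
```

Because the scores come from graph structure rather than raw counts, the procedure still yields a ranking when most words occur only once, which is the situation the abstract describes. The sketch makes two simplifications: it ignores sentence boundaries, so a candidate phrase can span a full stop, and summing word scores favors longer phrases.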