Keyphrase Extraction in Scholarly Digital Library Search Engines

K. Patel, C. Caragea, J. Wu, and C. L. Giles. Web Services -- ICWS 2020, pages 179--196. Cham, Springer International Publishing, 2020.

Abstract

Scholarly digital libraries provide access to scientific publications and comprise useful resources for researchers who search for literature on specific subject areas. CiteSeerX is an example of such a digital library search engine that provides access to more than 10 million academic documents and has nearly one million users and three million hits per day. Artificial Intelligence (AI) technologies are used in many components of CiteSeerX including Web crawling, document ingestion, and metadata extraction. CiteSeerX also uses an unsupervised algorithm called noun phrase chunking (NP-Chunking) to extract keyphrases out of documents. However, often NP-Chunking extracts many unimportant noun phrases. In this paper, we investigate and contrast three supervised keyphrase extraction models to explore their deployment in CiteSeerX for extracting high quality keyphrases. To perform user evaluations on the keyphrases predicted by different models, we integrate a voting interface into CiteSeerX. We show the development and deployment of the keyphrase extraction models and the maintenance requirements.

Description

Keyphrase Extraction in Scholarly Digital Library Search Engines | SpringerLink

Cite this publication

@inproceedings{10.1007/978-3-030-59618-7_12,
  abstract = {Scholarly digital libraries provide access to scientific publications and comprise useful resources for researchers who search for literature on specific subject areas. CiteSeerX is an example of such a digital library search engine that provides access to more than 10 million academic documents and has nearly one million users and three million hits per day. Artificial Intelligence (AI) technologies are used in many components of CiteSeerX including Web crawling, document ingestion, and metadata extraction. CiteSeerX also uses an unsupervised algorithm called noun phrase chunking (NP-Chunking) to extract keyphrases out of documents. However, often NP-Chunking extracts many unimportant noun phrases. In this paper, we investigate and contrast three supervised keyphrase extraction models to explore their deployment in CiteSeerX for extracting high quality keyphrases. To perform user evaluations on the keyphrases predicted by different models, we integrate a voting interface into CiteSeerX. We show the development and deployment of the keyphrase extraction models and the maintenance requirements.},
  added-at = {2020-09-23T19:13:08.000+0200},
  address = {Cham},
  author = {Patel, Krutarth and Caragea, Cornelia and Wu, Jian and Giles, C. Lee},
  biburl = {https://www.bibsonomy.org/bibtex/2bee9a47da90be35a466b9d445e3aca40/brusilovsky},
  booktitle = {Web Services -- ICWS 2020},
  description = {Keyphrase Extraction in Scholarly Digital Library Search Engines | SpringerLink},
  editor = {Ku, Wei-Shinn and Kanemasa, Yasuhiko and Serhani, Mohamed Adel and Zhang, Liang-Jie},
  interhash = {c7d7706142b6c6a7b7c4b858fe410a3d},
  intrahash = {bee9a47da90be35a466b9d445e3aca40},
  isbn = {978-3-030-59618-7},
  keywords = {academic-reference concept-extraction},
  pages = {179--196},
  publisher = {Springer International Publishing},
  timestamp = {2020-09-23T19:13:08.000+0200},
  title = {Keyphrase Extraction in Scholarly Digital Library Search Engines},
  year = 2020
}
