
The next generation discovery citation indexes: a review of the landscape in 2020

(November 2020)

Abstract

Among cross-disciplinary citation indexes used for discovery, everyone knows the two incumbents: Web of Science and Scopus (2004). Joined by the large, web-scale Google Scholar (2004), these three reigned as the “Big 3” of citation indexes, more or less unchallenged, for roughly a decade. However, about ten years later, from around 2015 onward, a new generation of citation indexes started to emerge and challenge the Big 3 in a variety of ways. At the time of writing in 2020, some of these challengers have had a couple of years of development. How do things look now? First, using newer techniques and paradigms, for-profit companies like Digital Science have launched products such as Dimensions (2018), which strikes me as a challenger to Scopus and Web of Science in the arena of citation and bibliometric assessment, just as Scopus itself was a challenge to the older Web of Science back in 2004. At the other end of the spectrum, we have the rise of more “open” citation indexes. A very important player in this area is the relaunched Microsoft Academic (2016), which not only uses web-crawling technologies like Google Scholar to scour the web, but also applies the latest Natural Language Processing (NLP) and “semantic” technologies and makes its dataset, dubbed Microsoft Academic Graph (MAG), available under open licenses. Semantic Scholar (2015) is yet another project with Microsoft ties (funded by the Allen Institute for AI) that plays in the same area and releases data under open licenses. One of the more “semantic” features of this search engine is that it uses machine learning to classify each citation by whether it cites background, methods or results. While scite (2018), a new citation index from a startup, does not provide open data, its selling point is the use of NLP to classify citation relationships as “Supporting”, “Disputing” or “Neutral”, which is yet another way of contextualizing research by describing citation relationships.
Besides the two well-funded projects mentioned above, we also see more grassroots movements, such as 2017's I4OC (Initiative for Open Citations), an amazingly successful push to get publishers to deposit references in Crossref and make them open, as well as efforts by OpenCitations.net (a founding member of I4OC) to extract citations from open access papers in PMC to produce the OpenCitations Corpus (OCC). Both have served to further increase the pool of scholarly metadata and citations available in the public domain/CC0.
