PhD thesis

Supporting Researchers: Analyzing the Scholarly Publication Life Cycle and Social Bookmarking Systems

University of Kassel (April 2017)

Abstract

Researchers face exponential growth in the body of available scholarly literature, which makes it ever harder to keep track of one’s own community, especially for newcomers. In this thesis, we explore different means of supporting researchers in that task. To this end, we follow two approaches. First, we provide analyses of research communities and of researchers’ interactions, using data that can be obtained from the phases in the life cycle of scholarly publications (creation, dissemination, usage, and citation in other publications). The resulting statistics and visualizations allow researchers to better understand their own communities, to identify the most important players and publications, and to find valuable conversational partners at conferences. For the analysis of publication usage and its connection to citations, we turn to social bookmarking systems and investigate the actions of users in BibSonomy. The insights gained can help operators of such systems improve them. Our second approach is more proactive: it supports researchers by pointing them directly to important publications, through automatically computed personalized recommendations and through social peer review.
The analysis of research and researchers has often relied on studying scholarly publications and their metadata. Such studies can reveal how scientific work is conducted, shed light on communities and research topics, and allow the measurement of certain forms of impact that a publication, an individual researcher, or a venue has had. The exploited data, publication metadata, is generated when publications are created. The life cycle of a scholarly publication, however, only begins with its creation: publications are disseminated (e.g., presented at conferences), they are used (e.g., acquired, stored, collected, marked as to-read, and, of course, read), and they are cited. With the advent of the Web 2.0, traces of the activities in these phases have become observable. In this thesis, we collect and analyze datasets from all four stages of the publication life cycle. We thus go beyond traditional means of scientometrics, touching on fields such as altmetrics, web log analysis, and role discovery. We not only present new insights into communities that have not been investigated before, but also demonstrate new means of analysis that generalize to other communities. Among them are formal concept analysis to visualize influences between groups of authors and social network analyses of interaction networks. Our datasets comprise, next to a traditional publication corpus containing metadata and references, a face-to-face contact network gathered from real-life interactions of researchers during a conference, and datasets from the scholarly social bookmarking system BibSonomy.
Social bookmarking services allow their users to publicly store and annotate resources, such as web links, photos, videos, or publications. As representatives of the Web 2.0, social bookmarking systems have attracted the interest of the research community. Through their central feature, the tagging of resources, users of such systems create a data structure called a folksonomy, in which users, resources, and tags are connected. The resulting network allows users to navigate between these folksonomic entities. In scholarly bookmarking systems, users store and manage publications. Such systems are thus ideal candidates for the investigation of publication usage. In this thesis, we study data from the popular system BibSonomy to address various aspects of the use of social bookmarking systems and of the resources stored therein. Moreover, we analyze the usefulness of such usage data for altmetrics by studying correlations between the usage of a publication and its citations, as well as the predictive power of usage features over future citations.
Scholarly bookmarking tools support researchers in their daily work with publications and their metadata. Still, the sheer number of available publications and its ever faster growth make it difficult to keep track of the relevant developments in one’s field of research, an instance of the information overload problem. Recommendation systems can therefore be employed to point users to particular publications using personalized ranking algorithms. Usually, such algorithms exploit information in user profiles (for instance, previously stored resources and the corresponding tags), as well as information about the similarity between entities or about their positions within the network of entities (the folksonomy), to recommend new items that the active user might find interesting. Similarly, a recommender can assist the process of tagging by suggesting suitable tags to users while they create a new post for some resource. We use the scenario of tag recommendation to thoroughly analyze the typical evaluation setup of folksonomic recommender systems, which uses so-called graph-cores. We improve the setup by introducing a new, more flexible type of core that circumvents a structural drawback of the graph-core approach. We also point to several pitfalls of using cores for benchmarking recommendation algorithms. Moreover, we employ the scenario of resource recommendation, specifically the recommendation of scholarly publications, to investigate different ways of integrating publication metadata into the popular and versatile folksonomic recommendation algorithm FolkRank.
Finally, any tool that is offered on the web must comply with the law, and its use must be socially compatible. Particularly difficult is the case of publicly visible ratings, where products are judged by users. For instance, where the resources are scholarly publications and thus the products of researchers (the authors), improper criticism may have consequences for researchers’ careers or for decisions about funding allocation. Based on requirements that have been derived from German law, we describe and discuss opportunities and risks of social web systems in which users share, debate, and rate scholarly publications.
Altogether, this thesis relies on data from the scholarly publication life cycle to gain insights into research communities and the interaction of researchers with literature. We focus on social bookmarking systems, which reveal traces of their users’ behavior and which provide a suitable tool to support researchers in their work with literature. Our contributions aim at supporting researchers in their work, both as members of their respective communities and as producers and consumers of scholarly literature.
