
Characterizing and Mining the Citation Graph of the Computer Science Literature

Y. An, J. Janssen, and E. Milios.
Knowledge and Information Systems, 6 (6): 664--678 (November 2004)
DOI: 10.1007/s10115-003-0128-3

Abstract

Citation graphs representing a body of scientific literature convey measures of scholarly activity and productivity. In this work we present a study of the structure of the citation graph of the computer science literature. Using a web robot we built several topic-specific citation graphs and their union graph from the digital library ResearchIndex. After verifying that the degree distributions follow a power law, we applied a series of graph theoretical algorithms to elicit an aggregate picture of the citation graph in terms of its connectivity. We discovered the existence of a single large weakly-connected and a single large biconnected component, and confirmed the expected lack of a large strongly-connected component. The large components remained even after removing the strongest authority nodes or the strongest hub nodes, indicating that such tight connectivity is widespread and does not depend on a small subset of important nodes. Finally, minimum cuts between authority papers of different areas did not result in a balanced partitioning of the graph into areas, pointing to the need for more sophisticated algorithms for clustering the graph.
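The connectivity findings in the abstract (one large weakly-connected component, no large strongly-connected component) can be illustrated on a toy graph. The sketch below is not the authors' code: the function names, the `EDGES` list, and the paper labels are hypothetical, and an edge `(u, v)` is assumed to mean "paper u cites paper v". Because papers mostly cite earlier work, citation edges rarely form directed cycles, which is why strongly-connected components tend to stay small.

```python
from collections import defaultdict, deque


def weak_components(edges):
    """Connected components when edge direction is ignored (BFS)."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        comp, queue = set(), deque([start])
        while queue:
            node = queue.popleft()
            comp.add(node)
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        comps.append(comp)
    return comps


def strong_components(edges):
    """Strongly-connected components via Kosaraju's two-pass algorithm."""
    fwd, rev, nodes = defaultdict(list), defaultdict(list), set()
    for u, v in edges:
        fwd[u].append(v)
        rev[v].append(u)
        nodes.update((u, v))
    # Pass 1: record nodes in order of DFS completion on the forward graph.
    order, seen = [], set()
    for s in sorted(nodes):
        if s in seen:
            continue
        seen.add(s)
        stack = [(s, iter(fwd[s]))]
        while stack:
            node, it = stack[-1]
            for nxt in it:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(fwd[nxt])))
                    break
            else:
                order.append(node)
                stack.pop()
    # Pass 2: DFS on the reversed graph in reverse completion order.
    seen, comps = set(), []
    for s in reversed(order):
        if s in seen:
            continue
        seen.add(s)
        comp, stack = set(), [s]
        while stack:
            node = stack.pop()
            comp.add(node)
            for nxt in rev[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        comps.append(comp)
    return comps


# Hypothetical toy data: B, C, D cite earlier papers; E cites F separately.
EDGES = [("B", "A"), ("C", "A"), ("C", "B"), ("D", "C"), ("E", "F")]
largest_weak = max(len(c) for c in weak_components(EDGES))
largest_strong = max(len(c) for c in strong_components(EDGES))
```

On this acyclic toy graph the largest weakly-connected component spans four papers, while every strongly-connected component is a single paper.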

BibTeX key: an2004characterizing
entry type: article
address: London
year: 2004
month: nov
journal: Knowledge and Information Systems
number: 6
pages: 664--678
publisher: Springer
volume: 6
issn: 0219-1377
acmid: 1031388
issue: 6
numpages: 15
DOI: 10.1007/s10115-003-0128-3
url: http://dx.doi.org/10.1007/s10115-003-0128-3

Comments and Reviews

@jaeschke 13 years ago
The reported power-law exponent of 1.7 (Fig. 3) seems to be a bit low. [1] reports a value of 3, and [2] states that the exponent α "typically lies in the range 2 < α < 3". I suspect that the authors provide the exponent for the cumulative distribution, which, according to [2], differs from the "real" exponent by one. Hence, the exponent found in this work might be 2.7, which is closer to prior observations.

[1] S. Redner. How popular is your paper? An empirical study of the citation distribution. European Physical Journal B 4(2):131--134 (August 1998). http://www.bibsonomy.org/bibtex/2e64d14f3207766f4afc65983fa759ffe/jaeschke
[2] Aaron Clauset, Cosma Rohilla Shalizi, and M. E. J. Newman. Power-Law Distributions in Empirical Data. SIAM Review 51(4):661--703 (2009). http://www.bibsonomy.org/bibtex/2c0097d202655474b1db6811ddea03410/jaeschke
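The "differs by one" step in the comment above can be made explicit with a short derivation. It assumes a continuous power-law density with exponent α > 1 above some lower cutoff x_min; the symbols C and x_min are introduced here for illustration and do not come from the paper.

```latex
p(x) = C\,x^{-\alpha}, \qquad x \ge x_{\min},\ \alpha > 1
\\[4pt]
P(X \ge x) = \int_{x}^{\infty} C\,t^{-\alpha}\,dt
           = \frac{C}{\alpha - 1}\,x^{-(\alpha - 1)}
```

The cumulative distribution therefore decays with exponent α - 1. If the reported 1.7 is indeed a cumulative slope, as the comment suspects, then α - 1 = 1.7 gives α = 2.7.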

Cite this publication

@article{an2004characterizing,
  abstract = {Citation graphs representing a body of scientific literature convey measures of scholarly activity and productivity. In this work we present a study of the structure of the citation graph of the computer science literature. Using a web robot we built several topic-specific citation graphs and their union graph from the digital library ResearchIndex. After verifying that the degree distributions follow a power law, we applied a series of graph theoretical algorithms to elicit an aggregate picture of the citation graph in terms of its connectivity. We discovered the existence of a single large weakly-connected and a single large biconnected component, and confirmed the expected lack of a large strongly-connected component. The large components remained even after removing the strongest authority nodes or the strongest hub nodes, indicating that such tight connectivity is widespread and does not depend on a small subset of important nodes. Finally, minimum cuts between authority papers of different areas did not result in a balanced partitioning of the graph into areas, pointing to the need for more sophisticated algorithms for clustering the graph.},
  acmid = {1031388},
  added-at = {2011-12-21T22:49:38.000+0100},
  address = {London},
  author = {An, Yuan and Janssen, Jeannette and Milios, Evangelos E.},
  biburl = {https://www.bibsonomy.org/bibtex/22fe1a8e5fdeb537973491ad334acb0ea/jaeschke},
  doi = {10.1007/s10115-003-0128-3},
  interhash = {73fdd0592c1641d05da5d2323d9f59ae},
  intrahash = {2fe1a8e5fdeb537973491ad334acb0ea},
  issn = {0219-1377},
  issue = {6},
  journal = {Knowledge and Information Systems},
  keywords = {},
  month = nov,
  number = 6,
  numpages = {15},
  pages = {664--678},
  publisher = {Springer},
  timestamp = {2011-12-21T22:49:38.000+0100},
  title = {Characterizing and Mining the Citation Graph of the Computer Science Literature},
  url = {http://dx.doi.org/10.1007/s10115-003-0128-3},
  volume = 6,
  year = 2004
}
