Mining citation information from CiteSeer data

The CiteSeer digital library is a useful source of bibliographic information. It allows retrieval of citations, co-authorships, addresses, and affiliations of authors and publications. Despite this, it has rarely been used for automated citation analyses. This article describes our findings from extensive mining of the CiteSeer data. We explored citations between authors and determined rankings of influential scientists using various evaluation methods, including citation and in-degree counts, HITS, PageRank, and its variations, based on both the citation and collaboration graphs. We compare the resulting rankings with lists of computer science award winners and find that award recipients are almost always ranked highly. We conclude that CiteSeer is a valuable, yet not fully appreciated, repository of citation data and is appropriate for testing novel bibliometric methods.
The available full text is a preprint of the article.
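
As a rough illustration of two of the ranking methods named in the abstract, the following minimal sketch computes in-degree counts and a basic PageRank on a toy author citation graph. It is not the article's implementation: the author names and edges are invented for illustration, no CiteSeer data is used, and the damping factor of 0.85 and the even redistribution of rank from authors who cite nobody are common conventions assumed here rather than details taken from the article.

```python
def in_degree(edges):
    """Count the distinct citing authors of each cited author."""
    citers = {}
    for citing, cited in edges:
        citers.setdefault(cited, set()).add(citing)
    return {author: len(sources) for author, sources in citers.items()}


def pagerank(edges, damping=0.85, iterations=100):
    """Basic PageRank by power iteration on a directed citation graph."""
    nodes = sorted({name for edge in edges for name in edge})
    out_links = {name: set() for name in nodes}
    for citing, cited in edges:
        out_links[citing].add(cited)

    n = len(nodes)
    rank = {name: 1.0 / n for name in nodes}
    for _ in range(iterations):
        # Rank held by authors who cite nobody is spread evenly over all authors.
        dangling = sum(rank[name] for name in nodes if not out_links[name])
        new_rank = {name: (1 - damping) / n + damping * dangling / n
                    for name in nodes}
        for citing, targets in out_links.items():
            if not targets:
                continue
            share = damping * rank[citing] / len(targets)
            for cited in targets:
                new_rank[cited] += share
        rank = new_rank
    return rank


# Hypothetical (citing author, cited author) pairs; not real CiteSeer records.
edges = [
    ("alice", "carol"),
    ("alice", "dave"),
    ("bob", "carol"),
    ("carol", "dave"),
]

degrees = in_degree(edges)
ranks = pagerank(edges)
for author in sorted(ranks, key=ranks.get, reverse=True):
    print(author, degrees.get(author, 0), round(ranks[author], 3))
```

In this toy graph, "carol" and "dave" tie on in-degree, but PageRank places "dave" first because one of his citations comes from the well-cited "carol". This is the kind of difference between counting-based and recursive methods that a comparison of rankings would surface.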

Keywords: CiteSeer, citation analysis, rankings, evaluation.

Year: 2011

Journal ISSN: 0138-9130