A System for Citations Retrieval on the Web

A fundamental characteristic of a research paper is how many times it is cited in other articles, i.e. how many later references to it there are. This is the only objective way of evaluating how important or novel a paper's ideas are. With an increasing number of articles available online, it has become possible to find these citations in a more or less automated way. This thesis first describes existing approaches to citation retrieval and indexing and then introduces CiteSeeker, a tool for fully automated citation retrieval. CiteSeeker crawls the World Wide Web from given starting points and searches for specified authors and publications in a fuzzy manner, meaning that it tolerates certain inaccuracies in the search strings. CiteSeeker handles all common Internet file formats, including PostScript and PDF documents and archives. The project is built on .NET technology.
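To illustrate what a fuzzy search that tolerates inaccuracies might look like, here is a minimal sketch of approximate substring matching based on edit distance. It is not taken from the thesis: the abstract does not say which matching method CiteSeeker uses, the sketch is in Python rather than the project's C#, and the function name `fuzzy_find` and the `max_errors` parameter are hypothetical.

```python
def fuzzy_find(pattern: str, text: str, max_errors: int) -> list:
    """Return 1-based end positions in `text` where `pattern` occurs with at
    most `max_errors` edits (insertions, deletions, substitutions).

    Illustrative assumption only; not CiteSeeker's actual algorithm.
    """
    pattern = pattern.lower()
    text = text.lower()
    m = len(pattern)
    # prev[i] = smallest edit distance between pattern[:i] and any substring
    # of the text ending at the previous position.
    prev = list(range(m + 1))
    hits = []
    for j, ch in enumerate(text, start=1):
        # curr[0] stays 0 so that a match may start at any text position.
        curr = [0] * (m + 1)
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            curr[i] = min(
                prev[i - 1] + cost,  # match or substitution
                prev[i] + 1,         # extra character in the text
                curr[i - 1] + 1,     # pattern character missing from the text
            )
        if curr[m] <= max_errors:
            hits.append(j)
        prev = curr
    return hits


if __name__ == "__main__":
    # A misspelled author name is still found when one error is allowed.
    print(fuzzy_find("Levenshtein", "see Levenstein 1966", max_errors=1))
```

Setting `max_errors` to 0 reduces this to an exact, case-insensitive search; larger values find more misspelled citations at the cost of more false matches.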

Keywords: Citations, Retrieval, Web, Fuzzy Search, .NET, C#

Year: 2003
