Distributed Indexing of Large-Scale Web Collections

Abstract

Sidra is a new indexing and ranking system for large-scale Web collections. It creates multiple distributed indexes, organized and partitioned by different ranking criteria, to support contextualized queries over hypertexts and their metadata. This paper presents the architecture of Sidra and the algorithms used to create its indexes. Performance measurements on Portuguese Web data show that Sidra's indexing times and scalability are comparable to those of global Web search engines.

BibTeX key: costa2005distributed
entry type: article
year: 2005
month: mar
journal: IEEE Latin America Transactions
number: 1
pages: 2-8
volume: 3
issn: 1548-0992
DOI: 10.1109/TLA.2005.1468656
url: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1468656
