Article

Distributed Indexing of Large-Scale Web Collections

IEEE Latin America Transactions, 3 (1): 2-8 (March 2005)
DOI: 10.1109/TLA.2005.1468656

Abstract

Sidra is a new indexing and ranking system for large-scale Web collections. Sidra creates multiple distributed indexes, organized and partitioned by different ranking criteria, to support contextualized queries over hypertexts and their metadata. This paper presents the architecture of Sidra and the algorithms used to create its indexes. Performance measurements on Portuguese Web data show that Sidra's indexing times and scalability are comparable to those of global Web search engines.
