AggregateRank: Bringing order to web sites

Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '06, ACM Press, (2006)
DOI: 10.1145/1148170.1148187

Abstract

Since the website is one of the most important organizational structures of the Web, how to effectively rank websites has been essential to many Web applications, such as Web search and crawling. In order to get the ranks of websites, researchers used to describe the inter-connectivity among websites with a so-called HostGraph in which the nodes denote websites and the edges denote linkages between websites (if and only if there are hyperlinks from the pages in one website to the pages in the other, there will be an edge between these two websites), and then adopted the random walk model in the HostGraph. However, as pointed out in this paper, the random walk over such a HostGraph is not reasonable because it is not in accordance with the browsing behavior of web surfers. Therefore, the derived rank cannot represent the true probability of visiting the corresponding website. In this work, we mathematically proved that the probability of visiting a website by the random web surfer should be equal to the sum of the PageRank values of the pages inside that website. Nevertheless, since the number of web pages is much larger than that of websites, it is not feasible to base the calculation of the ranks of websites on the calculation of PageRank. To tackle this problem, we proposed a novel method named AggregateRank rooted in the theory of stochastic complement, which can not only approximate the sum of PageRank accurately, but also has a lower computational complexity than PageRank. Both theoretical analysis and experimental evaluation show that AggregateRank is a better method for ranking websites than previous methods.
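
The quantity the abstract refers to, the sum of the PageRank values of the pages inside a website, can be illustrated on a toy graph. The sketch below is not AggregateRank itself (the stochastic-complement approximation is not reproduced here); it computes the exact page-level sum that AggregateRank is said to approximate. The page names, the damping factor of 0.85, the uniform handling of dangling pages, and the helper names pagerank and site_rank are all assumptions made for illustration.

```python
from collections import defaultdict


def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank over a dict {page: [pages it links to]}.

    Assumes every link target also appears as a key in `links`.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by dangling pages (no outlinks) is spread uniformly.
        dangling = sum(rank[p] for p in pages if not links[p])
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for p in pages:
            for q in links[p]:
                new_rank[q] += damping * rank[p] / len(links[p])
        rank = new_rank
    return rank


def site_rank(page_rank, site_of):
    """Sum page-level PageRank values per website."""
    totals = defaultdict(float)
    for page, score in page_rank.items():
        totals[site_of[page]] += score
    return dict(totals)


# Hypothetical toy web: two sites, five pages.
links = {
    "a.com/1": ["a.com/2", "b.org/1"],
    "a.com/2": ["a.com/1"],
    "a.com/3": ["a.com/1"],
    "b.org/1": ["b.org/2"],
    "b.org/2": ["a.com/1"],
}
site_of = {page: page.split("/")[0] for page in links}

ranks = pagerank(links)
sites = site_rank(ranks, site_of)
for site, score in sorted(sites.items()):
    print(site, round(score, 4))
```

Because this route needs the full page-level PageRank first, its cost grows with the number of pages rather than the number of websites, which is the feasibility problem the abstract raises.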

Description

AggregateRank | Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
