Techreport,

Fast shortest path distance estimation in large networks

, , , and .
(August 2008)

Abstract

We study the problem of preprocessing a large graph so that single-pair shortest-path queries can be answered quickly at runtime. Computing shortest paths is a well studied problem, but exact algorithms do not scale well to real-world huge graphs in applications that require very short response time.We focus on approximate methods for distance estimation, in particular in landmarks-based distance indexing. This approach involves choosing some nodes as landmarks and computing (off-line), for each node in the graph its embedding, i.e., the vector of its distances from all the landmarks. At runtime, when the distance between a pair of nodes is needed, we quickly estimate it by combining the embeddings of the two nodes. Given a budget of memory for the index, which translates directly into a budget of landmarks, different landmark selection strategies can yield dramatically different results in terms of accuracy. We develop and experimentally compare a large number of simple heuristics that scale well to large graphs. For a given target accuracy, our techniques require far less space than random landmark selection. We demonstrate the robustness of our techniques experimentally using five different real world graphs having tens of millions of edges.We study an application of our method in two problems arising naturally in large-scale graphs, social search and community detection.

Tags

Users

  • @chato

Comments and Reviews