@s_nkeha

A Measure of Similarity Between Graph Vertices: Applications to Synonym Extraction and Web Searching

, , , , and . SIAM Rev., 46 (4): 647--666 (April 2004)
DOI: 10.1137/S0036144502415960

Abstract

We introduce a concept of similarity between vertices of directed graphs. Let G<font size="-1">A and G<font size="-1">B be two directed graphs with, respectively, n<font size="-1">A and n<font size="-1">B vertices. We define an n<font size="-1">B n<font size="-1">A similarity matrix S whose real entry s<font size="-1">ij expresses how similar vertex j (in G<font size="-1">A) is to vertex i (in G<font size="-1">B): we say that s<font size="-1">ij is their similarity score. The similarity matrix can be obtained as the limit of the normalized even iterates of S<font size="-1">k+1 = BS<font size="-1">kA<font size="-1">T + B<font size="-1">TS<font size="-1">kA, where A and B are adjacency matrices of the graphs and S<font size="-1">0 is a matrix whose entries are all equal to 1. In the special case where G<font size="-1">A = G<font size="-1">B = G, the matrix S is square and the score s<font size="-1">ij is the similarity score between the vertices i and j of G. We point out that Kleinberg's "hub and authority" method to identify web-pages relevant to a given query can be viewed as a special case of our definition in the case where one of the graphs has two vertices and a unique directed edge between them. In analogy to Kleinberg, we show that our similarity scores are given by the components of a dominant eigenvector of a nonnegative matrix. Potential applications of our similarity concept are numerous. We illustrate an application for the automatic extraction of synonyms in a monolingual dictionary.

Description

A Measure of Similarity between Graph Vertices

Links and resources

Tags

community

  • @mkroell
  • @cabird
  • @arath
  • @dblp
  • @s_nkeha
  • @folke
  • @dbenz
@s_nkeha's tags highlighted