BibSonomy :: bibtex  ::

tag user group author concept BibTeX key search:all search:tmalsburg
A blue social bookmark and publication sharing system.
tags · relations · groups · popular
help · blog · about
login · register
tmalsburg's BibTeX entry:  

A Bit-Vector Algorithm for Computing Levenshtein and Damerau Edit Distances.

Nord. J. Comput., 10(1): 29-39, 2003.
Authors: Heikki Hyyrö
URL: http://dblp.uni-trier.de/db/journals/njc/njc10.html#Hyyro03
Description: about Levenshtein and Damerau editdistanz from dblp
Tags: B_scanpathsimilarity algorithm article editdistance stringsimilarity toread
Abstract: The edit distance between strings A and B is defined as the minimum number of edit operations needed in converting A into B or vice versa. The Levenshtein edit distance allows three types of operations: an insertion, a deletion or a substitution of a character. The Damerau edit distance allows the previous three plus in addition a transposition between two adjacent characters. To our best knowledge the best current practical algorithms for computing these edit distances run in time O(dm) and O(⌈m/w⌉(n + σ)), where d is the edit distance between the two strings, m and n are their lengths (m ≤ n), w is the computer word size and σ is the size of the alphabet. In this paper we present an algorithm that runs in time O(⌈d/w⌉m + ⌈n/w⌉σ) or O(⌈d/w⌉n + ⌈m/w⌉σ). The structure of the algorithm is such, that in practice it is mostly suitable for testing whether the edit distance between two strings is within some pre-determined error threshold. We also present some initial test results with thresholded edit distance computation. In them our algorithm works faster than the original algorithm of Myers.
| URL | BibTeX  
@article{journals/njc/Hyyro03,
title = {A Bit-Vector Algorithm for Computing Levenshtein and Damerau Edit Distances.},
author = {Heikki Hyyrö},
journal = {Nord. J. Comput.},
number = {1},
pages = {29-39},
url = {http://dblp.uni-trier.de/db/journals/njc/njc10.html#Hyyro03},
volume = {10},
year = {2003},
description = {about Levenshtein and Damerau editdistanz from dblp},
abstract = {The edit distance between strings A and B is defined as the minimum number of edit operations needed in converting A into B or vice versa. The Levenshtein edit distance allows three types of operations: an insertion, a deletion or a substitution of a character. The Damerau edit distance allows the previous three plus in addition a transposition between two adjacent characters. To our best knowledge the best current practical algorithms for computing these edit distances run in time O(dm) and O(⌈m/w⌉(n + σ)), where d is the edit distance between the two strings, m and n are their lengths (m ≤ n), w is the computer word size and σ is the size of the alphabet. In this paper we present an algorithm that runs in time O(⌈d/w⌉m + ⌈n/w⌉σ) or O(⌈d/w⌉n + ⌈m/w⌉σ). The structure of the algorithm is such, that in practice it is mostly suitable for testing whether the edit distance between two strings is within some pre-determined error threshold. We also present some initial test results with thresholded edit distance computation. In them our algorithm works faster than the original algorithm of Myers. },
ee = {http://www.cs.helsinki.fi/njc/References/bollobas2001:409.html},
keywords = {B_scanpathsimilarity algorithm article editdistance stringsimilarity toread }
}