Function File: [dist, L] = editdistance (str1, str2)
Function File: [dist, L] = editdistance (str1, str2, weights)
Function File: [dist, L] = editdistance (str1, str2, weights, modus)

Compute the Levenshtein edit distance between the two strings.

The optional argument weights specifies weights for the deletion, matched, and insertion operations; by default it is set to +1, 0, +1 respectively, so that a least editdistance means a closer match between the two strings. This function implements the Levenshtein edit distance as presented in Wikipedia article, accessed Nov 2006. Also the levenshtein edit distance of a string with the empty string is defined to be its length.

For the special case that there are no weights given and the array L is not requested, an algorithm of Berghel and Roach, which improves an algorithm introduced by Ukkonen in 1985, will be applied. This algorithm is significantly faster most of the times. Its main strength lies in cases with small edit distances, where huge speedups and memory savings are suspectible. The time (and space) complexity is O(((dist^2 - (n - m)^2)/2) + dist), where n and m denote the length of both strings.

The optional argument modus specifies the algorithm to be used. For modus = 0, Berghel and Roach’s algorithm will be used whenever possible. For modus = 1, the classic algorithm by Fisher and Wagner will be used. If L is omitted, and modus = 1, a variant of Fisher and Wagner’s algorithm using only a linear amount of memory with respect to the input length, but O(m*n) runtime, will be used. Again, n and m denote the length of both strings.

The default return value dist is the edit distance, and the other return value L is the distance matrix.

editdistance ('marry', 'marie') 
  ⇒  2

Package: strings