Levenshtein
- class py_stringmatching.similarity_measure.levenshtein.Levenshtein
Computes the Levenshtein measure (also known as edit distance).
The Levenshtein distance is the minimum cost of transforming one string into the other. A transformation is a sequence of the following operations: delete a character, insert a character, or substitute one character for another.
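The definition above can be illustrated with the classic dynamic-programming recurrence. This is a minimal sketch, not the library's implementation: the function name `levenshtein_distance` is hypothetical, and it assumes every delete, insert, and substitute costs 1, which is consistent with the example values documented for get_raw_score.

```python
def levenshtein_distance(s: str, t: str) -> int:
    """Unit-cost edit distance (illustrative sketch, not py_stringmatching code)."""
    if not isinstance(s, str) or not isinstance(t, str):
        raise TypeError("both inputs must be strings")

    # previous[j] = distance between the prefix of s processed so far
    # and the first j characters of t. Row 0 is "insert j characters".
    previous = list(range(len(t) + 1))

    for i, cs in enumerate(s, start=1):
        # Turning the first i characters of s into "" takes i deletions.
        current = [i]
        for j, ct in enumerate(t, start=1):
            current.append(min(
                previous[j] + 1,               # delete cs
                current[j - 1] + 1,            # insert ct
                previous[j - 1] + (cs != ct),  # substitute, or free match
            ))
        previous = current

    return previous[-1]


print(levenshtein_distance("example", "samples"))
```

Keeping only two rows at a time holds memory at O(len(t)) while the running time stays O(len(s) * len(t)).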
- get_raw_score(string1, string2)
Computes the raw Levenshtein distance between two strings.
- Parameters
string1 (str) – First input string.
string2 (str) – Second input string.
- Returns
Levenshtein distance (int).
- Raises
TypeError – If the inputs are not strings.
Examples
>>> lev = Levenshtein()
>>> lev.get_raw_score('a', '')
1
>>> lev.get_raw_score('example', 'samples')
3
>>> lev.get_raw_score('levenshtein', 'frankenstein')
6
- get_sim_score(string1, string2)
Computes the normalized Levenshtein similarity score between two strings.
- Parameters
string1 (str) – First input string.
string2 (str) – Second input string.
- Returns
Normalized Levenshtein similarity (float).
- Raises
TypeError – If the inputs are not strings.
Examples
>>> lev = Levenshtein()
>>> lev.get_sim_score('a', '')
0.0
>>> lev.get_sim_score('example', 'samples')
0.5714285714285714
>>> lev.get_sim_score('levenshtein', 'frankenstein')
0.5
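The page does not state the normalization formula. The example values are consistent with 1 - distance / max(len(string1), len(string2)), so the sketch below assumes that formula. The names `edit_distance` and `normalized_similarity` are hypothetical, and returning 1.0 for two empty strings is also an assumption made here to avoid dividing by zero.

```python
def edit_distance(s: str, t: str) -> int:
    """Unit-cost Levenshtein distance, single-row variant (illustrative)."""
    row = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        # diag carries the value from the previous row, one column to the left.
        diag, row[0] = row[0], i
        for j, ct in enumerate(t, start=1):
            diag, row[j] = row[j], min(
                row[j] + 1,         # delete cs
                row[j - 1] + 1,     # insert ct
                diag + (cs != ct),  # substitute, or free match
            )
    return row[-1]


def normalized_similarity(s: str, t: str) -> float:
    """Assumed normalization: 1 - distance / length of the longer string."""
    if not isinstance(s, str) or not isinstance(t, str):
        raise TypeError("both inputs must be strings")
    longest = max(len(s), len(t))
    if longest == 0:
        return 1.0  # assumption: two empty strings count as identical
    return 1.0 - edit_distance(s, t) / longest


print(normalized_similarity("levenshtein", "frankenstein"))
```

Dividing by the longer length keeps the score in [0, 1], because the distance between two strings can never exceed the length of the longer one.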