Hacker News new | ask | show | jobs
by grantwu 2234 days ago
Can you point to a source that defines Levenstein distance as only referring to bitstreams?

A translation of the original article [1] that introduced the concept notes in a footnote that "the definitions given below are also meaningful if the code is taken to mean an arbitrary set of words (possibly of different lengths) in some alphabet containing r letters (r >= 2)".

And if you wish to strictly stick to how it was originally defined, you'd need to only use strings of the same length.

More recent sources [2] say instead "over some alphabet", and even in the first footnote, describe results for "arbitrarily large alphabets"!

[1] https://nymity.ch/sybilhunting/pdf/Levenshtein1966a.pdf

[2] https://arxiv.org/pdf/1005.4033.pdf

1 comments

And Unicode is the biggest alphabet haha.