FYI: I did comprehensive testing of the character-alignment packages available for Python roughly five years ago, and the most important result was that you absolutely cannot trust them; you need to test them thoroughly. Issues to check:
- robustness (crashes on rare inputs or large problems)
- stack limitations (e.g. a poor recursive implementation)
- average/benign-case performance
- worst-case memory and time complexity
- whether or not you can get a coarse, cheap estimate first, or pass some upper limit on the distance
- whether the actual alignment is returned, or just the score
- whether multiple results are returned, or just the single best
- whether combining codepoints are glued to their base characters (i.e. treated as grapheme clusters) or treated as independent, both in the alignment and in the distance metric
- whether, and how efficiently and in parallel, 1:n or even n:m variants are implemented
- whether custom weights or cost functions can be passed in (though we could always make that a post-processing step, as long as we get the alignment path itself)
Tools besides rapidfuzz and jellyfish that you might want to list:
In the QA specs we claim to provide a recommendation for the edit-distance algorithm to be used.
For this we have to