Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance of Damerau Levenshtein is low #51

Closed
nanthony007 opened this issue Nov 28, 2022 · 3 comments
Closed

Performance of Damerau Levenshtein is low #51

nanthony007 opened this issue Nov 28, 2022 · 3 comments

Comments

@nanthony007
Copy link

nanthony007 commented Nov 28, 2022

I am unsure as to the cause of this. I know the DL algorithm is significantly more complex than the OSA algorithm but some crude benchmarks show the results below when simply comparing "alcohol" and "acloholism".

I just wanted to post this to ensure the algorithm has the correct implementation given this package's large usage rate.

Times below are average benchmarks in nanoseconds.

CleanShot 2022-11-28 at 09 29 52@2x

@maxbachmann
Copy link
Member

9d0a950 improves the performance significantly. E.g. for your strings I do now get the following runtimes on my machine:

test benches::bench_damerau_levenshtein            ... bench:         219 ns/iter (+/- 8)
test benches::bench_jaro_winkler                   ... bench:          63 ns/iter (+/- 1)
test benches::bench_levenshtein                    ... bench:          87 ns/iter (+/- 1)
test benches::bench_osa_distance                   ... bench:         136 ns/iter (+/- 3)
test benches::bench_sorensen_dice                  ... bench:         365 ns/iter (+/- 8)

while previously I would get:

test benches::bench_damerau_levenshtein            ... bench:         735 ns/iter (+/- 15)

@nanthony007
Copy link
Author

nanthony007 commented Jan 4, 2024

This is awesome thank you! Much more in line with what I was expecting perf wise. Will there be a new crates.io release for the functionality you've been adding?

@maxbachmann
Copy link
Member

Yes there will be a new release soon. Probably once we have #58 merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants