Improve dec_digits performance and benchmarking methodology #250
Note to reviewers: This pull request consists of 2 commits, so it is easier to review them separately.
The first commit changes how dec_digits is benchmarked (i.e. using randomly generated numbers).
In this commit, the benchmarks use K=7. This means the inputs are random numbers with 1 to 7 digits, i.e. in the range [1, 10^7). I chose this range because I think very large numbers (more than 7 digits) are much less common in production.
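For readers who want to picture the input generation, here is a minimal sketch of producing benchmark inputs in [1, 10^K). It is not the code from the commit: the helper names `pow10_u64` and `make_inputs` are hypothetical, and drawing values uniformly over the whole range (rather than uniformly per digit count) is an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: returns 10^k for 0 <= k <= 19 (10^19 still fits in uint64_t).
std::uint64_t pow10_u64(int k) {
    std::uint64_t p = 1;
    for (int i = 0; i < k; ++i) p *= 10;
    return p;
}

// Hypothetical helper: n values drawn uniformly from [1, 10^k), with a fixed
// seed so that every benchmarked function sees the same inputs.
std::vector<std::uint64_t> make_inputs(int k, std::size_t n, std::uint64_t seed) {
    std::mt19937_64 rng(seed);
    std::uniform_int_distribution<std::uint64_t> dist(1, pow10_u64(k) - 1);
    std::vector<std::uint64_t> values(n);
    for (auto& v : values) v = dist(rng);
    return values;
}
```

Note that with a uniform draw over [1, 10^7), about 90% of the values have exactly 7 digits, so the choice of distribution can noticeably affect how branch-heavy implementations score.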
BEFORE commit (baseline results):
Note that the results indicate dec_digits_less is faster than dec_digits_lib by 1.05x (i.e. 5%).
AFTER commit (K=7):
Now, the benchmark shows dec_digits_lib is faster than dec_digits_less by 1.3x when dealing with integers up to 10^7.
AFTER commit (K=19):
I am adding the K=19 benchmark results here in case anyone is curious how dec_digits performs across all digit counts possible for int64.
When dealing with all integers up to 10^19, dec_digits_less takes the lead again: it is 10% faster than its rival.
The second commit provides a faster implementation of dec_digits along with support for negative integers.
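The implementation itself is not reproduced in this description. As an illustration only, one well-known way to count decimal digits without a chain of comparisons is to estimate log10 from the bit width and correct it with a single lookup in a power-of-ten table. The sketch below assumes that technique, not necessarily the one used in the commit; the names `dec_digits_u64` and `dec_digits_i64` are hypothetical (separate names are used instead of overloads to avoid ambiguous calls with plain `int` literals), `__builtin_clzll` assumes GCC or Clang, and the signed version counts digits only, without the minus sign.

```cpp
#include <cstdint>

// Sketch: digit count via bit width plus one table comparison.
inline int dec_digits_u64(std::uint64_t x) {
    static constexpr std::uint64_t kPow10[20] = {
        1ULL, 10ULL, 100ULL, 1000ULL, 10000ULL, 100000ULL, 1000000ULL,
        10000000ULL, 100000000ULL, 1000000000ULL, 10000000000ULL,
        100000000000ULL, 1000000000000ULL, 10000000000000ULL,
        100000000000000ULL, 1000000000000000ULL, 10000000000000000ULL,
        100000000000000000ULL, 1000000000000000000ULL,
        10000000000000000000ULL};
    // Treat 0 as 1 so it reports one digit and clz is never called on 0.
    // Setting the low bit cannot move any other value across a power of ten,
    // because every power of ten above 1 is even.
    const std::uint64_t y = x | 1;
    const int bits = 64 - __builtin_clzll(y);   // bit width, 1..64
    const int t = (bits * 1233) >> 12;          // floor(bits * log10(2)) for bits <= 64
    return t + (y >= kPow10[t] ? 1 : 0);
}

// Sketch: signed variant; the magnitude is computed in unsigned arithmetic
// so that INT64_MIN does not overflow.
inline int dec_digits_i64(std::int64_t x) {
    const std::uint64_t ux = static_cast<std::uint64_t>(x);
    const std::uint64_t mag = x < 0 ? 0 - ux : ux;
    return dec_digits_u64(mag);
}
```

The multiplier 1233/4096 is a fixed-point approximation of log10(2); the table comparison fixes the cases where the estimate is one too low.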
Results with new implementation (K=7):
The previous dec_digits_lib is now dec_digits_branch.
The new dec_digits unsigned overload is about 9.6x faster than the previous implementation.
The signed overload is about 7.6x faster.
Results with new implementation (K=19):
The unsigned overload is about 16x faster.
The signed overload is about 12.8x faster.