test: refactor BTreeMap benches to improve coverage and readability #285
base: main
Conversation
This is a big change that includes a few things. I wonder if we should try to at least separate the refactoring of the benchmarks from adding the script (I understand the script helps, but it's not strictly necessary for the refactoring of the benchmarks).
Re the refactoring: I wonder if we should be more aggressive about which set of benchmarks we want to keep. It can be argued that V1 is the old version of BTreeMap that ideally people shouldn't use anymore (of course we need to keep it for backward compatibility, but I think as long as it works correctly, we don't necessarily need to make sure it stays performant).
The V2 version should cover all needs (from unbounded to bounded types, through the use of the right Bound type). So maybe we can live without the V1 benchmarks, and that should reduce their number by a lot.
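For context, here is a minimal sketch of the kind of coverage V2 gives, assuming the ic-stable-structures Storable/Bound API (roughly as of the 0.6 line); the FixedKey type is hypothetical and only illustrates a bounded, fixed-size key next to an unbounded value:

```rust
use std::borrow::Cow;

use ic_stable_structures::{storable::Bound, DefaultMemoryImpl, StableBTreeMap, Storable};

// Hypothetical fixed-size key type, used only to illustrate a bounded `Bound`.
#[derive(Clone, PartialEq, Eq, PartialOrd, Ord)]
struct FixedKey([u8; 8]);

impl Storable for FixedKey {
    fn to_bytes(&self) -> Cow<[u8]> {
        Cow::Borrowed(self.0.as_slice())
    }

    fn from_bytes(bytes: Cow<[u8]>) -> Self {
        let mut buf = [0u8; 8];
        buf.copy_from_slice(&bytes);
        FixedKey(buf)
    }

    // Bounded, fixed-size encoding: the kind of type V2 can store compactly.
    const BOUND: Bound = Bound::Bounded {
        max_size: 8,
        is_fixed_size: true,
    };
}

fn main() {
    // V2 map with a bounded key and an unbounded value (String is Unbounded).
    let mut map: StableBTreeMap<FixedKey, String, _> =
        StableBTreeMap::init(DefaultMemoryImpl::default());
    map.insert(FixedKey(*b"key00001"), "value".to_string());
    assert_eq!(map.get(&FixedKey(*b"key00001")), Some("value".to_string()));
}
```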
Re the script: adding a dedicated Python script that someone needs to run to get a readable form of the benchmarks sounds like we need to fix how benchmarks are reported in the first place. canbench could be updated to report benchmarks in a more concise way, e.g. only reporting the benchmarks that improved/regressed and not listing the ones that are within noise (it already knows what the diff is, so perhaps we should simply be more aggressive about what is presented in the end). Most of the time people want to see the improvements/regressions anyway, and we can always add an option that prints the full output if desired.
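As a rough sketch of that idea (the BenchDiff struct and the threshold are hypothetical, not canbench's actual types), the report would keep only entries whose relative change exceeds a noise threshold:

```rust
// Hypothetical structure: a sketch of "report only improvements/regressions
// beyond noise", not canbench's real API.
struct BenchDiff {
    name: String,
    old_instructions: u64,
    new_instructions: u64,
}

/// Keep only benchmarks whose relative change exceeds `noise_threshold`
/// (e.g. 0.02 for +/- 2%); everything else is treated as noise and omitted.
fn significant(diffs: &[BenchDiff], noise_threshold: f64) -> Vec<&BenchDiff> {
    diffs
        .iter()
        .filter(|d| {
            let old = d.old_instructions as f64;
            let new = d.new_instructions as f64;
            old > 0.0 && ((new - old) / old).abs() > noise_threshold
        })
        .collect()
}
```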
The benefit I see is that we wouldn't need to maintain a custom script (which has some maintenance overhead), and I would also like to avoid having to run a script that transforms the data into CSV so I can import it somewhere else to be able to read it. I believe that canbench should be self-contained and present the benchmarks in the way that makes the most sense, not requiring people to do post-processing before they can actually analyze their benchmark results.
This PR refactors BTreeMap benchmarks to increase coverage and improve readability.
The current benchmark setup has some challenges:
The new benchmark setup:
FixedVec for v2)
Still, adding new benchmarks must be handled with care, because pocket_ic has a limit on the number of tests (10k) and on the total length of test names (20k characters). This limit is going to be increased soon.
Getting close to this limit increases the running time to up to 7 minutes and significantly increases the size of the canbench report, which is currently not very readable.
Therefore, this PR also adds a ./benchmark/scripts/ directory with scripts for extracting reports into CSV format.