sudhirtumati implementation #598
Produces incorrect output for the 1B file:
|
This is the output of test execution on my environment. I do not see Could you please make |
That file is 13 GB, so it's a bit hard to share. You can create it yourself using create_measurements.sh (for the standard eval file) and create_measurements3.sh (for the 10K key set file). Your implementation must produce the same output for those as the baseline (or compare against the output of the entries at the top of the leaderboard, which are known to be correct and complete much faster). |
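As a rough illustration of the comparison described above, the following sketch checks whether two saved output files are byte-for-byte identical. The class name `CompareOutputs` and the idea of saving each run's output to a file first are assumptions made for this example; the repository's own `test.sh` is the authoritative check.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of the repository: compares two saved outputs.
public class CompareOutputs {

    // Returns true when both files have identical content.
    // Files.mismatch (Java 12+) returns -1 when no differing byte is found.
    static boolean sameContent(Path expected, Path actual) throws IOException {
        return Files.mismatch(expected, actual) == -1L;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: CompareOutputs <baseline-output> <own-output>");
            return;
        }
        boolean same = sameContent(Path.of(args[0]), Path.of(args[1]));
        System.out.println(same ? "outputs match" : "outputs differ");
    }
}
```

One way to use it would be to redirect the baseline's output and your own output into two files and pass both paths as arguments.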
Noted, @gunnarmorling. Unfortunately, I couldn't reproduce the error in my environment: the output generated by my implementation is identical to the baseline. There must still be an issue with the implementation, though, since the same code does not work in your environment. I suspect thread contention might be resulting in incorrect map updates, so I modified my implementation to reduce thread contention as far as possible. Total processing time (with 1B rows) also came down from ~40 seconds to ~30 seconds on my personal laptop. Test suite execution is successful. In addition to the test suite, I ran tests with multiple files with different row counts (0.5M, 1M, 10M, 50M, 100M, 500M, 1B) and found that the results are identical to the baseline output. |
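One common way to avoid incorrect updates to a shared map is to give each worker its own map and merge the results on a single thread afterwards. The sketch below illustrates that general pattern only; it is not the code from this PR, and the names `LocalMapAggregation`, `Stats`, `aggregateChunk`, and `aggregate` are hypothetical. It assumes input lines of the form `station;value`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: per-thread maps merged once at the end.
public class LocalMapAggregation {

    // Per-station accumulator (illustrative, not taken from the PR).
    static final class Stats {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        double sum = 0;
        long count = 0;

        void add(double value) {
            min = Math.min(min, value);
            max = Math.max(max, value);
            sum += value;
            count++;
        }

        void merge(Stats other) {
            min = Math.min(min, other.min);
            max = Math.max(max, other.max);
            sum += other.sum;
            count += other.count;
        }
    }

    // Runs on one worker thread and touches only its own map, so no locking is needed.
    static Map<String, Stats> aggregateChunk(List<String> lines) {
        Map<String, Stats> local = new HashMap<>();
        for (String line : lines) {
            int sep = line.indexOf(';');
            String station = line.substring(0, sep);
            double value = Double.parseDouble(line.substring(sep + 1));
            local.computeIfAbsent(station, k -> new Stats()).add(value);
        }
        return local;
    }

    // Submits one task per chunk, then merges the partial maps on the calling thread.
    static Map<String, Stats> aggregate(List<List<String>> chunks) throws Exception {
        int threads = Math.max(1, Math.min(chunks.size(), Runtime.getRuntime().availableProcessors()));
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Map<String, Stats>>> futures = new ArrayList<>();
            for (List<String> chunk : chunks) {
                futures.add(pool.submit(() -> aggregateChunk(chunk)));
            }
            Map<String, Stats> merged = new TreeMap<>(); // sorted by station name
            for (Future<Map<String, Stats>> future : futures) {
                for (Map.Entry<String, Stats> e : future.get().entrySet()) {
                    merged.computeIfAbsent(e.getKey(), k -> new Stats()).merge(e.getValue());
                }
            }
            return merged;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<List<String>> chunks = List.of(
                List.of("Hamburg;12.0", "Oslo;-3.5"),
                List.of("Hamburg;8.0", "Oslo;1.5"));
        aggregate(chunks).forEach((station, s) ->
                System.out.printf("%s=%.1f/%.1f/%.1f%n", station, s.min, s.sum / s.count, s.max));
    }
}
```

Because the shared state is only written during the single-threaded merge, the result does not depend on thread scheduling, which makes this kind of bug easier to rule out than with a concurrently updated map.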
Hey, can you please rebase this to latest main and squash everything into one commit? There should be no unrelated commits in this PR. |
ab513a7 to 1a14699
Done. Please check |
00:25.064. |
Check List:
- Tests pass (./test.sh <username> shows no differences between expected and actual outputs)
- The launch script is named calculate_average_<username>.sh (make sure to match casing of your GH user name) and is executable
- Output matches that of calculate_average_baseline.sh
My PC configuration: