Current figures are based on quite small chunks, which may be hurting perf and storage benchmarks.
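For reference, a minimal sketch of where chunk sizes enter the conversion step, assuming sgkit's `vcf_to_zarr` and its `chunk_length`/`chunk_width` parameters; the values below are illustrative, not the ones used for these benchmarks:

```python
# Sketch only: chunk sizes are chosen at VCF -> Zarr conversion time.
# The values here are placeholders, not tuned settings.
from sgkit.io.vcf import vcf_to_zarr

vcf_to_zarr(
    "samples.vcf.gz",
    "samples.zarr",
    chunk_length=100_000,  # variants per chunk
    chunk_width=10_000,    # samples per chunk
)
```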
I think this is hitting us at the upper end all right, where we're spending ~1/4 of the total time in the kernel:
| num_samples | num_sites | tool  | threads | user_time (s) | sys_time (s) | wall_time (s) |
|-------------|-----------|-------|---------|---------------|--------------|---------------|
| 10          | 116230    | sgkit | 1       | 6.70          | 0.30         | 6.900836      |
| 100         | 204714    | sgkit | 1       | 7.55          | 0.48         | 7.896185      |
| 1000        | 403989    | sgkit | 1       | 13.19         | 0.69         | 13.466672     |
| 10000       | 863998    | sgkit | 1       | 113.01        | 10.76        | 119.052356    |
| 100000      | 2365367   | sgkit | 1       | 2545.20       | 658.47       | 3085.060317   |
| 1000000     | 7254858   | sgkit | 1       | 76912.53      | 27060.23     | 99354.307909  |
| 10          | 116230    | savvy | 1       | 0.11          | 0.00         | 0.147215      |
| 100         | 204714    | savvy | 1       | 0.25          | 0.01         | 0.287616      |
| 1000        | 403989    | savvy | 1       | 1.31          | 0.02         | 1.364692      |
| 10000       | 863998    | savvy | 1       | 14.12         | 0.08         | 14.228865    |
| 100000      | 2365367   | savvy | 1       | 388.86        | 0.59         | 389.678411    |
| 1000000     | 7254858   | savvy | 1       | 8410.33       | 5.25         | 8418.529669   |
Presumably having fewer, larger chunks would help here -- we are also getting warnings from Dask about sending a large graph.
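One way to get fewer, larger blocks (and a smaller Dask task graph), assuming the data has already been written to Zarr and is loaded as an xarray dataset, would be to rechunk before computing; the dimension names follow sgkit's conventions and the sizes are illustrative guesses, not tuned values:

```python
# Sketch: rechunk to fewer, larger blocks so the Dask graph contains fewer tasks.
# Chunk sizes below are placeholders, not recommendations.
import sgkit

ds = sgkit.load_dataset("samples.zarr")
ds = ds.chunk({"variants": 100_000, "samples": 10_000})
```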
Can you update with your findings re a reasonable choice of chunk size @benjeffery?
@benjeffery I'd like to rerun the VCF code over the break to get better chunks - can you document your findings about chunk size here please?