Fix cpu_count handling for R benchmarks #130

Open
alistaire47 opened this issue Feb 1, 2023 · 0 comments

Currently, for R benchmarks, this repo passes cpu_count = NULL to run_one() (code), which then does not set the number of CPUs or threads anywhere (it omits that part of the script it creates). When run through higher-level arrowbench interfaces, cpu_count = NULL gets translated by get_default_parameters() to c(1L, parallel::detectCores()), which would create two cases for run_one(), and that would be a problem.
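To make the "two cases" concern concrete, here is a rough Python sketch of that kind of default expansion. `expand_cpu_count` is a hypothetical stand-in, not arrowbench's actual `get_default_parameters()`, and `os.cpu_count()` is only assumed to be a reasonable analogue of `parallel::detectCores()`.

```python
import os


def expand_cpu_count(cpu_count=None):
    """Hypothetical stand-in for the default expansion described above.

    A missing value becomes two settings (single-threaded and all cores),
    mirroring c(1L, parallel::detectCores()); an explicit value is kept as-is.
    """
    if cpu_count is None:
        # os.cpu_count() may return None on unusual platforms; fall back to 1.
        return [1, os.cpu_count() or 1]
    return [cpu_count]


# A caller that expects exactly one case gets two when cpu_count is left unset.
cases = expand_cpu_count(None)
print(len(cases))
```

Passing an explicit value, e.g. `expand_cpu_count(4)`, keeps this to a single case, which is roughly what specifying cpu_count up front would achieve.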

In practice, not calling arrow:::SetCpuThreadPoolCapacity() means we're running with the default, which is the number of cores on the machine (pyarrow.cpu_count()). We should move to specifying this and recording it in tags. Right now the cpu_count key is in tags, but the value is empty. Changing this will break histories, but we should be able to adjust old records based on machine_info.cpu_core_count or machine_info.cpu_thread_count (I'm not exactly sure which we want, but they may not differ for any of the machines we're running on anyway).
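The cleanup of old records could look something like the sketch below. The record layout (a dict with `tags` and `machine_info` keys) and the `backfill_cpu_count` helper are assumptions for illustration only; the real schema and storage may differ, and which machine_info field to use is still the open question noted above.

```python
def backfill_cpu_count(record, source_key="cpu_core_count"):
    """Return a copy of `record` with an empty tags.cpu_count filled in.

    Assumed (hypothetical) shape:
        {"tags": {"cpu_count": None, ...},
         "machine_info": {"cpu_core_count": "8", "cpu_thread_count": "16", ...}}
    `source_key` selects which machine_info field to copy from.
    """
    tags = dict(record.get("tags", {}))
    if tags.get("cpu_count") in (None, ""):
        value = record.get("machine_info", {}).get(source_key)
        if value is not None:
            tags["cpu_count"] = int(value)
    # Leave records that already carry a cpu_count untouched.
    return {**record, "tags": tags}


old = {
    "tags": {"cpu_count": None, "name": "example"},
    "machine_info": {"cpu_core_count": "8", "cpu_thread_count": "16"},
}
fixed = backfill_cpu_count(old)
print(fixed["tags"]["cpu_count"])
```

Switching to the thread count would only require `source_key="cpu_thread_count"`, so the choice between the two fields can be deferred until it is clear whether they differ on any of the machines.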

Because of the shift to running arrowbench directly from arrow-benchmarks-ci, it may be more pragmatic to break things as we switch over and then do the cleanup, but I'm opening this issue here because the problem is presently here, even if the fix ends up being some tweaks in arrowbench defaults and some database cleanup.
