Currently, for R benchmarks, this repo passes cpu_count = NULL to run_one() (code), which then does not set the number of CPUs or threads anywhere (it omits that part of the script it creates). When run through higher-level arrowbench interfaces, cpu_count = NULL gets translated by get_default_parameters() to c(1L, parallel::detectCores()), which would produce two cases where run_one() expects one, so we can't simply rely on that default here.
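To make the mismatch concrete, here is a minimal Python sketch of the two behaviours. It is not arrowbench code: `expand_cpu_count` is a hypothetical stand-in that mimics the default expansion described above (with `os.cpu_count()` playing the role of `parallel::detectCores()`), and `resolve_single_cpu_count` is a hypothetical helper showing one way a caller could always hand a single explicit value to a single run.

```python
import os


def expand_cpu_count(cpu_count=None):
    """Hypothetical stand-in for the default expansion: None becomes two cases."""
    if cpu_count is None:
        # Mirrors c(1L, parallel::detectCores()): one single-threaded case
        # and one case using every core the machine reports.
        return [1, os.cpu_count() or 1]
    return [cpu_count]


def resolve_single_cpu_count(cpu_count=None):
    """Hypothetical helper: always return exactly one explicit value."""
    if cpu_count is not None:
        return cpu_count
    # Assumption: fall back to the machine's core count, which matches
    # what an unset thread pool capacity ends up using anyway.
    return os.cpu_count() or 1
```

With something like `resolve_single_cpu_count()`, the value passed down is never empty, so it can be both applied to the thread pool and recorded.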
In practice, not calling arrow:::SetCpuThreadPoolCapacity() means we're running with the default, which is the number of cores on the machine (pyarrow.cpu_count()). We should move to specifying this and recording it in tags. Right now the cpu_count key is in tags, but the value is empty. Changing this will break histories, but we should be able to adjust old records based on machine_info.cpu_core_count or machine_info.cpu_thread_count (I'm not exactly sure which we want, but they may not differ for any of the machines we're running on anyway).
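The cleanup of old records could look roughly like the sketch below. The record layout (a dict with `tags` and `machine_info` keys) and the function name `backfill_cpu_count` are assumptions for illustration, not the actual database schema; the choice between `cpu_core_count` and `cpu_thread_count` is left as a parameter because that question is still open.

```python
def backfill_cpu_count(record, source_key="cpu_core_count"):
    """Return a copy of `record` with an empty cpu_count tag filled in.

    Assumed (hypothetical) record shape:
        {"tags": {"cpu_count": None, ...},
         "machine_info": {"cpu_core_count": "8", "cpu_thread_count": "16"}}
    """
    tags = dict(record.get("tags", {}))
    # Only touch records whose cpu_count tag is missing or empty.
    if tags.get("cpu_count") in (None, ""):
        value = record.get("machine_info", {}).get(source_key)
        if value is not None:
            # int() accepts both numeric and string-encoded counts.
            tags["cpu_count"] = int(value)
    return {**record, "tags": tags}
```

Records that already carry an explicit `cpu_count` are returned unchanged, so the same pass can be run safely over both old and new results.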
Because of the shift to running arrowbench directly from arrow-benchmarks-ci, it may be more pragmatic to break things as we switch over and then do the cleanup, but I'm opening this issue here because the problem is presently here, even if the fix ends up being some tweaks in arrowbench defaults and some database cleanup.