Update speed benchmarks #8

annaveronika · 2020-05-13T16:38:37Z

We are happy to contribute here.

We suggest the following updates:

- using the newest versions of the libraries
- using 1000 iterations instead of 500, because with 500 iterations preprocessing might be the bottleneck, which is not what we want to measure. Also, GPUs are usually used for large datasets, where 500 iterations are not enough
- using two different AWS configurations: one with 8 V100 GPUs and another without GPUs for the CPU runs. This is cheaper, since you don't pay for GPUs when you are not using them
- running every training 5 times on CPU, because for all the libraries CPU time might differ by up to 30% from run to run. The benchmark will then report the average time and standard deviation for CPU

Are you OK with these changes?

RAMitchell · 2020-05-14T21:06:54Z

Seems reasonable to me. If we run 5 times, let's do it for all algorithms and make the number of runs a configurable parameter.
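
For illustration, a configurable repeat count with mean and standard deviation reporting could look roughly like the sketch below. This is not the benchmark's actual code: `time_runs`, `parse_args`, and the `--n-runs` flag are hypothetical names, and `train_fn` stands in for whatever function trains one library once.

```python
import argparse
import statistics
import time


def time_runs(train_fn, n_runs):
    """Call train_fn n_runs times; return (mean, stdev) of wall-clock seconds."""
    if n_runs < 1:
        raise ValueError("n_runs must be at least 1")
    durations = []
    for _ in range(n_runs):
        start = time.perf_counter()
        train_fn()
        durations.append(time.perf_counter() - start)
    # The sample standard deviation is only defined for two or more runs.
    stdev = statistics.stdev(durations) if n_runs > 1 else 0.0
    return statistics.mean(durations), stdev


def parse_args(argv=None):
    """Parse the (hypothetical) --n-runs flag; defaults to 5 as proposed above."""
    parser = argparse.ArgumentParser(description="Benchmark timing sketch")
    parser.add_argument("--n-runs", type=int, default=5,
                        help="how many times to repeat each training run")
    return parser.parse_args(argv)
```

Calling `time_runs(train_fn, parse_args().n_runs)` for each algorithm would apply the same repeat count everywhere, on CPU and GPU alike.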

annaveronika · 2020-05-15T10:46:43Z

Thanks a lot, that sounds great. We'll open a PR soon!
