
Use Rob's cool vmstat/gnuplot tooling to plot hardware utilization #302

Open
mikemccand opened this issue Oct 2, 2024 · 11 comments

@mikemccand (Owner)
@rmuir has some cool tooling that measures how effectively processes are using CPU/RAM/IO. Let's turn this on at least during indexing phases of nightly benchy, when we are trying to fully saturate the available hardware resources...

@rmuir (Collaborator) commented Oct 2, 2024

@mikemccand to start with "text mode", just do this:

start of benchy:

$ nohup vmstat --active --wide --timestamp --unit M 1 > vmstat.log 2>/dev/null &

end of benchy:

$ pkill vmstat

it should produce a human-readable log that looks like this:

router:~$  vmstat --active --wide --timestamp --unit M 1
--procs-- -----------------------memory---------------------- ---swap-- -----io---- -system-- ----------cpu---------- -----timestamp-----
   r    b         swpd         free        inact       active   si   so    bi    bo   in   cs  us  sy  id  wa  st  gu                 EDT
   1    0            0         3177           89          140    0    0     2     6  277    0   0   0 100   0   0   0 2024-10-02 08:12:00
   0    0            0         3177           89          140    0    0     0     0  323  161   0   0 100   0   0   0 2024-10-02 08:12:01
   1    0            0         3177           89          140    0    0     0   956  373  222   0   0 100   0   0   0 2024-10-02 08:12:02
   0    0            0         3177           89          140    0    0     0     0   93   84   0   0 100   0   0   0 2024-10-02 08:12:03
   0    0            0         3177           89          140    0    0     0     8  170  134   0   0 100   0   0   0 2024-10-02 08:12:04

We can then add fancy graph plotting to the "end of benchy" step, but text mode itself is a good way to start?
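The text-mode log is also easy to post-process later. As a hypothetical sketch (Python; the column layout is assumed from the sample output above, and vmstat's columns can vary across procps versions):

```python
# Parse vmstat --active --wide --timestamp output into (timestamp, idle%) pairs.
# Assumed layout from the sample above: "id" (idle) is the 15th field, and each
# data row ends with "YYYY-MM-DD HH:MM:SS".

def parse_vmstat(lines):
    rows = []
    for line in lines:
        fields = line.split()
        if len(fields) < 20 or not fields[0].isdigit():
            continue  # skip the two header lines
        timestamp = fields[-2] + " " + fields[-1]
        idle = int(fields[14])
        rows.append((timestamp, idle))
    return rows

sample = [
    "   r    b         swpd         free        inact       active  ...",
    "   1    0 0 3177 89 140 0 0 2 6 277 0 0 0 100 0 0 0 2024-10-02 08:12:00",
    "   0    0 0 3177 89 140 0 0 0 0 323 161 0 0 100 0 0 0 2024-10-02 08:12:01",
]
print(parse_vmstat(sample))
# → [('2024-10-02 08:12:00', 100), ('2024-10-02 08:12:01', 100)]
```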

@rmuir (Collaborator) commented Oct 2, 2024

also, if you don't want output in your local time, set the TZ variable:

$ TZ=UTC vmstat ...

if you don't want the inactive/active breakdown, remove the --active flag. I prefer this way of breaking down the memory, but maybe for Lucene you'd prefer to just monitor the OS cache instead:

$ TZ=UTC vmstat --wide --timestamp --unit M 1
--procs-- -----------------------memory---------------------- ---swap-- -----io---- -system-- ----------cpu---------- -----timestamp-----
   r    b         swpd         free         buff        cache   si   so    bi    bo   in   cs  us  sy  id  wa  st  gu                 UTC
   1    0            0        14480          178         1890    0    0    17    84  628    3   3   1  95   0   0   0 2024-10-02 12:49:15
   1    0            0        14485          178         1880    0    0     0     0  391  733   0   2  98   0   0   0 2024-10-02 12:49:16
   0    0            0        14494          178         1875    0    0     0     0  433  784   1   1  98   0   0   0 2024-10-02 12:49:17
   0    0            0        14506          178         1873    0    0     0     0  374 1083   3   1  96   0   0   0 2024-10-02 12:49:18

@mikemccand (Owner, Author)

OK! I did the first step, which is to run vmstat during each indexing run and save away the resulting log. Yesterday's run produced these exciting vmstat logs:

beast3:tmp$ wc -l *vmstat*
    120 fastIndexBigDocs.vmstat.log
    148 fastIndexMediumDocs.vmstat.log
   3903 fastIndexMediumDocsWithVectorsQuantized.vmstat.log
   6416 fastIndexMediumDocsWithVectors.vmstat.log
  20995 fixedIndex.vmstat.log
    240 nrtIndexMediumDocs.vmstat.log
  31822 total

Only 120 lines (120 seconds) for fastIndexBigDocs.vmstat.log!

Next we need to somehow gnuplot these buggers?
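Before plotting, one cheap sanity check is a per-log summary. A hypothetical sketch (Python, same column assumptions as the sample vmstat output earlier in the thread): average CPU busy percentage, i.e. 100 minus the "id" column.

```python
# Average CPU busy % (100 - idle) across a vmstat --wide --timestamp log.
# Assumes the layout shown earlier: "id" (idle) is the 15th field of a data row.

def avg_cpu_busy(lines):
    busy = [
        100 - int(f[14])
        for f in (line.split() for line in lines)
        if len(f) >= 20 and f[0].isdigit()  # data rows only, not headers
    ]
    return sum(busy) / len(busy) if busy else 0.0

log = [
    "   1 0 0 3177 89 140 0 0 2 6 277 0 0 0 100 0 0 0 2024-10-02 08:12:00",
    "   0 0 0 3177 89 140 0 0 0 0 323 161 10 5 85 0 0 0 2024-10-02 08:12:01",
]
print(avg_cpu_busy(log))  # → 7.5
```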

@rmuir (Collaborator) commented Oct 9, 2024

thanks @mikemccand! can you hook me up with these sample data files? it will be faster to make it look decent.

@mikemccand (Owner, Author)

Does this work? vmstat-logs.zip

These are all the vmstat logs, each from building one of the N indices that nightly benchy builds ...

@rmuir (Collaborator) commented Oct 9, 2024

thank you @mikemccand, that works. crazy how idle your CPU is while doing this.

@rmuir (Collaborator) commented Oct 10, 2024

@mikemccand see https://rmuir.org/vmstat-report/

You can grab the 3 files there to reproduce it:

  • vmstat.gpi: gnuplot file to make several .svg files
  • index.html.template: simple html file to lay them out
  • generate.sh: script that makes directory with these files for each log file

Seems the hardware is not fully utilized. There is a direct correlation between CPU usage and the number of runnable processes (= threads here). I guess you have some insane number of cores?

[Screenshot: Screen Shot 2024-10-10 at 00:47:43]

@mikemccand (Owner, Author)

Whoa, thanks @rmuir! Nice charts! Yeah, this is beast3 with 128 cores, and one of those indexing runs (the fixedIndex one) uses only one core (mostly) so that it generates a precisely deterministic index each night, keeping the search results "comparable". But for the other runs I'd expect them to be saturating CPU (ish?) ... maybe I'm not using enough concurrency for them!

@rmuir (Collaborator) commented Oct 10, 2024

This makes sense. These are just a start, we can tweak/beautify later once things are working.

I struggled a bit because the SVG files were enormous out of the box: gnuplot is written by scientists, so the SVGs were totally lossless, accurately representing all 20,000+ datapoints in your fixedIndex log, which made for huge file sizes.

I had not done interpolation before, so I had to figure it out; I tried to add comments so you can tweak any of this if you want.
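For what it's worth, an alternative to interpolation for taming file size is to thin the data before it ever reaches gnuplot. A hypothetical sketch (simple stride-based downsampling in Python; this is not what vmstat.gpi does, just an illustration of the idea):

```python
def downsample(points, max_points=500):
    """Keep at most max_points samples by taking every Nth point."""
    if len(points) <= max_points:
        return list(points)
    stride = -(-len(points) // max_points)  # ceiling division
    return points[::stride]

data = list(range(20000))  # stand-in for the 20,000+ samples in fixedIndex
thinned = downsample(data)
print(len(thinned))  # → 500
```

This trades accuracy for size the same way interpolation does, but keeps real samples instead of synthesizing new ones.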

@mikemccand (Owner, Author)

Last night's benchy made the nice gnuplot vmstat charts!

It's linked off the nightly details page, to here. Each sub-directory has the charts for building that one index, e.g. fastIndexMediumDocs. It is quite depressing how idle the CPUs actually are, especially at the end of the run. We are not waiting for merges, so I suspect this is just Lucene moving the (big) in-memory segments to disk ... I'll see if I can better tune these runs to saturate the hardware.

@rmuir (Collaborator) commented Oct 11, 2024

We are not waiting for merges, so I suspect this is just Lucene moving the (big) in-memory segments to disk ... I'll see if I can better tune these runs to saturate the hardware.

@mikemccand This could be what's happening. Normally I'd expect to see some iowait here for something like that (maybe; iowait accounting is tricky like that).

But there could be a challenge: you have 128 CPUs, so if e.g. a single CPU is in iowait, we may not see it represented in the vmstat output, because it is not even 1%?
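The back-of-the-envelope arithmetic supports that worry: with integer percentage columns, one busy core in 128 can vanish entirely (whether vmstat truncates or rounds here is a procps detail I haven't checked):

```python
cpus = 128
one_core = 100 / cpus   # one core fully in iowait, as a % of total CPU
print(one_core)         # → 0.78125
# vmstat prints integer percentages; if the value is truncated,
# a single waiting core shows up as 0 in the "wa" column:
print(int(one_core))    # → 0
```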

Labels: None yet
Projects: None yet
Development: No branches or pull requests

2 participants