Over 60 times slower than nvidia-smi to assess resource usage #42

George3d6 · 2021-03-19T22:49:41Z

The easiest way to reproduce this is to time the following script:

import nvidia_smi
import numpy as np


nvidia_smi.nvmlInit()

for _ in range(50):
        gpus = [nvidia_smi.nvmlDeviceGetHandleByIndex(i) for i in range(nvidia_smi.nvmlDeviceGetCount())]
        res_arr = [nvidia_smi.nvmlDeviceGetUtilizationRates(handle) for handle in gpus]
        print('Usage with nvidia-smi: ', np.sum([res.gpu for res in res_arr]), '%')

Then time:

import GPUtil
import numpy as np

for _ in range(50):
        res_arr = GPUtil.getGPUs()
        print('Usage with GPUtil: ', np.sum([res.load for res in res_arr])*100, '%')

YMMV, but for the first script I get a constant 1% GPU utilization, and the runtime is:

real    0m0,179s
user    0m0,688s
sys     0m0,818s

For the second one, GPU utilization climbs to a whopping 93% by the 6th call, and the runtime is:

real    0m11,267s
user    0m0,605s
sys     0m11,449s

getGPUs() seems to do roughly what nvidia-smi does with nvmlDeviceGetUtilizationRates, and quite frankly, being 63 times slower and consuming ~100% of my GPU (RTX 2080) to run, as opposed to 1%, seems a bit unreasonable.
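To separate the per-call cost from interpreter startup and import time, the two loops could also be timed in-process. This is only a minimal sketch: `time_calls` is a hypothetical helper, not part of either library, and the commented usage assumes both libraries are installed.

```python
import time


def time_calls(fn, repeats=50):
    """Call fn() `repeats` times and return the mean wall-clock seconds per call."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    # Ratio of the two `real` timings reported above.
    print(round(11.267 / 0.179))  # 63

    # With both libraries installed, the comparison might look like:
    #   handle = nvidia_smi.nvmlDeviceGetHandleByIndex(0)
    #   time_calls(lambda: nvidia_smi.nvmlDeviceGetUtilizationRates(handle))
    #   time_calls(GPUtil.getGPUs)
```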

Since many people use this library to check GPU utilization, it might be reasonable to offer a more efficient version of getGPUs for that purpose or, if it provides some "extra" features (e.g. it samples 100 calls and averages them), a way to control those settings would be welcome.
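As a rough idea of what a lighter-weight query could look like, here is a sketch built only on the NVML calls from the first script. `nvml_loads` is a hypothetical name, not an existing GPUtil function, and the module is passed in as a parameter so nothing about GPUtil's internals is assumed.

```python
def nvml_loads(nvml):
    """Return per-GPU utilization as fractions in [0, 1], like GPUtil's `load`.

    `nvml` is any module exposing the NVML calls used in the first script
    (for example `nvidia_smi`), already initialized with nvmlInit().
    """
    handles = [nvml.nvmlDeviceGetHandleByIndex(i)
               for i in range(nvml.nvmlDeviceGetCount())]
    # NVML reports utilization as a percentage; convert it to a fraction.
    return [nvml.nvmlDeviceGetUtilizationRates(h).gpu / 100.0 for h in handles]


# Possible usage, assuming the nvidia_smi module is installed:
#   import nvidia_smi
#   nvidia_smi.nvmlInit()
#   print(sum(nvml_loads(nvidia_smi)) * 100, '%')
```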

Or maybe I'm doing something completely wrong here, in which case, let me know.

George3d6 changed the title ~~Extremely slow compared to nvidia-smi~~ Over 60 times slower than nvidia-smi to assess resource usage Mar 19, 2021
