Over 60 times slower than nvidia-smi to assess resource usage #42

Open
George3d6 opened this issue Mar 19, 2021 · 0 comments

George3d6 commented Mar 19, 2021

The easiest way to reproduce this is to time the following script:

import nvidia_smi
import numpy as np

# Initialize NVML once before querying any device.
nvidia_smi.nvmlInit()

for _ in range(50):
    # Fetch a handle for every GPU, then read its utilization counters.
    gpus = [nvidia_smi.nvmlDeviceGetHandleByIndex(i) for i in range(nvidia_smi.nvmlDeviceGetCount())]
    res_arr = [nvidia_smi.nvmlDeviceGetUtilizationRates(handle) for handle in gpus]
    # res.gpu is already a percentage (0-100).
    print('Usage with nvidia-smi: ', np.sum([res.gpu for res in res_arr]), '%')

Then time this one:

import GPUtil
import numpy as np

for _ in range(50):
    res_arr = GPUtil.getGPUs()
    # res.load is a fraction in [0, 1], so scale it to a percentage.
    print('Usage with GPUtil: ', np.sum([res.load for res in res_arr]) * 100, '%')

YMMV here, but the first script consistently reports 1% GPU utilization, and its runtime is:

real    0m0,179s
user    0m0,688s
sys     0m0,818s

For the second script, GPU utilization climbs to a whopping 93% by the 6th call, and the runtime is:

real    0m11,267s
user    0m0,605s
sys     0m11,449s

getGPUs() seems to do roughly what nvidia-smi does with nvmlDeviceGetUtilizationRates, and quite frankly, being about 63 times slower (11.267 s vs. 0.179 s) and consuming ~100% of my GPU (RTX 2080) to run, as opposed to 1%, seems a bit unreasonable.
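The shell `time` figures above include interpreter startup and imports, so the per-call cost is only implied. A minimal sketch of an in-process timer that isolates it could look like the following; `time_calls` is a hypothetical helper introduced here for illustration and is not part of GPUtil or the NVML bindings.

```python
import time


def time_calls(fn, repeats=50):
    """Call fn() `repeats` times; return (total_seconds, seconds_per_call)."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    total = time.perf_counter() - start
    return total, total / repeats


# Example usage (assumes GPUtil is installed and an NVIDIA GPU is present):
#
#     import GPUtil
#     total, per_call = time_calls(GPUtil.getGPUs, repeats=50)
#     print(f"GPUtil.getGPUs: {total:.3f}s total, {per_call * 1000:.1f}ms per call")
```

Passing the same helper a function that wraps the nvmlDeviceGetUtilizationRates calls would give a like-for-like comparison that excludes import time.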

Since many people use this library to measure GPU utilization, it would be reasonable to offer a more efficient version of getGPUs for that purpose. Or, if it provides some "extra" features (e.g. it samples 100 calls and averages them), a way to control those settings would be welcome.
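As a rough illustration of what a lighter utilization-only query might look like, here is a sketch built on the same NVML calls used in the first script. `total_gpu_utilization` is a hypothetical name, not an existing GPUtil function, and the NVML module is passed in as an argument purely so the function can be exercised without a GPU.

```python
def total_gpu_utilization(nvml):
    """Sum GPU utilization (in percent) over all devices.

    `nvml` is any object exposing nvmlDeviceGetCount,
    nvmlDeviceGetHandleByIndex and nvmlDeviceGetUtilizationRates,
    e.g. the nvidia_smi module after nvmlInit() has been called.
    """
    total = 0
    for index in range(nvml.nvmlDeviceGetCount()):
        handle = nvml.nvmlDeviceGetHandleByIndex(index)
        # .gpu is reported as a percentage in the range 0-100.
        total += nvml.nvmlDeviceGetUtilizationRates(handle).gpu
    return total


# Example usage (assumes the nvidia_smi bindings and an NVIDIA driver):
#
#     import nvidia_smi
#     nvidia_smi.nvmlInit()
#     print('Usage: ', total_gpu_utilization(nvidia_smi), '%')
```

This only covers utilization; whatever other fields getGPUs() returns would still need their own queries.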

Or maybe I'm doing something completely wrong here, in which case, let me know.

George3d6 changed the title from "Extremely slow compared to nvidia-smi" to "Over 60 times slower than nvidia-smi to assess resource usage" on Mar 19, 2021