This repository was archived by the owner on Jan 7, 2025. It is now read-only.

How to use GPU correctly #133

Closed
tjb-tech opened this issue Mar 18, 2021 · 17 comments

Comments

@tjb-tech

Hello, could you please tell me how to enable the GPU for this project? I tried passing '--use_gpu' on the command line, but the GPU was not used. I hope you can answer my question as soon as you see this. Thank you very much.

@corneliusboehm
Contributor

Hi @tjb-tech! First of all, is your GPU readily set up with CUDA? You can verify that by running nvidia-smi and checking the reported CUDA version.

If you enable the --use_gpu option on the train_classifier.py script, it will automatically use the GPU at index 0 for training and nvidia-smi should list a new python process.
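For reference, this is roughly the pattern such a flag usually maps to in a PyTorch training script. This is only a sketch of the common idiom, not the actual code in train_classifier.py:

import argparse

import torch
import torch.nn as nn

# Sketch only: map a --use_gpu flag to a torch device (index 0 by default).
parser = argparse.ArgumentParser()
parser.add_argument('--use_gpu', action='store_true')
args = parser.parse_args()

device = torch.device('cuda:0' if args.use_gpu and torch.cuda.is_available() else 'cpu')

model = nn.Linear(16, 4).to(device)        # stand-in model moved to the selected device
batch = torch.randn(8, 16, device=device)  # inputs must live on the same device
print(device, model(batch).shape)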

@tjb-tech
Author


First of all, thank you very much for your timely reply. Using your method, we checked that nvidia-smi does show a new Python process. However, I found that the CPU utilization was close to 100%, while the GPU utilization was close to 1%. Could you tell me why? I hope to get your professional reply. Thank you very much.

@corneliusboehm
Contributor

The problem with video datasets is that loading and decoding the videos can get expensive. So it can happen that the model's update step on the GPU finishes faster than the preparation of the next batch, which leads to the CPU being utilized more than the GPU.
However, a utilization of 1% is really low. Could you verify with the following command during training whether the utilization is constantly that low or whether there are at least some periodic spikes?

watch -n 1 nvidia-smi

And could you send over your CPU and GPU specs?
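In case CPU-side data loading turns out to be the bottleneck, increasing the number of DataLoader workers and enabling pinned memory often helps. The snippet below is a generic PyTorch sketch with dummy tensors standing in for a real video dataset, not the loader setup used by this repository; note that on Windows, DataLoader workers additionally need to be created under an if __name__ == '__main__': guard:

import torch
from torch.utils.data import DataLoader, TensorDataset

if __name__ == '__main__':
    # Dummy clips standing in for decoded video: (samples, channels, frames, H, W)
    clips = torch.randn(64, 3, 16, 112, 112)
    labels = torch.randint(0, 5, (64,))
    dataset = TensorDataset(clips, labels)

    loader = DataLoader(
        dataset,
        batch_size=8,
        shuffle=True,
        num_workers=4,    # prepare several batches in parallel on the CPU
        pin_memory=True,  # speeds up host-to-GPU copies
    )

    for batch_clips, batch_labels in loader:
        if torch.cuda.is_available():
            batch_clips = batch_clips.cuda(non_blocking=True)  # overlap transfer with compute
        break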

@tjb-tech
Author

tjb-tech commented Mar 22, 2021


First of all, thank you very much for taking time out of your busy schedule to answer my questions. My CPU is an i5-9300H and my GPU is a GTX 1650. Screenshots of nvidia-smi before and after running, as well as the CPU and GPU usage, are attached below. Thank you again for your prompt and enthusiastic reply. Looking forward to hearing from you soon.
[screenshots: nvidia-smi before and after training, CPU and GPU usage]

@corneliusboehm
Contributor

Thanks for the info! It looks like the Python process is allocating some memory on the GPU, which is a good sign. Do you see any output on the console indicating that epochs are being completed? And do you get a resulting checkpoint?

Generally, does training in PyTorch work for you in other projects?
One more thing you could check is the following:

import torch
torch.cuda.is_available()

I must admit that we haven't tested our code on Windows in a while, so there might also be a platform-related issue.
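A slightly fuller version of that check prints the PyTorch build, the CUDA version it was compiled against, and the detected device; these are all standard torch calls, nothing project-specific:

import torch

print(torch.__version__)          # PyTorch build
print(torch.version.cuda)         # CUDA version PyTorch was built against (None on CPU-only builds)
print(torch.cuda.is_available())  # True if a usable GPU is visible
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # should report the GTX 1650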

@corneliusboehm
Contributor

Hey @tjb-tech, have you been able to resolve your issue?

@tjb-tech
Author

tjb-tech commented Apr 6, 2021


Thank you very much for your concern, and I'm sorry for not replying to you sooner. We tried torch.cuda.is_available(), and the return value is True, but the GPU and CPU usage are still the same as before, so I think it may be a system compatibility issue, which you could look into further. By the way, my system is Windows 10.

@corneliusboehm
Contributor

Thanks for the update. Do you still get a checkpoint after training and how long does it take? Because if that generally works, I would go ahead and close this issue for now.

@tjb-tech
Author

tjb-tech commented Apr 6, 2021

Thanks for your reply. That's all for the GPU problem for the time being. I'm still trying to run your sense_studio code, but I encountered a first error:

* Serving Flask app "sense_studio" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: on
 * Restarting with stat
 * Debugger is active!
 * Debugger PIN: 105-309-328
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)

I tried the following approach:

from gevent import pywsgi

if __name__ == '__main__':
    # `app` is the Flask application object defined in sense_studio
    server = pywsgi.WSGIServer(('0.0.0.0', 5000), app)
    server.serve_forever()

It started up, but then I encountered the following problem:

[2021-04-06 13:19:30,225] ERROR in app: Exception on / [GET]
Traceback (most recent call last):
  File "D:\Anaconda\envs\sense\lib\site-packages\flask\app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "D:\Anaconda\envs\sense\lib\site-packages\flask\app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "D:\Anaconda\envs\sense\lib\site-packages\flask\app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "D:\Anaconda\envs\sense\lib\site-packages\flask\_compat.py", line 39, in reraise
    raise value
  File "D:\Anaconda\envs\sense\lib\site-packages\flask\app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "D:\Anaconda\envs\sense\lib\site-packages\flask\app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "D:/MyDocuments/Service Outsourcing/sense/tools/sense_studio/sense_studio.py", line 46, in projects_overview
    project['exists'] = os.path.exists(project['path'])
TypeError: 'bool' object is not subscriptable
127.0.0.1 - - [2021-04-06 13:19:30] "GET / HTTP/1.1" 500 490 0.004986
127.0.0.1 - - [2021-04-06 13:19:30] "GET /favicon.ico HTTP/1.1" 404 420 0.000999
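Judging from the traceback, projects_overview iterates over the entries of projects_config.json and one entry is a bare boolean rather than a dictionary with a 'path' key. A hypothetical guard that would skip such malformed entries (an illustration based only on the traceback and the assumption that the config maps project names to per-project dictionaries, not the actual sense_studio.py code) could look like this:

import os

def filter_valid_projects(projects):
    # Keep only entries that are dictionaries with a 'path' key; malformed
    # entries (e.g. a bare boolean) are skipped instead of raising TypeError.
    valid = {}
    for name, project in projects.items():
        if isinstance(project, dict) and 'path' in project:
            project['exists'] = os.path.exists(project['path'])
            valid[name] = project
    return valid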

@corneliusboehm
Contributor

* Serving Flask app "sense_studio" (lazy loading)
* Environment: production
  WARNING: This is a development server. Do not use it in a production deployment.
  Use a production WSGI server instead.
* Debug mode: on
* Restarting with stat
* Debugger is active!
* Debugger PIN: 105-309-328
* Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)

This is regular output and is not relevant when running the application locally. You don't need to worry about setting up a WSGI server.

If the second error persists, I will have to take a look at that though 😕 Can you already send me the contents of your sense/tools/sense_studio/projects_config.json?

@tjb-tech
Author

tjb-tech commented Apr 6, 2021


Of course, my file looks like this:
[screenshot of projects_config.json contents]

@corneliusboehm
Contributor

Very interesting. This is either a very outdated format or an error occurred. Anyway, I would recommend deleting this file and trying again. Also you might want to pull our latest master branch, as we've recently added a few improvements.

@tjb-tech
Author

tjb-tech commented Apr 7, 2021


Thank you so much for your timely help. We have opened Sense Studio and created our own project. We have also uploaded our own data, but we can't click the Training button; the browser only shows javascript:void(0);, as in the figure below:
[screenshot of Sense Studio with the disabled Training button]

I would appreciate it very much if you could answer my questions

@corneliusboehm
Contributor

Yes, the training module was only added a few days ago, so after pulling our latest updates this feature should be enabled for you.

@tjb-tech
Author

I am very sorry that I have not been able to continue discussing this project with you recently because I have been busy. Your suggestion last time was very effective, and I admire it very much. Over the past two days, I reviewed your project again and carefully read the blog on the 20BN official website. I noticed the following test screen in your demo video, which was very impressive.
[screenshot of the test screen from the demo video]
I want to achieve this effect on my computer. Could you please tell me how this test page was built? Could you share this part? My heartfelt thanks in advance! Once again, I would like to express my admiration for your open source spirit.

@guillaumebrg
Contributor

Hey @tjb-tech, thank you for the kind words!

That specific demo which you found on our website is kind of old and wasn't obtained using sense. We haven't released this exact model with this exact set of classes. However, we've recently been working on providing a gesture control demo within sense which might do what you need. It's still work in progress (model weights haven't been released yet) but you can already have a look here: #149.

@corneliusboehm
Contributor

It looks like the original issue has been solved, so I'm going to close this thread now.
We're happy to keep supporting you, if more questions come up!
