How to limit the number of cores used for training #6

Open
hannaboe opened this issue Apr 7, 2020 · 5 comments

@hannaboe

hannaboe commented Apr 7, 2020

I'm training a model on a supercomputer, and since train() uses all cores, I would like to limit the number of cores used. I tried setting num_cores = 20, but that doesn't change anything: train() still uses all cores.
Is there another way to limit the number of cores used when running train?

@mikeyEcology
Owner

If you set num_cores = 1, it will limit the number of cores used to 1.

@hannaboe
Author

hannaboe commented Apr 7, 2020

I set num_cores = 1 but it still uses all 72 cores.

@mikeyEcology
Owner

It could be that your HPC is treating these commands differently, because HPCs run a little differently than standard computers. Are you using a GPU? If so, how many?
Also, if you want to run non-interactively on your HPC, you could set print_cmd = TRUE, and then the function will provide a command you can submit as a job on your HPC; this is usually a good idea for longer runs. Using this, you could set up a job with a specific number of threads (cores), which would limit how many cores the function uses.
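If the scheduler cannot enforce a core limit for you, one possible workaround is to pin the process to a fixed set of cores at the operating-system level. The following is only a minimal sketch, assuming a Linux machine; `run_pinned` is a hypothetical helper (not part of this package), and `cmd` stands for whatever command you would otherwise launch yourself.

```python
import os
import subprocess


def run_pinned(cmd, n_cores, **run_kwargs):
    """Run cmd in a child process restricted to n_cores CPUs (Linux only).

    Hypothetical helper for illustration; it is not part of the package.
    """
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")

    # CPUs this process may currently use (a scheduler may already restrict them).
    allowed = sorted(os.sched_getaffinity(0))
    chosen = set(allowed[:n_cores])

    def pin():
        # Runs in the child right before the command starts; pid 0 means "this process".
        os.sched_setaffinity(0, chosen)

    # The child, and any threads or processes it spawns, inherits the affinity mask.
    return subprocess.run(cmd, preexec_fn=pin, check=True, **run_kwargs)


# Example (the command is a placeholder, replace it with your own):
# run_pinned(["Rscript", "my_training_script.R"], n_cores=20)
```

The program may still create as many threads as it likes, but the kernel will only schedule them on the chosen cores, so the rest of the machine stays free.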

@JoejynWan

Hi @mikeyEcology
I am facing a similar issue: I cannot limit the number of cores or switch to the GPU. I am running train() on a computer with an AMD Ryzen 7 3700X (8 cores, 16 threads) and an NVIDIA GeForce RTX 2080 Super.
I have tried combinations of num_cores = 1 or num_cores = 10 with num_gpus = 1 or num_gpus = 2. I have also tried both running it through R and setting print_cmd = T and submitting the job via the terminal. In all cases, train() still uses all 16 threads and runs on the CPU instead of the GPU.
Am I missing some inputs? Thank you!

@mikeyEcology
Owner

The issue here is which release of tensorflow is installed; @hannaboe, this will probably help with your problem as well. The installation of tensorflow is different if you are using a GPU, so when you install it, you should use pip install tensorflow-gpu==1.14. Running this will overwrite the installation that does not use the GPU.
More details can be found here. Note, from this link, that the version of tensorflow you install (in this example 1.14) will depend on the type of driver you have for your gpu.
I'll add an explanation about this to the readme file.
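To check whether the tensorflow installation that Python picks up can actually see a GPU, something like the sketch below may help. It is an illustration only: `tensorflow_gpu_report` is a hypothetical helper, and it assumes you run it with the same Python interpreter that the package is configured to use.

```python
import importlib.util


def tensorflow_gpu_report():
    """Summarise the installed tensorflow build and the GPUs it can see.

    Hypothetical helper for illustration; it is not part of the package.
    """
    if importlib.util.find_spec("tensorflow") is None:
        # tensorflow is not installed for this interpreter.
        return {"installed": False, "version": None, "cuda_build": None, "gpus": []}

    import tensorflow as tf
    from tensorflow.python.client import device_lib

    # Devices tensorflow can use; GPU entries only appear with a working GPU build.
    gpus = [d.name for d in device_lib.list_local_devices() if d.device_type == "GPU"]
    return {
        "installed": True,
        "version": tf.__version__,
        "cuda_build": tf.test.is_built_with_cuda(),
        "gpus": gpus,
    }


# Example:
# print(tensorflow_gpu_report())
```

If `cuda_build` is False, the CPU-only package is installed. If it is True but `gpus` is empty, the driver or CUDA libraries probably do not match the installed tensorflow version.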
