Training on GPU does not utilise GPU properly #836
Comments
Same thing here: most of the time GPU usage stays at 0%, with occasional spikes of ~10–20%. I thought my CPU (i5 2500K @ 4 GHz) was bottlenecking my GPU, but from your case I believe that is not it.
It is normal if your training data is small.
I am using a dataset that has ~14M rows and ~1000 sparse features; the memory footprint, as you can notice in the
@mhmoudr |
As I am writing this, I am running using
@mhmoudr Sparse feature processing has too much irregularity and is currently not accelerated on the GPU. Try to set
I had the same problem. Changing the num_threads option from cpu_cores (16 in my case) to and reverting to revision 6cc1dd9 solved the problem. BTW, it's a bug and should be fixed. UPD:
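A minimal sketch of the kind of config change described in the comment above, for the LightGBM CLI (the thread count shown is illustrative, not a recommendation; the comment's actual value is truncated in the thread):

```
# train.conf — lower num_threads below the physical core count
# so host-side threads do not contend while feeding the GPU
device = gpu
num_threads = 6
```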
The main issue is GPU utilisation: after building LightGBM for GPU as described in the docs and running it on a sample dataset while monitoring both CPU and GPU (please see the attached screenshot), the process moves the training dataset into GPU memory and nvidia-smi recognises it as a running process, yet during training GPU utilisation does not exceed 2–5% while the CPU appears fully utilised.
I am not sure whether this is a defect or an incomplete implementation.
Environment info
Operating System: Ubuntu 16
CPU: 2× Xeon (48 cores total)
C++ version: latest; training is run via the CLI process
Error Message:
N/A
Steps to reproduce
Compile for GPU using the provided docs.
Use the following config:
data = "/path/to/libsvm/file"
num_iterations = 3000
learning_rate = 0.01
max_depth = 12
device = gpu
gpu_platform_id = 0
gpu_device_id = 0
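For reference, a sketch of the same config with GPU tuning hints drawn from LightGBM's GPU tuning guidance; the max_bin and sparse_threshold values are assumptions from that guidance, not from this issue, and should be validated against the docs for the build in use:

```
data = "/path/to/libsvm/file"
num_iterations = 3000
learning_rate = 0.01
max_depth = 12
device = gpu
gpu_platform_id = 0
gpu_device_id = 0
# assumed tuning hints from the LightGBM GPU guide:
max_bin = 63            # smaller bin counts map better to GPU histogram kernels
sparse_threshold = 1.0  # treat features as dense on the GPU
```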