Non-deterministic even with "deterministic=True" "seed=0" and the same number of threads in LightGBM==3.1.1 #3761
Comments
@shiyu1994 can you help check this? It may be related to the bugs you fixed recently. |
@ZhangTP1996 The non-deterministic behavior comes from the col-wise and row-wise histogram construction strategies. A quick fix to get a deterministic behavior is to set However, ideally col-wise and row-wise should produce the same result; I'll continue to look into this. |
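For readers landing here: a minimal sketch of pinning the histogram construction strategy through training parameters. The exact setting named in the comment above did not survive on this page, so the choice of `force_row_wise` here is an assumption; `force_col_wise` is the other documented option, and you would pick one of the two. The thread count is also a placeholder.

```python
# Parameters from the issue title, plus one assumed strategy pin.
params = {
    "seed": 0,
    "deterministic": True,
    "num_threads": 4,        # placeholder: keep this identical across runs
    "force_row_wise": True,  # assumption: pin row-wise; use force_col_wise instead if preferred
}

# The dict would then be passed to training as usual, e.g.
#   import lightgbm as lgb
#   booster = lgb.train(params, train_set)
```

Pinning one strategy only removes the run-to-run choice between the two code paths; it does not make row-wise and col-wise agree with each other.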
Thanks for the rapid response. I will check this tomorrow. |
It seems that setting |
This problem is a numerical issue. Since row-wise and col-wise accumulate gradient values in different ways, when the values of gradients and hessians are quite small, the resulting histograms differ slightly. The following are the histogram values from the first tree where
With Considering the numerical problem, I think a quick solution to provide deterministic behavior is to force using |
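To illustrate the numerical point above, here is a toy sketch (not LightGBM's actual histogram code) showing that summing the same floating-point values in a different grouping can change the last bits of the result. The function names and the sample values are made up for the example.

```python
import math

def sum_sequential(values):
    # Accumulate left to right into a single running total.
    total = 0.0
    for v in values:
        total += v
    return total

def sum_two_parts(values, split):
    # Accumulate two partial sums separately, then combine them.
    return sum_sequential(values[:split]) + sum_sequential(values[split:])

gradients = [0.1, 0.2, 0.3]  # made-up values standing in for one histogram bin

one_pass = sum_sequential(gradients)    # (0.1 + 0.2) + 0.3
two_part = sum_two_parts(gradients, 1)  # 0.1 + (0.2 + 0.3)

print(one_pass, two_part)  # 0.6000000000000001 0.6
```

The two totals agree to within rounding error but are not bit-identical, which is enough to tip a split decision when candidate gains are nearly tied.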
Also, I noticed that during training, the training loss oscillates dramatically with the provided data (starting from iteration 38). Maybe we need to improve the numerical stability. |
Closed via #4027. |
This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this. |
LightGBM component:
Environment info
Operating System: Linux
CPU/GPU model: CPU
Python version: 3.7.3
LightGBM version or commit hash: 3.1.1, installed by pip
Error message and / or logs
Reproducible example(s)
data.zip
The data to reproduce is attached in the zip file. Please fill in the "path" and run the code.