-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[python-package] respect parameter aliases for network params #3813
Conversation
Co-authored-by: Nikita Titov <[email protected]>
d43b27f
to
720d6e0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for improving alias handling! Please check my comments:
@@ -352,6 +354,46 @@ def get(cls, *args): | |||
return ret | |||
|
|||
|
|||
def _choose_param_value(main_param_name: str, params: Dict[str, Any], default_value: Any) -> Dict[str, Any]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH, I don't like _choose_param_value
name very much, but don't have alternatives to suggest. Maybe you have?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this name very much and think it's descriptive of what is happening. Given a dictionary of parameters, it chooses a single value to use for a single parameter.
python-package/lightgbm/basic.py
Outdated
local_listen_port=params["local_listen_port"], | ||
listen_time_out=params.get("listen_time_out", 120), | ||
num_machines=params["num_machines"] | ||
) | ||
break |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should remove all aliases of machines
from params
. This break
prevents doing this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, I proposed something for this in 63387db
I think the cleanest way to do it is to use _choose_param_value()
to get a single value for machines
. Then we don't need a separate for
loop with a break
.
expected_params = { | ||
"local_listen_port": 1234, | ||
"port": 2222, | ||
"metric": "auc", | ||
"num_trees": 81 | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think expected_params
can be gotten by copying original_params
at the very beginning of the test to not re-type every value again here.
expected_params = { | |
"local_listen_port": 1234, | |
"port": 2222, | |
"metric": "auc", | |
"num_trees": 81 | |
} | |
expected_params = copy(original_params) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copy()
performs a shallow clone, so some modifications to original_params
can still impact the copy. https://docs.python.org/3/library/copy.html#copy.copy
I think it's preferable for this test to re-type the literal values, so that the test doesn't get a false positive because of subtle mistakes where we assumed something was a copy but actually was a reference
Co-authored-by: Nikita Titov <[email protected]>
…o fix/param-aliases
Co-authored-by: Nikita Titov <[email protected]>
63387db
to
7afd37b
Compare
@jameslamb |
done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jameslamb Thanks a lot for cleaning up aliases handling logic!
LGTM except one typo that you've already mentioned earlier
replaces mistaken parameter name "listen_time_out" with "time_out"
Co-authored-by: Nikita Titov <[email protected]>
This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this. |
Based on #3789 (review). This PR proposes some changes in the
Booster
constructor to respect aliases for network parameters.This PR also has the following small changes
.pytest_cache/
directories (I found these were created but not ignored while I was testing)Notes for Reviewers
time_out
does not have any aliases (https://lightgbm.readthedocs.io/en/latest/Parameters.html#time_out)