-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Treat position bias via GAM in LambdaMART #5929
Conversation
@microsoft-github-policy-service agree company="Microsoft" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few comments.
The static check CI job is failing. And there are 5 linter errors to be fixed from the log. Thanks.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two minor revisions.
/gha run r-valgrind Workflow R valgrind tests has been triggered! 🚀 Status: success ✔️. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright I've quickly skimmed the papers linked the documentation, and think I finally understand now how this is working.
I'm ok with removing the parameter to provide a customized filename... thanks @shiyu1994 for the explanation in #5929 (comment).
I'm comfortable with the changes to LightGBM's Python and C API (other than a minor comment about parameter naming) and I'd be happy to implement the R side of this after this PR is merged. Can one of you please create a new issue at https://github.com/microsoft/LightGBM/issues to track that work?
Please see my most recent round of comments. I still feel that:
- the tests as currently written don't provide very strong evidence of this implementation's correctness
- the documentation is unclear about the expected content of
positions
I'll watch this PR closely this week and try to respond quickly as you comment and ask for new reviews. I understand there's some urgency inside Microsoft on getting this merged. But please note that I'll be traveling Tuesday-Thursday with limited available time here on GitHub.
/gha run r-valgrind Workflow R valgrind tests has been triggered! 🚀 Status: success ✔️. |
@jameslamb, thank you for your effort to learn more about the problem and the proposed solution! And thank you for offering your help in adding the R language support for it. Here is the issue I created for that: #6063. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks so much for accepting my proposed rewording for the documentation, for the effort you put into the excellent new tests, and for documenting the R addition in #6063.
Please fix the compilation errors (looks like there are a few more places where lambdarank_position_bias_regularizer
needs to be changed to lambdarank_position_bias_regularization
). After that, I'll run the valgrind checks one more time.
Besides that, I have no other major comments left that are worth blocking merging. There are some minor code things I'd like to change in the tests, but I'll propose that in a follow-up PR after this.
Now that I understand it (thanks for your patience with me 😅 ) I'm very excited to see this feature make it into LightGBM! It's been 2.5 years since @robhowley first introduced #4531. Thanks to you and @shiyu1994 for all your efforts in making this happen.
I also want to tag @thvasilo, who was very involved in the discussion in #4531, just so they're aware.
gbm_unbiased_with_file = lgb.train(params, lgb_train, valid_sets = lgb_valid, num_boost_round=50) | ||
|
||
# the performance of the unbiased LambdaMART should outperform the plain LambdaMART on the dataset with position bias | ||
assert gbm_baseline.best_score['valid_0']['ndcg@3'] + 0.03 <= gbm_unbiased_with_file.best_score['valid_0']['ndcg@3'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent, thank you for this! I think we'll be glad to have this stricter test.
Co-authored-by: James Lamb <[email protected]>
/gha run r-valgrind Workflow R valgrind tests has been triggered! 🚀 Status: success ✔️. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of my remaining comments have been addressed by @shiyu1994 's recent comments, so marking this "approved".
Please do wait to see if the valgrind test passes one more time before merging though: #5929 (comment)
Thank you both for all the hard work that went into this addition to LightGBM!
@jameslamb, thank you again for your diligent reviews and prioritizing the quality of LightGBM! Many thanks to @shiyu1994 for your help and support! I'm looking forward to future collaborations with you guys! |
This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this. |
Add support to counter position bias present in relevance labels for learning-to-rank task. Position bias is modeled as an additive factor in logistic space within Generalized Additive Model (GAM) framework.