[AutoTVM] New rank-binary loss_type for the new xgboost >= 2.0.0 behaviour #14468
Given that we are building a cost model, I am not sure binarization is the best approach here. Can you dump the labels and check the currently assigned behavior? We likely want to move away from the MAP metric and use another metric instead, either a regression metric or pairwise ranking.
Another quick idea for now is to add condition of binarization to xgboost >= |
@tqchen, in addition to my response to the request in the previous message:
|
I think in this case we should change the ranking loss to a regression loss and use logistic regression so the values can still be used. Binarization causes too much information loss.
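To make the information-loss point concrete, here is a minimal sketch. The scores, the 0.5 threshold, and the `binarize` helper are all hypothetical and are not taken from the PR; they only illustrate how thresholding collapses a continuous target.

```python
def binarize(scores, threshold=0.5):
    """Map each score to 1 if it exceeds the threshold, else 0 (hypothetical helper)."""
    return [1 if s > threshold else 0 for s in scores]


# Hypothetical normalized throughput scores for five candidate schedules.
scores = [0.95, 0.90, 0.55, 0.45, 0.10]
labels = binarize(scores)

# Five distinct scores collapse into two label values, so the ordering
# among 0.95, 0.90 and 0.55 (and between 0.45 and 0.10) is no longer visible.
distinct_before = len(set(scores))
distinct_after = len(set(labels))
print(labels, distinct_before, distinct_after)
```

A regression objective would see all five values, whereas a model trained on the binarized labels can only learn which side of the threshold a candidate falls on.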
@tqchen ,
Updates:
I updated this PR code to do binarization only in case:
I updated the code, the title, and the first comment here (striking out any erroneous info).
I see. I think we should report an error if binarization is needed, since the original intention was continuous prediction. I know it might still work OK, but that was not the intention of the cost predictor. It would be good to revisit the default choice; if ranking is not possible, reg:logistic would usually be another good choice.
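One way to read "report an error if binarization is needed" is a guard that refuses non-binary labels instead of silently thresholding them. The sketch below is an assumption about how such a check could look; the function name and the error message are hypothetical and do not come from TVM or xgboost.

```python
def require_binary_labels(labels):
    """Raise ValueError if any label is not 0 or 1 (hypothetical guard)."""
    bad = [y for y in labels if y not in (0, 1)]
    if bad:
        raise ValueError(
            "labels must be binary for this loss; "
            f"found {len(bad)} non-binary value(s), e.g. {bad[0]!r}"
        )


# Binary labels pass silently.
require_binary_labels([0, 1, 1, 0])

# Continuous labels are rejected rather than thresholded behind the user's back.
try:
    require_binary_labels([0.2, 0.9, 1.0])
except ValueError as err:
    print("rejected:", err)
```

Failing loudly keeps the choice between a ranking loss and a regression loss explicit for the user.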
@tqchen ,
Let me know if it still needs changes or more polishing; I will stop here for now.
Thanks @cbalint13 ! we still need to make sure the default |
@tqchen ,
There are still tutorials / applications that use an explicit "rank"; would you like to change all of them?
@cbalint13 Yes, let us update and change all of them to reg. cc @junrushao to double-check the cases in MetaSchedule. It might be useful to use reg:logistic if the output is scaled into [0, 1].
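A minimal sketch of the scaling idea, assuming plain min-max normalization (the thread does not say which scaling is used). `scale_to_unit_interval` is a hypothetical helper. The `params` dictionary only shows the standard xgboost `objective` key, which expects labels in [0, 1] for `reg:logistic`; it is not the configuration from this PR.

```python
def scale_to_unit_interval(values):
    """Min-max scale values into [0, 1]; constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical raw throughput measurements.
raw = [2.0, 4.0, 6.0]
targets = scale_to_unit_interval(raw)

# Objective name as documented by xgboost; labels must lie in [0, 1].
params = {"objective": "reg:logistic"}
print(targets, params)
```

Unlike binarization, this keeps the relative spacing between measurements, so the predicted values stay usable as continuous scores.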
@tqchen ,
My thought on the newly introduced
I would leave it as described; if you would like, we can create another one
In continuation of previous comment, I also attach here some test result. Comparative test confirms that
Note:
|
@tvm-bot rerun |
Signed-off-by: Balint Cristian <[email protected]>
This PR fixes handling of the latest xgboost >= 2.0.0 behaviour, which requires binarized labels.
This addresses only autotune (not autoscheduler).
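The version condition discussed in this PR can be sketched as below. This is an illustration under stated assumptions, not the code merged here: `needs_binary_labels` is a hypothetical name, the 2.0.0 threshold is taken from the PR title, and the version string would in practice come from `xgboost.__version__`.

```python
import re


def needs_binary_labels(xgb_version, loss_type):
    """Return True when the rank loss runs on xgboost >= 2.0.0 (hypothetical helper)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", xgb_version)
    if match is None:
        raise ValueError(f"unrecognized version string: {xgb_version!r}")
    version = tuple(int(part) for part in match.groups())
    # Only the rank loss is affected; the reg loss works on any version.
    return loss_type == "rank" and version >= (2, 0, 0)


print(needs_binary_labels("2.0.0", "rank"))  # rank loss on new xgboost
print(needs_binary_labels("1.7.6", "rank"))  # rank loss on old xgboost
print(needs_binary_labels("2.0.0", "reg"))   # reg loss is unaffected
```

Comparing parsed tuples instead of raw strings avoids lexicographic mistakes such as "10.0.0" sorting before "2.0.0".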
Note:
I am unsure about the overall impact on the TVM tuner. We could introduce a more sophisticated way of measuring AP, such as the PASCAL evenly spaced one, but the advantages are unclear and would require extensive comparative tests.
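For reference, here is a small sketch comparing plain average precision with an evenly spaced variant, assuming "PASCAL evenly spaced" refers to the 11-point interpolated AP from PASCAL VOC. Both function names and the example ranking are hypothetical; neither function is part of TVM or xgboost.

```python
def average_precision(relevance):
    """Mean of precision@k over the ranks k that hold a relevant item."""
    total_pos = sum(relevance)
    if total_pos == 0:
        return 0.0
    hits, acc = 0, 0.0
    for k, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            acc += hits / k
    return acc / total_pos


def eleven_point_ap(relevance):
    """Mean of the best precision at recall >= t for t = 0.0, 0.1, ..., 1.0."""
    total_pos = sum(relevance)
    if total_pos == 0:
        return 0.0
    points, hits = [], 0
    for k, rel in enumerate(relevance, start=1):
        hits += rel
        points.append((hits / total_pos, hits / k))  # (recall, precision)
    total = 0.0
    for i in range(11):
        t = i / 10
        total += max((p for r, p in points if r >= t), default=0.0)
    return total / 11


# Hypothetical ranked list: 1 marks a relevant (binarized) item.
ranking = [1, 0, 1, 1, 0]
print(average_precision(ranking), eleven_point_ap(ranking))
```

Both variants still require binary relevance labels, so switching between them would not by itself remove the need for binarization.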
The errors caught in the TVM autotune process:
Cc @Sunny-Island , @zxybazh , @junrushao , @vinx13 , please help with the review.
Thanks,
~Cristian.
Update:
reg (reg:linear) loss_type is fine.
rank (rank:pairwise) loss_type is affected.