Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[R-package] Add missing prediction functions to R interface #4982

Closed
wants to merge 8 commits into from

Conversation

david-cortes
Copy link
Contributor

Fixes #4966

This PR adds R wrappers over important C-level functions for predictions which are currently missing from the R interface - particularly, functions dealing with single-row predictions and CSR data.

After issue #4980 is sorted out, the documentation for lgb.configure_fast_predict needs to be updated to reflect the details about it.

@david-cortes
Copy link
Contributor Author

The sanitizer issues are not from this PR and should be fixed in this other PR: #4984

@david-cortes david-cortes requested a review from jmoralez as a code owner March 23, 2022 21:25
Copy link
Collaborator

@jameslamb jameslamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking this on!

I believe this PR has two large and unrelated sets of changes (supporting single-row predictions and supporting predictions on more sparse array types).

Please break off the "support predicting on more sparse array types" changes into a separate PR to reduce the effort required to review these changes, or explain to us why you feel these two sets of changes need to be made together.

@david-cortes
Copy link
Contributor Author

If it were to be done as separate PRs in parallel, there would be large merge conflicts, and if it were to be done in sequential PRs, the first one would need slightly different code to leave sometghing usable, which the second would then have to undo to arrive at the current code.

It could be done in two PRs, but it's more work, and they both would look similar and touch the same parts of the prediction function.

@jameslamb
Copy link
Collaborator

if it were to be done in sequential PRs, the first one would need slightly different code to leave something usable, which the second would then have to undo to arrive at the current code.

it's more work

Reviewing a PR this large is, in my experience, more work for reviewers than the sum of reviewing smaller changesets that add up to this PR's changes. Maintainers in this project regularly context-switch between many different issues and PRs (here and in other open source projects), and PRs that are this large require a substantial amount of dedicated time to re-gather the right context to be able to thoroughly review changes.

Looking at the diff tonight, it seems to me that it should be possible to make these changes in the following sequence without any merge conflicts or half-solutions:

  1. support predcontrib predictions for all sparse types
  2. support non-predcontrib predictions for all sparse types (using only the multi-row prediction interfaces, e.g. LGBM_BoosterPredictSparseOutput_R)
  3. support fast single-row predictions (e.g. LGBM_BoosterPredictForMatSingleRow) predictions for all types supported by predict()

If you're willing to do that, we'd appreciate it. If you're still not open to that after reading the suggestion above, then we can leave this as one PR but I expect there will be a much longer review cycle.

@david-cortes
Copy link
Contributor Author

Let's break it into 4-5 PRs then.

@jameslamb
Copy link
Collaborator

Great, thank you very much!

@jameslamb
Copy link
Collaborator

I believe this can be closed, based on the description of #5312.

This PR adds the remainder of prediction functions from PR #4982... The idea in #4982 was to split the additions into multiple PRs, but ... this PR just adds the rest of C prediction functions that are not in the R interface together.

@david-cortes correct me if I'm wrong about that and this can be re-opened.

Copy link

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 15, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[R-package] Cannot pass CSR single-row for prediction
2 participants