Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[c++] Fix memory leak in Arrow table implementation #6314

Merged
merged 3 commits into from
Feb 22, 2024

Conversation

borchero
Copy link
Collaborator

Motivation

When trying to integrate LightGBM v4.2 into our codebase, we noticed an unexpected increase in memory consumption when using LightGBM's new Arrow interface under certain circumstances.

It turns out that the Arrow implementation causes memory leaks as it does not properly release the Arrow arrays being passed to it. Unfortunately, this issue only really surfaces if the data being passed to Arrow is not in memory during the entire program execution (which is frequently the case) and the leakage really only becomes an issue if different data is passed to the LightGBM Arrow interface multiple times (this is where we discovered it).

The fix is simple and the reasoning behind it is outlined clearly in the Arrow C Data interface specification.


Huge shoutout to @delsner who helped me debug this memory leak and helped me learn a lot more about Arrow, allocators and memory debugging 😁


Sorry for not already including this in #6034 and its sibling PRs... I still had some gaps in my understanding of the Arrow C Data interface 👀

@borchero
Copy link
Collaborator Author

FYI, if anyone has a good idea how to write good tests for this, please let me know 👀 once this is merged, we should probably also cut a new release soonish to allow people to confidently use the Arrow interface.

~ArrowTable() {
// As consumer of the Arrow array, the Arrow table must release all Arrow arrays it receives
// as well as the schema. As per the specification, children arrays are released by the
// producer. See: https://arrow.apache.org/docs/format/CDataInterface.html#release-callback-semantics-for-consumers
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for including this link, super helpful!

Copy link
Collaborator

@jameslamb jameslamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I'll defer to @guolinke or @shiyu1994 for a review on this.

Just want to add that if/when we add support for Arrow data via the R-package here, we should be able to catch such things through the valgrind checks we run on the R package: https://github.com/microsoft/LightGBM/tree/master/R-package#valgrind

@jameslamb jameslamb added the fix label Feb 14, 2024
@borchero
Copy link
Collaborator Author

Just want to add that if/when we add support for Arrow data via the R-package here

Would be great! I myself, however, have close to no experience with R, unfortunately 😅

Copy link
Collaborator

@shiyu1994 shiyu1994 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the excellent fix.

@shiyu1994
Copy link
Collaborator

I've approved this PR. Could you update the branch to sync with master so that it can be merged?

@borchero
Copy link
Collaborator Author

@jameslamb can you tell me how to retry jobs run on Azure? I can't seem to find a button that allows me to do this.

@jameslamb
Copy link
Collaborator

@jameslamb can you tell me how to retry jobs run on Azure? I can't seem to find a button that allows me to do this.

It's a Microsoft-controlled account so I suspect @shiyu1994 would have to add explicitly add you. I believe @guolinke did that for me a few years ago while still at Microsoft. Message @shiyu1994 in the maintainer slack and hopefully he can get you set up.

Until then, @ me any time and I can re-run jobs for you. I'll re-run the ones from the PR in a few minutes.

@shiyu1994
Copy link
Collaborator

shiyu1994 commented Feb 22, 2024

@borchero Could you register an Azure DevOps account with the same email as your Github? Then, I'll add you to the Azure DevOps Team for our CI jobs.

@shiyu1994 shiyu1994 merged commit 8b61a15 into microsoft:master Feb 22, 2024
43 checks passed
@borchero
Copy link
Collaborator Author

Done @shiyu1994, thanks a lot! :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants