[FEA] Support unspill for SpillableHostBuffer #12184

Open
binmahone opened this issue Feb 20, 2025 · 0 comments · May be fixed by #12186
Labels: ? - Needs Triage (Need team to review and classify), feature request (New feature or request)

binmahone commented Feb 20, 2025

Is your feature request related to a problem? Please describe.

Currently, once a SpillableHostBuffer has been spilled from memory to disk, every subsequent call to SpillableHostBuffer#getHostBuffer reads and deserializes the data from disk again. This is very costly and is not acceptable in cases where getHostBuffer is called multiple times.

One example is the Kudo shuffle read concat case. Let's assume the KudoTables that have been read are made spillable (by wrapping the HostMemoryBuffer in a SpillableHostBuffer). When doing the kudo concat, we then have to call SpillableHostBuffer#getHostBuffer frequently and in random order, because kudo concat uses a random-read visitor to read all the input KudoTables. It's a performance nightmare if we have to read from disk every time.
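
To make the request concrete, here is a minimal sketch of the "unspill" behavior, assuming a simplified model where the buffer is a plain byte array and the spill target is a temp file. `UnspillableBuffer`, `spill`, and `diskReads` are hypothetical names for illustration only; this is not the spark-rapids `SpillableHostBuffer` implementation, and it ignores memory accounting and reference counting.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical stand-in for a spillable host buffer (illustration only). */
final class UnspillableBuffer {
    private byte[] inMemory;    // null while the data lives only on disk
    private Path spillFile;     // null until the first spill
    private int diskReads = 0;  // counts how often we had to go to disk

    UnspillableBuffer(byte[] data) {
        this.inMemory = data.clone();
    }

    /** Write the data to disk (once) and drop the in-memory copy. */
    synchronized void spill() {
        if (inMemory == null) {
            return; // already spilled
        }
        try {
            if (spillFile == null) {
                spillFile = Files.createTempFile("spill-", ".bin");
                spillFile.toFile().deleteOnExit();
                Files.write(spillFile, inMemory);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        inMemory = null;
    }

    /**
     * Return the data. If it was spilled, read it back once and keep it
     * in memory ("unspill") so later calls do not touch the disk again.
     */
    synchronized byte[] getHostBuffer() {
        if (inMemory == null) {
            try {
                inMemory = Files.readAllBytes(spillFile);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            diskReads++;
        }
        return inMemory;
    }

    synchronized int diskReads() {
        return diskReads;
    }
}
```

With this shape, calling `getHostBuffer()` many times after a single `spill()` costs one disk read instead of one per call. Because the spill file is kept, spilling again later only needs to drop the in-memory copy.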
