Object Store Caching layer #2776

Open
ion-elgreco opened this issue Aug 15, 2024 · 2 comments
Labels
binding/rust Issues for the Rust crate enhancement New feature or request help wanted Extra attention is needed

Comments

Collaborator

ion-elgreco commented Aug 15, 2024

Description

Use Case
In the worst case we currently read both the log and checkpoint files in full twice. We could instead cache the first round of reads, which would reduce the number of files fetched from the object store in the second round.
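To make the idea concrete, here is a minimal sketch of a read-through cache wrapped around a store. Everything in it is hypothetical: the `Store` trait, `CountingStore`, and `CachingStore` are illustration-only names, not the `object_store` crate's API (which is async and much richer). It also assumes that cached objects never change after they are written, and it does no eviction.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;

/// Hypothetical minimal read interface (an assumption for this sketch,
/// not the real `object_store` trait).
pub trait Store {
    fn get(&self, path: &str) -> Option<Vec<u8>>;
}

/// In-memory backend that counts reads, standing in for a remote object store.
pub struct CountingStore {
    objects: HashMap<String, Vec<u8>>,
    reads: AtomicUsize,
}

impl CountingStore {
    pub fn new(objects: HashMap<String, Vec<u8>>) -> Self {
        Self { objects, reads: AtomicUsize::new(0) }
    }

    /// Number of `get` calls that reached this backend.
    pub fn reads(&self) -> usize {
        self.reads.load(Ordering::SeqCst)
    }
}

impl Store for CountingStore {
    fn get(&self, path: &str) -> Option<Vec<u8>> {
        self.reads.fetch_add(1, Ordering::SeqCst);
        self.objects.get(path).cloned()
    }
}

/// Wrapper that remembers the bytes of every object it has fetched.
/// Misses (`None`) are not cached, and nothing is ever evicted.
pub struct CachingStore<S: Store> {
    inner: S,
    cache: Mutex<HashMap<String, Vec<u8>>>,
}

impl<S: Store> CachingStore<S> {
    pub fn new(inner: S) -> Self {
        Self { inner, cache: Mutex::new(HashMap::new()) }
    }

    pub fn inner(&self) -> &S {
        &self.inner
    }
}

impl<S: Store> Store for CachingStore<S> {
    fn get(&self, path: &str) -> Option<Vec<u8>> {
        // Serve from the cache when this path was fetched before.
        if let Some(hit) = self.cache.lock().unwrap().get(path) {
            return Some(hit.clone());
        }
        // Otherwise fetch once from the backend and remember the bytes.
        let bytes = self.inner.get(path)?;
        self.cache
            .lock()
            .unwrap()
            .insert(path.to_string(), bytes.clone());
        Some(bytes)
    }
}
```

With this shape, a second pass over the same log or checkpoint path is served from memory and the backend's read counter stays unchanged. A real layer would also need a size bound and a decision about which files are safe to cache.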

Related Issue(s)
See discussion here #2760

ion-elgreco added the "enhancement" (New feature or request) label Aug 15, 2024
ion-elgreco changed the title from "Add caching mechanism for log/checkpoint streame" to "Add caching mechanism for log/checkpoint stream" Aug 15, 2024
rtyler added the "binding/rust" (Issues for the Rust crate) label Aug 15, 2024
ion-elgreco changed the title from "Add caching mechanism for log/checkpoint stream" to "Object Store Caching layer" Feb 22, 2025
ion-elgreco added the "help wanted" (Extra attention is needed) label Feb 22, 2025

scovich commented Feb 25, 2025

General I/O caching support is the engine's responsibility, not kernel's. Kernel just needs to stay out of the way so that an engine can implement such caching.

To that end, it would be more helpful to identify specific ways kernel blocks or impedes the engine from doing the kinds of caching it would like to do.

For example, the JsonHandler and ParquetHandler traits provided by the engine should make ideal hook points for caching the results of file reads. The engine currently performs file writes on its own, without kernel involvement, and so it would be up to engine to introduce appropriate caching there as it sees fit.

On the other hand, I know @roeap has been considering higher level caching approaches in delta-rs that would capture the result of log replay rather than going back to the underlying individual files at all. As we identify such optimization opportunities, kernel APIs may need to adapt if the engine currently lacks appropriate hook points.
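As a rough illustration of what caching the result of log replay could look like, the sketch below keys the replayed state by table version and only runs the replay on a miss. The names `ReplayedState`, `ReplayCache`, and `get_or_replay` are hypothetical and are not taken from delta-rs or delta-kernel-rs; the sketch assumes that the state for a given version never changes once computed.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the state produced by log replay.
#[derive(Debug, PartialEq)]
pub struct ReplayedState {
    pub version: u64,
    pub active_files: Vec<String>,
}

/// Caches replayed state per table version so that repeated loads of the
/// same version do not go back to the individual log files.
#[derive(Default)]
pub struct ReplayCache {
    states: Mutex<HashMap<u64, Arc<ReplayedState>>>,
}

impl ReplayCache {
    /// Returns the cached state for `version`, calling `replay` only on a miss.
    /// The lock is held while `replay` runs, which keeps the sketch simple
    /// but serializes concurrent replays.
    pub fn get_or_replay<F>(&self, version: u64, replay: F) -> Arc<ReplayedState>
    where
        F: FnOnce() -> ReplayedState,
    {
        let mut states = self.states.lock().unwrap();
        let state = states
            .entry(version)
            .or_insert_with(|| Arc::new(replay()));
        Arc::clone(state)
    }
}
```

Callers share the cached state through an `Arc`, so a hit costs a reference-count bump rather than a copy. Eviction of old versions is left out here.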


scovich commented Feb 25, 2025

Heh. I missed that this was a delta-rs issue, not delta-kernel-rs!

That said, if this work does identify gaps in kernel APIs, please do file enhancement issues against delta-kernel-rs.
