Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Key positioning clarification #27

Closed
delaneyj opened this issue Jan 21, 2025 · 7 comments
Closed

Key positioning clarification #27

delaneyj opened this issue Jan 21, 2025 · 7 comments
Assignees
Labels
documentation Improvements or additions to documentation question Further information is requested

Comments

@delaneyj
Copy link
Contributor

Since it's not documentation, when a key is retrieved is it rewritten to the memtable (so oldest keys live in the lowest level)? Basically does it act as an lru or do key live at the level they were original written?

@guycipher
Copy link
Member

L0 - Memtable (latest version if key resides)
L1+ - if not found in memtable we check from first level (where we flush) to last. Where ever we find that key we stop. In regards to range and filters we don't supersede if existent in result set.

@guycipher
Copy link
Member

guycipher commented Jan 21, 2025

On compactions older keys living in older levels are merged with new levels. Only the latest version will live on, if tombstoned, it is deleted.

@guycipher
Copy link
Member

Should probably write this up on documentation somewhere and on website with some nice images! For sure :) I will do that when I get some time after work, most certainly. I appreciate the questions @delaneyj

@guycipher guycipher self-assigned this Jan 21, 2025
@guycipher guycipher added documentation Improvements or additions to documentation help wanted Extra attention is needed labels Jan 21, 2025
@guycipher
Copy link
Member

This phenomenon is called write amplification. When a key lives in multiple levels. Over time compactions try to rid this.

@delaneyj
Copy link
Contributor Author

I grok the write amplification, just wanted to see if once a key is written it's not going to percolate back up with retrievals. If you have iterators #26 it's less of a concern.

@guycipher
Copy link
Member

Understood. Yeah I get it. Equality operations will be quicker because the bloom can identify if key sits in an sstable klog. Doesn't have to scan a bunch of times. in regards to iterator functions, yeah you could use a min and max key per sstable but then you'd have to store those somewhere and what if keys are large? So that's the thing with that, memory efficiency. There are other types of filters that allow searching ranges as well. Could be looked at. https://db.cs.cmu.edu/papers/2019/20_srf-zhang.pdf

@guycipher guycipher added question Further information is requested and removed help wanted Extra attention is needed labels Jan 21, 2025
guycipher added a commit that referenced this issue Jan 22, 2025
update read me regarding issue #27 and issue #30
@guycipher
Copy link
Member

Added info on read me.
#38

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants