Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add byte limit implementation for cache #40

Open
wants to merge 24 commits into
base: dev/iavl_data_locality
Choose a base branch
from

Conversation

p0mvn
Copy link
Member

@p0mvn p0mvn commented Apr 2, 2022

Background

This PR introduces byte limit implementation for cache and switches the fast cache to this implementation. It introduces a small decrease in performance in exchange for modular and reusable design as well as the ability to keep track of the number of bytes used.

Deployed nodes with 50MB, 75MB, 100MB, and 150MB fast cache. The nodes are stable after running for a day, and I'm waiting to have more time to understand the RAM usage with this change better.

Implementation Details

Refactored cache module to have a general LRU cache implementation. Introduced node and byte limit decorators to wrap around LRU cache and provide a composable design.

osmosis-labs/osmosis#1187

Medium/goleveldb-100000-100-16-40/query-no-in-tree-guarantee-fast-4    4.32µs ± 6%    4.29µs ± 5%    ~     (p=0.768 n=5+10)
Medium/goleveldb-100000-100-16-40/query-no-in-tree-guarantee-slow-4    16.1µs ± 3%    16.9µs ± 1%  +4.87%  (p=0.002 n=5+8)
Medium/goleveldb-100000-100-16-40/query-hits-fast-4                     611ns ± 9%     606ns ±12%    ~     (p=0.679 n=5+10)
Medium/goleveldb-100000-100-16-40/query-hits-slow-4                    21.9µs ± 5%    22.6µs ± 6%    ~     (p=0.075 n=5+10)
Medium/goleveldb-100000-100-16-40/iteration-fast-4                     82.1ms ± 6%    78.4ms ± 8%    ~     (p=0.099 n=5+10)
Medium/goleveldb-100000-100-16-40/iteration-slow-4                      1.92s ± 1%     1.96s ± 3%  +2.18%  (p=0.020 n=4+9)
Medium/goleveldb-100000-100-16-40/update-4                              247µs ± 8%     251µs ±12%    ~     (p=0.768 n=5+10)
Medium/goleveldb-100000-100-16-40/block-4                              31.3ms ± 8%    31.6ms ± 4%    ~     (p=0.513 n=5+10)

name                                                                 old alloc/op   new alloc/op   delta
Medium/goleveldb-100000-100-16-40/query-no-in-tree-guarantee-fast-4      814B ± 0%      814B ± 0%    ~     (all equal)
Medium/goleveldb-100000-100-16-40/query-no-in-tree-guarantee-slow-4    1.40kB ± 1%    1.41kB ± 1%  +0.70%  (p=0.018 n=5+10)
Medium/goleveldb-100000-100-16-40/query-hits-fast-4                     0.00B          0.00B         ~     (all equal)
Medium/goleveldb-100000-100-16-40/query-hits-slow-4                    2.00kB ± 0%    1.99kB ± 1%    ~     (p=0.117 n=5+10)
Medium/goleveldb-100000-100-16-40/iteration-fast-4                     29.3MB ± 0%    29.3MB ± 0%    ~     (p=0.513 n=5+10)
Medium/goleveldb-100000-100-16-40/iteration-slow-4                      276MB ± 0%     276MB ± 0%    ~     (p=0.210 n=4+8)
Medium/goleveldb-100000-100-16-40/update-4                             46.5kB ± 3%    47.5kB ± 4%    ~     (p=0.129 n=5+10)
Medium/goleveldb-100000-100-16-40/block-4                              5.43MB ± 1%    5.65MB ± 8%    ~     (p=0.129 n=5+10)

name                                                                 old allocs/op  new allocs/op  delta
Medium/goleveldb-100000-100-16-40/query-no-in-tree-guarantee-fast-4      16.0 ± 0%      16.0 ± 0%    ~     (all equal)
Medium/goleveldb-100000-100-16-40/query-no-in-tree-guarantee-slow-4      24.0 ± 0%      24.4 ± 2%    ~     (p=0.308 n=5+10)
Medium/goleveldb-100000-100-16-40/query-hits-fast-4                      0.00           0.00         ~     (all equal)
Medium/goleveldb-100000-100-16-40/query-hits-slow-4                      34.0 ± 0%      34.0 ± 0%    ~     (all equal)
Medium/goleveldb-100000-100-16-40/iteration-fast-4                       523k ± 0%      523k ± 0%    ~     (p=0.354 n=5+10)
Medium/goleveldb-100000-100-16-40/iteration-slow-4                      4.71M ± 0%     4.71M ± 0%  +0.01%  (p=0.004 n=4+8)
Medium/goleveldb-100000-100-16-40/update-4                                486 ± 7%       516 ±11%    ~     (p=0.053 n=5+10)
Medium/goleveldb-100000-100-16-40/block-4                               60.4k ± 1%     62.0k ± 4%    ~     (p=0.099 n=5+10)
name                                      old time/op    new time/op    delta
Add/small_-_limit:_10K,_key_size_-_10b-4    1.09µs ± 6%    1.08µs ± 3%    ~     (p=0.937 n=5+5)
Add/med_-_limit:_100K,_key_size_20b-4       1.29µs ± 1%    1.31µs ± 0%  +1.37%  (p=0.024 n=5+5)
Add/large_-_limit:_1M,_key_size_30b-4       1.38µs ±10%    1.39µs ±11%    ~     (p=0.841 n=5+5)

name                                      old alloc/op   new alloc/op   delta
Add/small_-_limit:_10K,_key_size_-_10b-4      106B ± 1%      106B ± 0%    ~     (p=0.333 n=5+4)
Add/med_-_limit:_100K,_key_size_20b-4         127B ± 0%      127B ± 0%    ~     (all equal)
Add/large_-_limit:_1M,_key_size_30b-4         138B ± 1%      138B ± 1%    ~     (p=1.000 n=4+4)

name                                      old allocs/op  new allocs/op  delta
Add/small_-_limit:_10K,_key_size_-_10b-4      4.00 ± 0%      4.00 ± 0%    ~     (all equal)
Add/med_-_limit:_100K,_key_size_20b-4         4.00 ± 0%      4.00 ± 0%    ~     (all equal)
Add/large_-_limit:_1M,_key_size_30b-4         4.00 ± 0%      4.00 ± 0%    ~     (all equal)

@p0mvn p0mvn marked this pull request as ready for review April 4, 2022 05:07
fast_node_test.go Outdated Show resolved Hide resolved
@@ -53,6 +54,12 @@ func (fn *FastNode) GetKey() []byte {
return fn.key
}

func (fn *FastNode) GetFullSize() int {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a godoc describing the value returned here? It's not super clear to me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed

type lruCacheWithBytesLimit struct {
lruCache
bytesLimit int
curBytesEstimate int
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is it called estimate? Should we add a comment that it may potentially slightly undercount, but we deem it fine?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is called estimate because I did not find a straightforward way to test actual allocations.

I tried grabbing memory from runtime to test but it was slightly off. I think it's because it doesn't produce the memory needed for allocating "slice metadata" - length, capacity and pointer to data.

This estimate is based on the knowledge of how slices and strings are represented in memory in Go. That's why I named it estimate.

I'll add a comment about this

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh gotcha. Yeah being off by a small constant for go memory layout details (which aren't guaranteed to be preserved across versions) is totally fine!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment is added

Comment on lines +18 to +22
const (
LRU Type = 0
LRU_node_limit Type = 1
LRU_bytes_limit Type = 2
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thoughts on switching this to iota syntax? https://yourbasic.org/golang/iota/

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also are we using all three types?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are using the last 2 in iavl, and the first one is the abstract implementation of the last 2.

I made it so that it can't be initialized outside of cache package. However, it still needs to have its own GetType() method

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need the first one to be implemented? Perhaps we make a follow-up issue to delete it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohh the point is its needed, since we're doing things decorator style.

I guess I don't understand the type enum + decorator syntax combination. I don't think it needs to block the PR, but maybe we should make a follow-up issue about it

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The decorator pattern implies that LRU cache (the underlying implementation) can still be used on its own. We just don't use it in IAVL. Decorators are wrappers around the main abstraction to provide an additional layer of functionality. In our case, this functionality is limiting the cache.

We have three types implemented:

type Type int

const (
	LRU             Type = 0
	LRU_node_limit  Type = 1
	LRU_bytes_limit Type = 2
)

Only the last 2 are used in IAVL directly. However, this is composable by design - if we want the limit to be removed in the future - we just swap the cache for the regular LRU type

Let me know if this makes sense

nodedb.go Outdated
@@ -29,7 +29,7 @@ const (
// Using semantic versioning: https://semver.org/
defaultStorageVersionValue = "1.0.0"
fastStorageVersionValue = "1.1.0"
fastNodeCacheLimit = 100000
fastNodeCacheLimit = 100 * 1024 * 1024
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should rename this to fastNodeCacheBytesLimit right?

@faddat
Copy link
Member

faddat commented Apr 18, 2022

I am working on speedruns for db comparisons. So I am going to be a teensy bit bold and backport this now. I will report back on if it successfully resolves the issue that I am hitting with rocksdb (basically v6 becomes extremely ram hungry and I've not been able to do anything to change that.)

Previous state: Nodes would reach 128gb and be killed by the oom reaper

New sate: unknown

Speaking unscientifically, the update seems much faster.

@faddat faddat mentioned this pull request Apr 18, 2022
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants