In our current system, an index cache miss triggers loading the entire index file from S3, regardless of how much data the query actually needs. This is very inefficient when only a small portion of the index file is required.
In some cases, only a small fraction of the index data is accessed. Despite this, the system downloads the entire 200 MiB index file. The download time dominates the query's total execution time, leading to degraded performance.
Implementation challenges
No response
Currently, the caches in mito can be divided into two kinds:
Cache for relatively small structs like metadata
SstMetaCache, VectorCache, metadata in InvertedIndexCache
Cache for some large content
PageCache, contents in InvertedIndexCache
For the first, we could keep the current cache management strategy, in which the caller is responsible for fetching values and putting them into the cache. For extensibility, we can provide a register method and a getter/setter pair.
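A minimal sketch of what a register method plus a getter/setter pair could look like for the first kind of cache. Everything here is an assumption for illustration: `CacheRegistry`, `register`, `get`, `put` and the cache name `"sst_meta"` are hypothetical and are not mito's actual API. The caller still fetches the value itself and stores it, as described above.

```rust
use std::any::Any;
use std::collections::HashMap;
use std::hash::Hash;
use std::sync::{Arc, RwLock};

/// One named cache: a lock-protected map from key to shared value.
type Store<K, V> = RwLock<HashMap<K, Arc<V>>>;

/// Hypothetical registry of small, caller-managed caches (not mito's real type).
#[derive(Default)]
pub struct CacheRegistry {
    // Each entry is a type-erased `Store<K, V>`.
    caches: HashMap<String, Box<dyn Any + Send + Sync>>,
}

impl CacheRegistry {
    /// Registers an empty cache under `name`. Returns false if the name is taken.
    pub fn register<K, V>(&mut self, name: &str) -> bool
    where
        K: Eq + Hash + Send + Sync + 'static,
        V: Send + Sync + 'static,
    {
        if self.caches.contains_key(name) {
            return false;
        }
        let store: Store<K, V> = RwLock::new(HashMap::new());
        self.caches.insert(name.to_string(), Box::new(store));
        true
    }

    /// Looks up the cache registered under `name` with matching key/value types.
    fn store<K, V>(&self, name: &str) -> Option<&Store<K, V>>
    where
        K: Eq + Hash + Send + Sync + 'static,
        V: Send + Sync + 'static,
    {
        self.caches.get(name)?.downcast_ref::<Store<K, V>>()
    }

    /// Getter: returns the cached value, or None on a miss or unknown cache.
    pub fn get<K, V>(&self, name: &str, key: &K) -> Option<Arc<V>>
    where
        K: Eq + Hash + Send + Sync + 'static,
        V: Send + Sync + 'static,
    {
        let store = self.store::<K, V>(name)?;
        let guard = store.read().ok()?;
        guard.get(key).cloned()
    }

    /// Setter: the caller has already fetched `value` and stores it here.
    /// Returns false if no cache with that name and those types exists.
    pub fn put<K, V>(&self, name: &str, key: K, value: Arc<V>) -> bool
    where
        K: Eq + Hash + Send + Sync + 'static,
        V: Send + Sync + 'static,
    {
        match self.store::<K, V>(name) {
            Some(store) => match store.write() {
                Ok(mut guard) => {
                    guard.insert(key, value);
                    true
                }
                Err(_) => false,
            },
            None => false,
        }
    }
}
```

A caller would register a cache once, for example `registry.register::<u64, String>("sst_meta")`, and on a miss fetch the value itself and call `put`. Eviction and size accounting are left out of this sketch.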
For the second, we could provide a page-based cache strategy. Here the caller simply gives an offset and a size, and the cache manager is responsible for fetching the values and populating the cache. Similar to #5114.
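The page-based strategy can be sketched roughly as follows. This is an illustration under stated assumptions, not the implementation from #5114: `RangeReader`, `PagedCache`, `MemReader` and the page size used are all hypothetical names and values. The caller passes only an offset and a size; the cache works out which fixed-size pages cover that range, fetches the missing ones with ranged reads, and assembles the result.

```rust
use std::collections::HashMap;

/// Hypothetical abstraction over a ranged read (for example an object-store range GET).
pub trait RangeReader {
    /// Returns up to `len` bytes starting at `offset`; fewer at end of file.
    fn read_range(&mut self, offset: u64, len: u64) -> Vec<u8>;
}

/// Hypothetical page-granular cache in front of a `RangeReader`.
pub struct PagedCache<R> {
    reader: R,
    page_size: u64,
    pages: HashMap<u64, Vec<u8>>, // page index -> page bytes
}

impl<R: RangeReader> PagedCache<R> {
    pub fn new(reader: R, page_size: u64) -> Self {
        assert!(page_size > 0, "page size must be positive");
        Self { reader, page_size, pages: HashMap::new() }
    }

    pub fn reader(&self) -> &R {
        &self.reader
    }

    /// Returns the bytes in `[offset, offset + size)`, fetching only the
    /// pages that are not cached yet. Reads past end of file are truncated.
    pub fn get(&mut self, offset: u64, size: u64) -> Vec<u8> {
        let mut out = Vec::with_capacity(size as usize);
        if size == 0 {
            return out;
        }
        let end = offset + size;
        let first = offset / self.page_size;
        let last = (end - 1) / self.page_size;
        for page in first..=last {
            let page_start = page * self.page_size;
            if !self.pages.contains_key(&page) {
                // Cache miss: fetch exactly one page, not the whole file.
                let bytes = self.reader.read_range(page_start, self.page_size);
                self.pages.insert(page, bytes);
            }
            let data = &self.pages[&page];
            // Clamp the requested range to this page's actual contents.
            let lo = offset.saturating_sub(page_start).min(data.len() as u64) as usize;
            let hi = (end - page_start).min(data.len() as u64) as usize;
            out.extend_from_slice(&data[lo..hi]);
        }
        out
    }
}

/// In-memory stand-in for remote storage that counts bytes transferred.
pub struct MemReader {
    pub data: Vec<u8>,
    pub bytes_fetched: u64,
}

impl RangeReader for MemReader {
    fn read_range(&mut self, offset: u64, len: u64) -> Vec<u8> {
        let start = (offset as usize).min(self.data.len());
        let end = (offset.saturating_add(len) as usize).min(self.data.len());
        self.bytes_fetched += (end - start) as u64;
        self.data[start..end].to_vec()
    }
}
```

With this shape, a query that touches a few kilobytes of a 200 MiB index file transfers only the pages covering those bytes, and repeated reads of the same region are served from memory. Eviction, concurrency and coalescing of adjacent page fetches are deliberately omitted.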
WenyXu changed the title from "Optimize Index cache to avoid full Index file loading on cache miss" to "Optimize Index cache to avoid inverted Index file loading on cache miss" on Dec 10, 2024
The cache granularity of the inverted index has been changed to fixed-size pages in #5114, and several optimizations have been introduced in #5145, #5146, #5147 and #5148.
I think we can close the issue for now? cc @WenyXu
What type of enhancement is this?
Performance