feat!: enable read cache and write cache when using remote object stores #5093
Conversation
Walkthrough

The pull request introduces significant updates to configuration documentation and files for a system operating in standalone and distributed modes. Key changes include refined descriptions and default values for configuration parameters related to storage and caching, particularly for object storage. The modifications enhance clarity, address default cache settings, and improve error handling for directory creation. New methods for identifying object storage configurations have also been added, along with updates to integration tests to ensure comprehensive coverage of the new features.
Codecov Report

Attention: Patch coverage is

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main    #5093      +/-   ##
==========================================
- Coverage   84.03%   83.80%   -0.23%
==========================================
  Files        1168     1168
  Lines      217572   217620      +48
==========================================
- Hits       182841   182383     -458
- Misses      34731    35237     +506
```
@coderabbitai review

✅ Actions performed: review triggered.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (6)
src/datanode/src/store.rs (1)
Lines 149-168: Handle potential errors when creating cache directories.

While using `tokio::fs::create_dir_all` to create the cache directory, consider handling potential I/O errors, such as permission issues or existing paths that are not directories. Providing detailed error messages can aid in troubleshooting.

src/mito2/src/config.rs (1)

Line 97: Update the documentation for the default write cache path.

Clarifying that the default path is `{data_home}/object_cache/write` helps users understand where the write cache is stored by default.

config/datanode.example.toml (1)
Lines 297-304: LGTM! Consider documenting the rationale for the cache size.

The cache configuration changes look good. The documentation is clear, and the default paths follow a logical structure. The increased cache capacity from 1 GiB to 5 GiB should improve performance.

Consider adding a comment explaining why 5 GiB was chosen as the default cache capacity, which could help users better understand whether they need to adjust this value for their use case.
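A hedged sketch of what such an explanatory comment could look like in the example config. The key name follows the surrounding discussion; the section header and the rationale wording are illustrative, not taken from the actual file:

```toml
[region_engine.mito]
# 5GiB is a middle ground: large enough to absorb flush/compaction uploads
# and keep the hit rate high when reading from object storage, while small
# enough not to exhaust local disk. Reduce it on disk-constrained nodes.
experimental_write_cache_size = "5GiB"
```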
tests-integration/src/test_util.rs (1)
Lines 236-238: LGTM! Consider adding test cases for cache behavior.

The explicit cache path handling for different S3 storage configurations is clear and well documented.

Would you like me to help generate additional test cases to verify:
- Cache behavior when disabled (empty string)
- Cache behavior when enabled with a custom path
- Cache size limits and overflow scenarios
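The first two cases could be exercised through a small helper like the following sketch. The function is hypothetical, not the actual test_util API; it only mirrors the convention the review describes, where an empty cache-path string disables the cache and a non-empty one enables it at that location.

```rust
/// Hypothetical helper mirroring the review's first two cases:
/// an empty string disables the read cache, any other string
/// enables it at that path.
fn resolve_cache_path(configured: &str) -> Option<&str> {
    if configured.is_empty() {
        None // cache disabled
    } else {
        Some(configured) // cache enabled at the given path
    }
}
```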
src/datanode/src/datanode.rs (1)
Lines 402-407: Consider using WARN level for the configuration-change log.

While the code correctly enables the write cache for object storage, this is a significant configuration change that affects performance and resource usage. Consider using WARN instead of INFO to make the change more visible in logs, especially during the initial rollout period.

```diff
-info!("Configured 'enable_experimental_write_cache=true' for mito engine.");
+warn!("Configured 'enable_experimental_write_cache=true' for mito engine.");
```

src/mito2/src/worker.rs (1)
Lines 376-380: Consider validating the cache path before creation.

While the directory creation is implemented correctly, consider adding validation of the cache path to ensure it is absolute and has proper permissions. This can prevent issues in production environments with different permission models.

```diff
+// Validate cache path
+let cache_path = Path::new(&config.experimental_write_cache_path);
+ensure!(
+    cache_path.is_absolute(),
+    InvalidConfigSnafu {
+        reason: format!(
+            "Write cache path must be absolute: {}",
+            config.experimental_write_cache_path
+        )
+    }
+);
+
 tokio::fs::create_dir_all(Path::new(&config.experimental_write_cache_path))
     .await
     .context(CreateDirSnafu {
         dir: &config.experimental_write_cache_path,
     })?;
```
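The absoluteness check suggested above can be sketched with the standard library alone. `InvalidConfigSnafu` is the project's error constructor; here a plain `Result<(), String>` stands in for it:

```rust
use std::path::Path;

/// Sketch of the suggested validation: reject relative cache paths
/// before attempting to create the directory.
fn validate_cache_path(path: &str) -> Result<(), String> {
    if Path::new(path).is_absolute() {
        Ok(())
    } else {
        Err(format!("Write cache path must be absolute: {path}"))
    }
}
```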
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (14)
- config/config.md (4 hunks)
- config/datanode.example.toml (2 hunks)
- config/standalone.example.toml (2 hunks)
- src/datanode/src/config.rs (4 hunks)
- src/datanode/src/datanode.rs (1 hunk)
- src/datanode/src/lib.rs (1 hunk)
- src/datanode/src/store.rs (4 hunks)
- src/mito2/src/config.rs (4 hunks)
- src/mito2/src/error.rs (2 hunks)
- src/mito2/src/worker.rs (3 hunks)
- src/object-store/src/lib.rs (1 hunk)
- tests-integration/src/test_util.rs (1 hunk)
- tests-integration/tests/http.rs (1 hunk)
- tests-integration/tests/main.rs (1 hunk)
✅ Files skipped from review due to trivial changes (2)
- src/datanode/src/lib.rs
- src/object-store/src/lib.rs
🔇 Additional comments (14)
src/datanode/src/store.rs (1)
Line 70: Use of `is_object_storage()` enhances code clarity.

The introduction of the `is_object_storage()` method improves readability by encapsulating the logic for determining whether the store is an object storage.
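A minimal sketch of the pattern being praised here. The variant names are illustrative rather than the exact `ObjectStoreConfig` definition; the point is that the "is this a remote object store?" decision lives in one place instead of being repeated at every call site:

```rust
/// Illustrative stand-in for the datanode's store configuration enum.
#[derive(Debug)]
enum ObjectStoreConfig {
    File,
    S3,
    Oss,
    Azblob,
    Gcs,
}

impl ObjectStoreConfig {
    /// True for remote object stores; only the local file backend is not one.
    fn is_object_storage(&self) -> bool {
        !matches!(self, ObjectStoreConfig::File)
    }
}
```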
tests-integration/tests/http.rs (1)
Line range hint 32-38: Expand test coverage for OTLP endpoints.

The addition of tests for OTLP metrics, traces, and logs significantly improves test coverage. Ensure these tests handle various scenarios, including error conditions and edge cases, to validate the robustness of the new endpoints.
tests-integration/tests/main.rs (1)
Lines 25-27: Include HTTP tests for multiple storage backends.

Adding `http_tests!(File, S3, S3WithCache, Oss, Azblob, Gcs);` ensures HTTP functionality is tested across different storage backends, enhancing reliability.
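A hypothetical sketch of how a macro like `http_tests!` can stamp out one item per listed backend; the real macro in tests-integration generates full test functions and is considerably more involved. Here each backend name is expanded into a trivial function so the expansion mechanics are visible:

```rust
// Each identifier passed to the macro becomes one generated item.
macro_rules! http_tests {
    ($($backend:ident),* $(,)?) => {
        $(
            #[allow(non_snake_case)]
            fn $backend() -> &'static str {
                stringify!($backend)
            }
        )*
    };
}

// One function per backend: File(), S3(), S3WithCache().
http_tests!(File, S3, S3WithCache);
```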
src/datanode/src/config.rs (2)
Lines 63-69: Introduce an `is_object_storage()` method in `ObjectStoreConfig`.

The new method simplifies checks for object storage types, promoting cleaner and more maintainable code.

Lines 102-107: Add `is_object_storage()` to `StorageConfig` for consistency.

Implementing `is_object_storage()` in `StorageConfig` provides a consistent interface for determining storage types at a higher configuration level.
src/mito2/src/config.rs (2)
Line 154: Increase the default write cache size to 5 GiB.

Increasing the default `experimental_write_cache_size` provides better performance for workloads with high write throughput.

Lines 245-246: Set the default write cache path using `OBJECT_CACHE_DIR`.

Using the `OBJECT_CACHE_DIR` constant ensures consistency across the application when determining cache directory paths.
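The idea can be sketched as follows. The constant name matches the review; the helper function and its exact signature are hypothetical, shown only to illustrate how a shared constant keeps every cache path rooted under the same directory:

```rust
use std::path::PathBuf;

/// Shared directory name, so read and write caches never drift apart.
const OBJECT_CACHE_DIR: &str = "object_cache";

/// Hypothetical helper: default write cache path under the data home.
fn default_write_cache_path(data_home: &str) -> PathBuf {
    PathBuf::from(data_home).join(OBJECT_CACHE_DIR).join("write")
}
```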
config/datanode.example.toml (1)
Lines 479-486: LGTM! Verify cache directory creation permissions.

The write cache configuration changes are well documented and consistent with the read cache changes. The default paths and cache sizes are logically structured.
Let's verify the cache directory creation behavior:
✅ Verification successful
Based on the search results, the write cache directory creation is properly handled in src/mito2/src/worker.rs:

```rust
tokio::fs::create_dir_all(Path::new(&config.experimental_write_cache_path))
    .await
    .context(CreateDirSnafu {
        dir: &config.experimental_write_cache_path,
    })?;
```
The code uses `create_dir_all()`, which recursively creates directories if they don't exist. This is the same approach used for other cache directories in the codebase, such as the read cache creation in src/datanode/src/store.rs.

Let's verify whether there are any permission-related configurations or handling:
Cache directory creation is properly handled with appropriate error handling
The write cache directory creation is implemented securely in src/mito2/src/worker.rs using `tokio::fs::create_dir_all()` with proper error handling through the `CreateDirSnafu` context. The implementation:
- Creates parent directories recursively if they don't exist
- Uses the same approach as other cache directories in the codebase
- Provides error context for troubleshooting
- Defaults to `{data_home}/object_cache/write`, which follows the system's data-directory permissions
- Is automatically enabled for object storage, with appropriate logging
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
```bash
#!/bin/bash
# Description: Check if the code handles cache directory creation and permissions

# Test: Search for directory creation code
rg -A 5 "create_dir|create_directory|fs::create"

# Test: Search for permission handling
rg -A 5 "chmod|permission|mode"
```
Length of output: 76259
Script:
```bash
#!/bin/bash
# Search for write cache permission configurations
rg -A 5 "write_cache.*permission|write_cache.*mode"

# Search for any permission-related code near write cache
rg -B 5 -A 5 "experimental_write_cache"
```
Length of output: 18471
config/standalone.example.toml (2)
Lines 335-342: LGTM! Configuration is consistent.

The read cache configuration mirrors the changes in datanode.example.toml, maintaining consistency across configuration files.

Lines 517-524: LGTM! Write cache configuration is consistent.

The write cache configuration mirrors the changes in datanode.example.toml, maintaining consistency across configuration files.
src/mito2/src/error.rs (1)
Lines 679-684: LGTM! Error handling implementation is thorough.

The new `CreateDir` error variant is well implemented, with:
- A clear error message format
- Complete error context (directory path and I/O error)
- Proper integration with the status-code system
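The shape of such a variant can be sketched with the standard library alone. The actual code uses snafu's derive macros rather than a manual `Display` impl, so treat this as an illustration of the error's contents (directory path plus underlying I/O error), not the real definition:

```rust
use std::fmt;
use std::io;

/// Stdlib-only stand-in for the new `CreateDir` error variant.
#[derive(Debug)]
enum Error {
    CreateDir { dir: String, source: io::Error },
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Message carries both the path and the underlying cause.
            Error::CreateDir { dir, source } => {
                write!(f, "Failed to create directory {dir}: {source}")
            }
        }
    }
}
```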
config/config.md (3)
Line 96: LGTM: clear documentation for the storage read cache configuration.

The documentation clearly explains that the read cache is configured by default for object storage systems, and it provides the default path. The recommendation for better performance is helpful for users.
Line 424: LGTM: consistent configuration across modes.

The datanode configuration mirrors the changes in standalone mode, maintaining consistency:
- Read cache configuration with the same default path
- Write cache enabled by default for object storage
- The same cache size of 5 GiB

This consistency helps prevent confusion and ensures uniform behavior across deployment modes.

Also applies to: lines 462-464.
Lines 134-136: Verify the impact of the increased write cache size.

The write cache configuration has been updated:
- Enabled by default for object storage
- Default path set to `{data_home}/object_cache/write`
- Cache size increased significantly, from 1 GiB to 5 GiB

While the changes improve performance, the increased cache size might impact memory usage.
✅ Verification successful
Let me generate additional scripts to check for any performance benchmarks or tests specifically related to write cache configuration:
The previous search showed a lot of memory-related code but we need to specifically focus on write cache performance implications. Let's check for any tests or documentation around write cache behavior and performance tuning.
Let me check the benchmark results and implementation details to better understand the write cache performance implications:
I want to verify if there are documented performance impacts and implementation safeguards around the increased cache size before making the final assessment.
Based on the gathered information, I can now provide a final response:
The increased write cache size of 5 GiB is safe and well managed.
The codebase shows several safeguards and optimizations around the write cache:
- The write cache is experimental and only enabled by default for object storage
- The implementation includes memory monitoring and metrics tracking
- Performance benchmarks (TSBS) show improved query latencies with larger cache sizes
- The cache has built-in TTL support and automatic cleanup mechanisms
- The system includes write buffer rejection thresholds to prevent memory pressure
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
```bash
#!/bin/bash
# Description: Check for any memory-related issues or discussions in the
# codebase that might be relevant to the cache size increase.

# Search for memory-related configurations and comments
rg -i "memory|cache size|buffer size" --type rust

# Look for performance test results or benchmarks
rg -i "benchmark|performance test|memory usage" --type md
```
Length of output: 76499
Script:
```bash
#!/bin/bash
# Search for write-cache-related tests and benchmarks
rg -i "write.?cache|write.?buffer" --type rust -C 3 tests/
rg -i "write.?cache|write.?buffer" --type md

# Look for any performance-related configuration docs
rg -i "performance|tuning" --type md config/
```
Length of output: 4243
Script:
```bash
#!/bin/bash
# Check benchmark results and implementation details
rg -A 10 "write buffer" docs/benchmarks/tsbs/v0.3.2.md

# Look for the write cache implementation and tests
rg -i "write.?cache" --type rust -A 5 src/mito2/
```
Length of output: 32591
Others LGTM.
Is it necessary to change the default path? I'm OK with the new naming, but I'm hesitant since it's a breaking change.
I was hesitant as well, but I believe that failing to standardize the paths for the read and write caches will lead to growing chaos.
@evenyag Please help take a look.
I hereby agree to the terms of the GreptimeDB CLA.
Refer to a related PR or issue link (optional)
Close #4813
What's changed and what's your intention?
- Use `{data_home}/object_cache/read` as the default read cache path, and create subdirectories for multiplex storages by their names.
- Use `{data_home}/object_cache/write` as the default write cache path; since the write cache key, `IndexKey`, includes `region_id`, there is no need for subdirectories for multiplex storages.

@evenyag Please take a look.