feat: introduce the interface of RemoteJobScheduler
#4124
Conversation
Force-pushed from c9d3214 to 90a4794
@sunng87 @shuiyisong The PR contains part of the datanode plugins, PTAL.
@evenyag @v0y4g3r When I tried to design the API for rebuilding the job context after a datanode restart, I realized that we might need to keep metadata about the latest compaction status. It's useful if we use local compaction for heavy compaction tasks. The metadata may be part of the region metadata, for example:
When starting the datanode, it can fetch the metadata (open regions) and decide whether to schedule the compaction task.
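The idea above could look something like the following minimal sketch. This is hypothetical (not the actual GreptimeDB API): the latest compaction status is kept with the region metadata, so a restarted datanode can decide whether a compaction job must be rescheduled. All names (`CompactionState`, `RegionCompactionMeta`, `needs_reschedule`) are illustrative.

```rust
/// Hypothetical sketch: latest compaction status stored per region.
#[derive(Debug, Clone, PartialEq)]
pub enum CompactionState {
    /// No compaction in flight for this region.
    Idle,
    /// A (possibly remote) compaction job was submitted but has not
    /// reported completion; `job_id` identifies it at the scheduler.
    InProgress { job_id: u64 },
}

#[derive(Debug, Clone)]
pub struct RegionCompactionMeta {
    pub region_id: u64,
    pub state: CompactionState,
}

impl RegionCompactionMeta {
    /// Consulted while opening regions on startup: only regions with an
    /// in-flight job need their compaction rescheduled or re-attached.
    pub fn needs_reschedule(&self) -> bool {
        matches!(self.state, CompactionState::InProgress { .. })
    }
}
```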
Codecov Report
Attention: Patch coverage is

Additional details and impacted files:

```
@@            Coverage Diff            @@
##             main    #4124     +/-   ##
=========================================
- Coverage   85.16%   84.83%   -0.33%
=========================================
  Files        1020     1021       +1
  Lines      179630   179777     +147
=========================================
- Hits       152976   152519     -457
- Misses      26654    27258     +604
```
```rust
/// RemoteJobScheduler is a trait that defines the API to schedule remote jobs.
#[async_trait::async_trait]
pub trait RemoteJobScheduler: Send + Sync + 'static {
```
Can we use a single trait for both remote and local jobs?
I think it's possible in theory, for example:

```rust
#[async_trait::async_trait]
pub trait Scheduler: Send + Sync + 'static {
    async fn schedule(&self, job: Job, notifier: Option<Arc<dyn Notifier>>) -> Result<Option<JobId>>;
}
```
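To make the single-trait idea concrete, here is a compilable sketch with the async and `Notifier` parts dropped for brevity: one `Scheduler` trait serves both paths, returning `Some(JobId)` when the job was handed to a remote worker and `None` when it ran locally. All names (`Job`, `LocalScheduler`, `RemoteScheduler`) are illustrative, not the actual API.

```rust
use std::cell::Cell;

type JobId = u64;

pub enum Job {
    Compaction { region_id: u64 },
}

pub trait Scheduler {
    /// `Some(id)` => handed off to a remote worker; `None` => ran inline.
    fn schedule(&self, job: Job) -> Option<JobId>;
}

pub struct LocalScheduler;

impl Scheduler for LocalScheduler {
    fn schedule(&self, _job: Job) -> Option<JobId> {
        // Execute in-process; there is no remote job id to track.
        None
    }
}

pub struct RemoteScheduler {
    pub next_id: Cell<JobId>,
}

impl Scheduler for RemoteScheduler {
    fn schedule(&self, _job: Job) -> Option<JobId> {
        // Pretend to enqueue the job remotely and hand back its id.
        let id = self.next_id.get();
        self.next_id.set(id + 1);
        Some(id)
    }
}
```

The caller can then treat both schedulers uniformly and only track a `JobId` when one is returned.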
src/mito2/src/request.rs (outdated):

```diff
@@ -693,7 +693,7 @@ pub(crate) struct CompactionFinished {
     /// Region id.
     pub(crate) region_id: RegionId,
     /// Compaction result senders.
-    pub(crate) senders: Vec<OutputTx>,
+    pub(crate) senders: Option<Vec<OutputTx>>,
```
I remember we use an empty vec for None in our codebase @evenyag
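The convention the reviewer refers to can be sketched as follows, with hypothetical names (`OutputTx` here is only a stand-in for the real sender type): an empty `Vec` carries the same "nothing to notify" meaning as `Option<Vec<_>>`, with one fewer case to handle at every call site.

```rust
pub struct OutputTx; // stand-in for the real result-sender type

/// Option style: `None` means there are no senders to notify.
pub fn pending_option(senders: &Option<Vec<OutputTx>>) -> usize {
    senders.as_ref().map_or(0, |s| s.len())
}

/// Empty-vec style: the "none" state is simply `vec![]`.
pub fn pending_empty_vec(senders: &[OutputTx]) -> usize {
    senders.len()
}
```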
Force-pushed from d08b56e to 5bcd7a1
I hereby agree to the terms of the GreptimeDB CLA.
Refer to a related PR or issue link (optional)
What's changed and what's your intention?
RemoteJobScheduler

In a storage-disaggregated system, we can offload CPU-intensive and IO-intensive tasks (for example, compaction and indexing) to remote workers. For that scenario, this PR introduces the abstraction: RemoteJobScheduler is a trait that defines the API for scheduling remote jobs. Its implementation lives in GreptimeDB Enterprise.

The PR modifies schedule_compaction_request() to support remote compaction: if remote_compaction is enabled in region_options and the RemoteJobScheduler is initialized, Mito will execute remote compaction.

Other changes

- Add the async keyword to all compaction-related functions, because schedule_compaction_request needs to be async;
- Use the Option type for senders in CompactionFinished, because it isn't needed in the remote compaction scenario;
- Add remote_compaction to the compaction options.

TODOs

- Initialize the RemoteJobScheduler from the plugin system;
- Design the API to fetch the jobs from the scheduler, so that when the datanode restarts, it can rebuild the context of the remote jobs;
- Add unit tests for the RemoteJobScheduler.
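The dispatch rule this PR describes (remote compaction only when both the region option enables it and a RemoteJobScheduler is actually available, otherwise fall back to local compaction) can be sketched as below. The names are illustrative, not the real schedule_compaction_request() signature.

```rust
pub struct RegionOptions {
    pub remote_compaction: bool,
}

pub struct RemoteJobScheduler; // opaque stand-in for the trait object

#[derive(Debug, PartialEq)]
pub enum CompactionPath {
    Remote,
    Local,
}

/// Hypothetical dispatch: both the region option and a registered
/// scheduler are required before choosing the remote path.
pub fn pick_compaction_path(
    opts: &RegionOptions,
    scheduler: Option<&RemoteJobScheduler>,
) -> CompactionPath {
    match (opts.remote_compaction, scheduler) {
        // Option enabled AND a scheduler was provided (e.g. by a plugin).
        (true, Some(_)) => CompactionPath::Remote,
        // Option off, or no scheduler registered: compact locally.
        _ => CompactionPath::Local,
    }
}
```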
Checklist