RFC-004: Design of TaskManager and related offline jobs #732
tobegit3hub started this conversation in RFCs
Problem
Currently, OpenMLDB doesn't have any service to manage its internal batch jobs.
For example, users have to submit Spark-based OpenMLDB batch jobs themselves and set up the Java/Hadoop/Spark environment in advance. To import data into OpenMLDB, they have to implement the Spark or MapReduce jobs and submit those themselves as well. These batch jobs are out of OpenMLDB's control, since it has no job manager to monitor them.
Moreover, setting up the OpenMLDB client is complex if it needs to submit distributed jobs to a Yarn or Kubernetes cluster.
For example, the Java package, a pre-built Spark distribution and the Hadoop configuration files are all required to submit an OpenMLDB batch job.
Solution
We will add a new internal module named TaskManager to implement management of batch jobs.
Here is the architecture of TaskManager in OpenMLDB.
TaskManager is an RPC service that provides APIs to submit, monitor and terminate batch jobs. Here is the workflow when clients request to submit batch jobs.
Notice that batch jobs may run for several minutes or several days. TaskManager may respond within seconds, as soon as it has the basic job status, while the job is still running. Users then have to request the job status themselves to check whether the job has finished.
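The submit-then-poll workflow described above can be sketched as follows. This is only an illustration under stated assumptions: `FakeTaskManager`, `submit_job`, `get_job_state`, `wait_for_job` and the state names are hypothetical and are not the TaskManager API. An in-memory stand-in replaces the real service so the example runs on its own.

```python
import itertools
from enum import Enum


class JobState(Enum):
    # Hypothetical state names, chosen for illustration only.
    SUBMITTED = "submitted"
    RUNNING = "running"
    FINISHED = "finished"


class FakeTaskManager:
    """In-memory stand-in for the TaskManager service (hypothetical API)."""

    def __init__(self, polls_until_done=3):
        self._ids = itertools.count(1)
        self._remaining = {}
        self._polls_until_done = polls_until_done

    def submit_job(self, job_type):
        # Respond right away with a job id; the job keeps running afterwards.
        job_id = next(self._ids)
        self._remaining[job_id] = self._polls_until_done
        return job_id, JobState.SUBMITTED

    def get_job_state(self, job_id):
        # Each call simulates some time passing on the cluster.
        if self._remaining[job_id] > 0:
            self._remaining[job_id] -= 1
            return JobState.RUNNING
        return JobState.FINISHED


def wait_for_job(manager, job_id, max_polls=100):
    """Poll until the job finishes; return the number of polls used."""
    for attempt in range(1, max_polls + 1):
        if manager.get_job_state(job_id) is JobState.FINISHED:
            return attempt
        # A real client would sleep between polls here.
    raise TimeoutError(f"job {job_id} still running after {max_polls} polls")


manager = FakeTaskManager(polls_until_done=3)
job_id, state = manager.submit_job("import_data")
polls = wait_for_job(manager, job_id)
```

The point of the sketch is that `submit_job` returns before the job completes, so the caller owns the polling loop and its timeout policy.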
After adding the TaskManager service, we will implement some internal batch jobs which are useful for OpenMLDB users.
More jobs, such as submitting Kubernetes or Flink jobs, will be added in the future, but they are not discussed in this RFC.
The basic job info will be defined below.
Extra job info can be returned when requesting detailed job info.
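One way to model the split between basic and detailed job info is a small record type that a detailed type extends. The sketch below is an assumption for illustration: every field name (`job_id`, `job_type`, `state`, `cluster_application_id`, `parameters`) is hypothetical and does not reproduce the field list defined in this RFC.

```python
from dataclasses import dataclass, field, asdict
from typing import Dict, Optional


@dataclass
class BasicJobInfo:
    # All fields are illustrative assumptions, not the RFC's schema.
    job_id: int
    job_type: str
    state: str


@dataclass
class DetailedJobInfo(BasicJobInfo):
    # Extra fields returned only when detailed info is requested.
    cluster_application_id: Optional[str] = None
    parameters: Dict[str, str] = field(default_factory=dict)


def to_basic(detail: DetailedJobInfo) -> BasicJobInfo:
    """Drop the extra fields to get the summary view."""
    return BasicJobInfo(detail.job_id, detail.job_type, detail.state)


detail = DetailedJobInfo(1, "import_data", "running", parameters={"path": "/tmp/data"})
summary = to_basic(detail)
```

Keeping the summary small lets a job listing stay cheap, while the larger record is fetched only for a single job.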
Changes and Additions to Public Interfaces
TaskManager will add some public APIs. The APIs are defined as bRPC Protobuf functions, and users can also call them through HTTP endpoints.
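As a rough sketch of what calling such an API over HTTP could look like, the snippet below builds (but does not send) a JSON POST request. It assumes the common bRPC convention of exposing a Protobuf method at `/ServiceName/MethodName` with a JSON body. The address, the service name `ExampleTaskManagerService`, the method name `GetJobStatus` and the `job_id` field are all hypothetical placeholders, not names defined by this RFC.

```python
import json
import urllib.request


def build_rpc_request(host, service, method, payload):
    """Build (but do not send) an HTTP POST for a bRPC-style endpoint."""
    url = f"http://{host}/{service}/{method}"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_rpc_request(
    "127.0.0.1:9902",             # hypothetical TaskManager address
    "ExampleTaskManagerService",  # hypothetical service name
    "GetJobStatus",               # hypothetical method name
    {"job_id": 1},                # hypothetical request field
)
```

Passing the request to `urllib.request.urlopen` would send it. The call is left out here because no real endpoint is assumed to exist.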
TaskManager will add some commands for CLI.
Performance Impact
TaskManager is a new component and will not affect the performance of existing components.
Backwards Compatibility and Upgrade Path
TaskManager is a new component, so it has no backwards compatibility or upgrade issues.
Related Work