Design or prototype distributed execution #25
Comments
This seems very interesting... Besides running a single query on multiple pods, this would also let us compute parts of the query closer to the data (on the storage nodes) and reduce network traffic quite a bit, right? We would no longer need to transfer raw data.
I have been thinking a lot about this topic, and I think the main problem, at least in the Thanos space, is that we do read-time deduplication instead of write-time deduplication. The problem with read-time deduplication is that we don't know whether there are gaps in the data stored on a node (or in any block), so we need to download everything (all matching blocks) to deduplicate effectively. If we had identical copies of deduplicated data on multiple replicas, we could execute a given query entirely on one replica, as long as all of the data needed for that query resides there.
Indeed. Another possibility would be to support sharding natively, so we could split aggregations over multiple pods?
That would be the way to go, I think. But it should really be up to the user to decide where the engine will run. In theory, if you can guarantee unique data in a Store, you could also embed an engine there. See the sketch below for how a sharded aggregation could be split up.
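To make the sharding idea concrete: because `sum` is associative, an aggregation can be decomposed into per-shard partial queries whose results the coordinator aggregates once more. A rough illustration (the `shard` label and the query itself are made up for the example, not part of any existing API):

```go
package main

import "fmt"

func main() {
	// Original query: sum(rate(http_requests_total[5m]))
	// Each pod or storage node evaluates its own shard locally:
	for _, shard := range []string{"0", "1", "2"} {
		fmt.Printf("sum(rate(http_requests_total{shard=%q}[5m]))\n", shard)
	}
	// The coordinator then only merges small partial results:
	//   sum(partial_0, partial_1, partial_2)
	// so raw samples never have to leave the storage nodes.
}
```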
We would likely need a logical plan first, but we can already start thinking about distributed query execution and how we can implement it in the engine.
I would avoid adding networking specifics to the engine itself and defer that responsibility to the library user. This way, a project that uses the engine can decide to inject an implementation based on gRPC, plain HTTP, or some other protocol for communication between engine instances.
The engine itself would define an interface for inter-engine communication, and would make sure the query is properly decomposed and merged back after all parts of the plan complete successfully.
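A minimal sketch of what such an injectable interface could look like. All names here are hypothetical, chosen only to illustrate the separation of concerns, and do not reflect the engine's actual API:

```go
package distributed

import "context"

// RemoteEngine is a hypothetical interface for inter-engine
// communication. The engine would decompose the plan into fragments,
// hand each fragment to a RemoteEngine, and merge the partial results
// once all fragments complete successfully.
type RemoteEngine interface {
	// Exec evaluates one fragment of the query remotely. The fragment
	// is expressed as a PromQL string here for simplicity; a real
	// design would more likely ship a serialized logical plan.
	Exec(ctx context.Context, fragment string, start, end int64) (PartialResult, error)
}

// PartialResult is a placeholder for whatever a remote engine
// returns (e.g. a set of series with samples).
type PartialResult interface{}

// A library user could then supply a transport-specific implementation,
// for example one backed by gRPC:
type grpcRemoteEngine struct {
	// a wrapped gRPC connection to a remote engine instance would live here
}

func (g *grpcRemoteEngine) Exec(ctx context.Context, fragment string, start, end int64) (PartialResult, error) {
	// Issue the RPC here; the engine core never sees transport details.
	return nil, nil
}
```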