forked from apache/datafusion
Support physical plan reusage #2
Merged
This patch adds support for placeholders at the physical expression level. This makes it possible to build generic plans and resolve the placeholders later, during the `execute(...)` phase. Each `ExecutionPlan` is responsible for resolving placeholders in its own expressions.

Example:

```sh
> create table a(x int);
> explain select x + $1 from a where x > $2;
+---------------+-------------------------------------------------------------------------+
| plan_type     | plan                                                                    |
+---------------+-------------------------------------------------------------------------+
| logical_plan  | Projection: a.x + $1                                                    |
|               |   Filter: a.x > $2                                                      |
|               |     TableScan: a projection=[x]                                         |
| physical_plan | ProjectionExec: expr=[x@0 + $1 as a.x + $1]                             |
|               |   RepartitionExec: partitioning=RoundRobinBatch(16), input_partitions=1 |
|               |     CoalesceBatchesExec: target_batch_size=8192                         |
|               |       FilterExec: x@0 > $2                                              |
|               |         MemoryExec: partitions=1, partition_sizes=[0]                   |
|               |                                                                         |
+---------------+-------------------------------------------------------------------------+
```
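To illustrate the idea of resolving placeholders at execution time, here is a minimal, self-contained sketch. It does not use DataFusion's actual types: `Expr`, `Plan`, `Params`, `resolve`, `eval`, and the free function `execute` are hypothetical stand-ins, and values are plain `i64` instead of Arrow arrays. The point is only that one immutable plan can be executed many times with different parameter values.

```rust
use std::collections::HashMap;

/// Toy physical expression over i64 values (hypothetical, not DataFusion's `PhysicalExpr`).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(usize),
    Literal(i64),
    Placeholder(String),
    Add(Box<Expr>, Box<Expr>),
    Gt(Box<Expr>, Box<Expr>),
}

/// Parameter values carried by the (hypothetical) execution context.
type Params = HashMap<String, i64>;

/// A shared, immutable plan: keep rows matching `predicate`, output `projection`.
struct Plan {
    projection: Expr,
    predicate: Expr,
}

/// Return a copy of `expr` with every placeholder replaced by its bound literal.
fn resolve(expr: &Expr, params: &Params) -> Result<Expr, String> {
    Ok(match expr {
        Expr::Placeholder(id) => Expr::Literal(
            *params
                .get(id)
                .ok_or_else(|| format!("no value for placeholder {id}"))?,
        ),
        Expr::Add(l, r) => Expr::Add(Box::new(resolve(l, params)?), Box::new(resolve(r, params)?)),
        Expr::Gt(l, r) => Expr::Gt(Box::new(resolve(l, params)?), Box::new(resolve(r, params)?)),
        other => other.clone(),
    })
}

/// Evaluate a resolved expression against one row; comparisons yield 1 or 0.
fn eval(expr: &Expr, row: &[i64]) -> Result<i64, String> {
    match expr {
        Expr::Column(i) => row
            .get(*i)
            .copied()
            .ok_or_else(|| format!("column {i} out of range")),
        Expr::Literal(v) => Ok(*v),
        Expr::Placeholder(id) => Err(format!("unresolved placeholder {id}")),
        Expr::Add(l, r) => Ok(eval(l, row)? + eval(r, row)?),
        Expr::Gt(l, r) => Ok((eval(l, row)? > eval(r, row)?) as i64),
    }
}

/// Resolve placeholders for this execution only; the plan itself is never mutated.
fn execute(plan: &Plan, params: &Params, rows: &[Vec<i64>]) -> Result<Vec<i64>, String> {
    let projection = resolve(&plan.projection, params)?;
    let predicate = resolve(&plan.predicate, params)?;
    let mut out = Vec::new();
    for row in rows {
        if eval(&predicate, row)? != 0 {
            out.push(eval(&projection, row)?);
        }
    }
    Ok(out)
}
```

With a plan modelling `select x + $1 from a where x > $2`, calling `execute` twice with different `Params` maps reuses the same `Plan` value, and a missing parameter surfaces as an error at execution time rather than at planning time.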
To share physical plans across executions, metrics have to live somewhere other than the plan itself. This patch moves them to the task context. When a plan is scanned, its metrics set is registered in the context, keyed by the plan's address. To display a plan with metrics, one must provide the task context in which the metrics associated with that plan are stored. Also, fmt errors are fixed and several clippy suggestions are applied.
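The registration scheme described above can be sketched roughly as follows. This is an illustration under assumptions, not the code from this patch: `PlanNode`, `Metrics`, `TaskCtx`, `plan_key`, `record_rows`, `metrics_for`, and `display_with_metrics` are hypothetical names, and the key is simply the address of the node behind an `Arc`.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Stand-in for an immutable plan node shared across executions (hypothetical).
struct PlanNode {
    name: String,
}

/// Metrics collected for one plan node during one execution.
#[derive(Default, Debug, Clone, PartialEq)]
struct Metrics {
    output_rows: usize,
}

/// Per-execution context that owns the metrics (stand-in for the task context).
#[derive(Default)]
struct TaskCtx {
    metrics: Mutex<HashMap<usize, Metrics>>,
}

/// Key a plan node by its address. This is only meaningful while the `Arc` is alive.
fn plan_key(plan: &Arc<PlanNode>) -> usize {
    Arc::as_ptr(plan) as usize
}

impl TaskCtx {
    /// Register the metrics set on first use, then accumulate into it.
    fn record_rows(&self, plan: &Arc<PlanNode>, rows: usize) {
        let mut map = self.metrics.lock().unwrap();
        map.entry(plan_key(plan)).or_default().output_rows += rows;
    }

    /// Look up the metrics recorded for `plan` in this context, if any.
    fn metrics_for(&self, plan: &Arc<PlanNode>) -> Option<Metrics> {
        self.metrics.lock().unwrap().get(&plan_key(plan)).cloned()
    }
}

/// Displaying a plan with metrics requires the context that holds them.
fn display_with_metrics(plan: &Arc<PlanNode>, ctx: &TaskCtx) -> String {
    match ctx.metrics_for(plan) {
        Some(m) => format!("{}, metrics=[output_rows={}]", plan.name, m.output_rows),
        None => plan.name.clone(),
    }
}
```

Because the metrics live in the context, two executions of the same `Arc<PlanNode>` with two different `TaskCtx` values keep independent counters. One design caveat of keying by address is that the key is only valid for as long as the plan is kept alive; a freed plan's address may be reused by another allocation.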
We want the workflow to run on external PRs, but not on our own internal PRs, since those are already covered by the run triggered by the push to the branch. The main trick is described in Dart-Code/Dart-Code#2375. We also want it to always run for manually triggered workflows.
We do not support wasm datafusion for now, so let's disable this job.
ole-baranov
approved these changes
Feb 10, 2025
WeCodingNow
approved these changes
Feb 10, 2025
LGTM
To reuse physical plans, two things were added:

1. Support for physical placeholders. This is done by adding a new physical expression. Parameter values are added to the execution context. During the `execute(...)` phase, each `ExecutionPlan` is responsible for resolving placeholders in its own expressions.
2. Metrics are moved from `ExecutionPlan` to the execution context. To share physical plans across executions, metrics have to live somewhere other than the plan. When a plan is executed, its metrics set is registered in the context, keyed by the plan's address. To display a plan with metrics, one must provide the task context in which the metrics associated with that plan are stored.

Also, fmt errors are fixed and several clippy suggestions are applied.