[Proposal] Volcano query planner #309

Closed
184 changes: 184 additions & 0 deletions planner/README.md
@@ -0,0 +1,184 @@
Proposal: separate AST from execution, refactor the query planner
===

## Background

Quoted from https://github.com/thanos-io/promql-engine/issues/5:

```
We currently translate the AST directly to a physical plan. Having an in-between logical plan will allow us to run optimizers before the query is executed.

The logical plan would have a one to one mapping with the AST and will contain the parameters of each AST node.
Query optimizers can then transform the logical plan based on predefined heuristics. One example would be optimizing series selects in binary operations so that we do as few network calls as possible.

Finally, we would build the physical plan from the optimized logical plan instead doing it from the AST directly.
```

Here's our current query lifecycle:

```mermaid
flowchart TD
Query["query string"]
AST["parser.Expr"]
Plan1["logicalplan.Plan"]
Plan2["logicalplan.Plan"]
Operator["model.VectorOperator"]

Query -->|parsing| AST
AST -->|logicalplan.New| Plan1
Plan1 -->|optimize| Plan2
Plan2 -->|execution.New| Operator
```

The `logicalplan.Plan` is just a thin wrapper around `parser.Expr`, so the conversion from `logicalplan.Plan` to `model.VectorOperator` is effectively a direct conversion from `parser.Expr` to `model.VectorOperator`.

Another point is that our optimizers are heuristic: they cannot optimize some complex queries, and they cannot use data statistics to drive the optimization.

## Proposal

We will implement a two-stage planner modeled after the Volcano planner.

```mermaid
flowchart TD
Query["query string"]
AST["parser.Expr"]
LogicalPlan1["LogicalPlan"]
Operator["model.VectorOperator"]

subgraph plan["Planner"]
subgraph explore["Exploration Phase"]
LogicalPlan2["LogicalPlan"]
end
subgraph implement["Implementation Phase"]
PhysicalPlan
end
end

Query -->|parsing| AST
AST -->|ast2plan| LogicalPlan1
LogicalPlan1 --> LogicalPlan2
LogicalPlan2 --> PhysicalPlan
PhysicalPlan --> Operator
```
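
At a high level, the planner ties the two phases together. A rough sketch of the entry point is shown below; `Planner`, `memo`, `explore` and `implement` are illustrative names, not a final API:

```go
// Rough sketch only: Planner, memo, explore and implement are illustrative names.
func (p *Planner) Plan(expr parser.Expr) (model.VectorOperator, error) {
	root := logicalplan.NewLogicalPlan(&expr) // AST -> logical plan (ast2plan)
	group := p.memo.AddPlan(root)             // load the plan into the memo as equivalence groups
	p.explore(group)                          // exploration phase: fire transformation rules
	best := p.implement(group)                // implementation phase: pick the cheapest physical plan
	return best.Implementation.Operator(), nil
}
```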

### Exploration phase

The exploration phase explores all possible transformations of the original logical plan.

```mermaid
flowchart TD
LogicalPlan1["LogicalPlan"]
LogicalPlan2["LogicalPlan"]

LogicalPlan1 -->|fire transformation rules| LogicalPlan2
```

**Definitions**:
- `Group`: the `Equivalent Group` (or `Equivalent Set`), a group of multiple equivalent logical plans
- `GroupExpr`: represents a logical plan node (essentially a wrapper around the logical plan with some additional information)

```go
type Group struct {
	Equivalents map[ID]*GroupExpr // The equivalent logical expressions in this group.
	ExplorationMark
}

type GroupExpr struct {
	Expr                   logicalplan.LogicalPlan       // The logical plan bound to this expression.
	Children               []*Group                      // The child groups; note they must be in the same order as LogicalPlan.Children().
	AppliedTransformations utils.Set[TransformationRule] // Rules already applied to this expression.
	ExplorationMark
}
```
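
To make the definitions concrete, here is one possible (hypothetical, simplified) way the logical plan tree could be loaded into the memo, creating one group per node; deduplication of identical subtrees is omitted, and `Memo`, `AddPlan`, `newID` and `utils.NewSet` are assumed helpers:

```go
// Hypothetical sketch, not part of this change.
type Memo struct {
	groups []*Group
}

func (m *Memo) AddPlan(plan logicalplan.LogicalPlan) *Group {
	// Recursively register the children first so their groups exist,
	// preserving the order of LogicalPlan.Children().
	var children []*Group
	for _, child := range plan.Children() {
		children = append(children, m.AddPlan(child))
	}
	expr := &GroupExpr{
		Expr:                   plan,
		Children:               children,
		AppliedTransformations: utils.NewSet[TransformationRule](),
	}
	group := &Group{Equivalents: map[ID]*GroupExpr{newID(): expr}}
	m.groups = append(m.groups, group)
	return group
}
```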

Here is the interface of a transformation rule:

```go
type TransformationRule interface {
	Match(expr *GroupExpr) bool           // Check whether the transformation can be applied to the expression.
	Transform(expr *GroupExpr) *GroupExpr // Transform the expression.
}
```
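
As an illustration (not a rule from this change), a trivial transformation could replace a `ParenExpr` with its inner expression, promoting the child group's expression into the parent group:

```go
// Illustrative only: removes a redundant ParenExpr by promoting the child's
// expression into the parent group. Not part of this change.
type RemoveParensRule struct{}

func (RemoveParensRule) Match(expr *GroupExpr) bool {
	_, ok := expr.Expr.(*logicalplan.ParenExpr)
	return ok
}

func (RemoveParensRule) Transform(expr *GroupExpr) *GroupExpr {
	// A ParenExpr has exactly one child group; any expression already in that
	// group is equivalent to the parenthesized expression.
	for _, equivalent := range expr.Children[0].Equivalents {
		return equivalent
	}
	return expr
}
```

During exploration, each rule is fired at most once per expression, and firing a rule resets the exploration marks so the new expression is picked up in the next pass: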

```go
for _, rule := range rules {
	if rule.Match(equivalentExpr) {
		if !equivalentExpr.AppliedTransformations.Contains(rule) {
			transformedExpr := rule.Transform(equivalentExpr)
			// Add the new equivalent expression to the group.
			group.Equivalents[transformedExpr.ID] = transformedExpr
			equivalentExpr.AppliedTransformations.Add(rule)
			// Reset the exploration state so the new expression gets explored.
			transformedExpr.SetExplore(round, false)
			group.SetExplore(round, false)
		}
	}
}
```
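
The snippet above handles a single expression; an outer driver (sketched below, with `IsExplored` and `fireRules` as assumed helpers on top of `ExplorationMark`) would walk the memo until no group produces new equivalents in a round:

```go
// Hedged sketch of the outer exploration driver; helper names are assumptions.
func (o *Optimizer) explore(group *Group, round int) {
	for !group.IsExplored(round) {
		group.SetExplore(round, true)
		for _, equivalentExpr := range group.Equivalents {
			if equivalentExpr.IsExplored(round) {
				continue
			}
			// Explore children first so their equivalents are available to rules.
			for _, child := range equivalentExpr.Children {
				o.explore(child, round)
			}
			equivalentExpr.SetExplore(round, true)
			o.fireRules(group, equivalentExpr, round) // the loop shown above
		}
	}
}
```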

### Implementation phase

After the exploration phase, we have the expanded logical plan (the original plan plus all of its transformed equivalents).

Then we find the implementation with the lowest cost.


```mermaid
flowchart TD
LogicalPlan2["LogicalPlan"]
PhysicalPlan
Operator["model.VectorOperator"]

LogicalPlan2 -->|find best implementation| PhysicalPlan
PhysicalPlan -->|get the actual implementation| Operator
```

The physical plan represents the actual implementation of a logical plan (the `Children` property of `PhysicalPlan` is used for cost calculation).

```go
type PhysicalPlan interface {
	SetChildren(children []PhysicalPlan) // Set the child implementations and update the operator and cost.
	Children() []PhysicalPlan            // Return the child implementations saved by the last SetChildren call.
	Operator() model.VectorOperator      // Return the physical operator built by the last SetChildren call.
	Cost() cost.Cost                     // Return the cost computed by the last SetChildren call.
}
```
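
For example, a hypothetical physical plan for a vector selector could look like the following; the `newSelectorOperator` and `estimatedSeries` helpers and the cost numbers are assumptions made purely for illustration:

```go
// Hypothetical implementation; newSelectorOperator and estimatedSeries are assumed helpers.
type VectorSelectorPlan struct {
	selector *logicalplan.VectorSelector
	children []PhysicalPlan
	operator model.VectorOperator
	cost     cost.Cost
}

func (p *VectorSelectorPlan) SetChildren(children []PhysicalPlan) {
	p.children = children // a selector has no children; kept for interface symmetry
	p.operator = newSelectorOperator(p.selector)
	p.cost = cost.Cost{CpuCost: 1, MemoryCost: estimatedSeries(p.selector)}
}

func (p *VectorSelectorPlan) Children() []PhysicalPlan       { return p.children }
func (p *VectorSelectorPlan) Operator() model.VectorOperator { return p.operator }
func (p *VectorSelectorPlan) Cost() cost.Cost                { return p.cost }
```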

For each logical plan node, there can be several implementations:

```go
type ImplementationRule interface {
	ListImplementations(expr *GroupExpr) []physicalplan.PhysicalPlan // List all implementations for the expression.
}
```
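
Continuing the hypothetical example above, an implementation rule for vector selectors would simply return that single implementation (package qualifiers are elided for brevity):

```go
// Hypothetical rule, for illustration only.
type VectorSelectorImplRule struct{}

func (VectorSelectorImplRule) ListImplementations(expr *GroupExpr) []physicalplan.PhysicalPlan {
	selector, ok := expr.Expr.(*logicalplan.VectorSelector)
	if !ok {
		return nil
	}
	return []physicalplan.PhysicalPlan{&VectorSelectorPlan{selector: selector}}
}
```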

We then find the best implementation via simple dynamic programming:

```go
var possibleImpls []physicalplan.PhysicalPlan
for _, rule := range rules {
	possibleImpls = append(possibleImpls, rule.ListImplementations(expr)...)
}
```

```go
var currentBest *memo.GroupImplementation
for _, impl := range possibleImpls {
	impl.SetChildren(childImpls)
	calculatedCost := impl.Cost()
	if currentBest != nil {
		if costModel.IsBetter(currentBest.Cost, calculatedCost) {
			currentBest.SelectedExpr = expr
			currentBest.Implementation = impl
			currentBest.Cost = calculatedCost
		}
	} else {
		currentBest = &memo.GroupImplementation{
			SelectedExpr:   expr,
			Cost:           calculatedCost,
			Implementation: impl,
		}
	}
}
```
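
The snippet above assumes the best implementations of all child groups (`childImpls`) are already known. A hedged sketch of the recursion that provides them, memoizing one best implementation per group (the `implemented` map is an assumption):

```go
// Hedged sketch; the implemented map and surrounding wiring are assumptions.
func (o *Optimizer) implement(group *Group) *memo.GroupImplementation {
	if best, ok := o.implemented[group]; ok {
		return best // dynamic-programming memo hit
	}
	var currentBest *memo.GroupImplementation
	for _, expr := range group.Equivalents {
		childImpls := make([]physicalplan.PhysicalPlan, 0, len(expr.Children))
		for _, child := range expr.Children {
			childImpls = append(childImpls, o.implement(child).Implementation)
		}
		// ... the selection loop shown above runs here, updating currentBest ...
	}
	o.implemented[group] = currentBest
	return currentBest
}
```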
10 changes: 10 additions & 0 deletions planner/cost/cost.go
@@ -0,0 +1,10 @@
package cost

type Cost struct {
	CpuCost    float64
	MemoryCost float64
}

type CostModel interface {
	// IsBetter reports whether newCost is preferable to currentCost.
	IsBetter(currentCost Cost, newCost Cost) bool
}
74 changes: 74 additions & 0 deletions planner/logicalplan/ast2plan.go
@@ -0,0 +1,74 @@
package logicalplan

import "github.com/thanos-io/promql-engine/parser"

func NewLogicalPlan(expr *parser.Expr) LogicalPlan {
	switch node := (*expr).(type) {
	case *parser.StepInvariantExpr:
		return &StepInvariantExpr{Expr: NewLogicalPlan(&node.Expr)}
	case *parser.VectorSelector:
		return &VectorSelector{
			Name:           node.Name,
			OriginalOffset: node.OriginalOffset,
			Offset:         node.Offset,
			Timestamp:      node.Timestamp,
			StartOrEnd:     node.StartOrEnd,
			LabelMatchers:  node.LabelMatchers,
		}
	case *parser.MatrixSelector:
		return &MatrixSelector{
			VectorSelector: NewLogicalPlan(&node.VectorSelector),
			Range:          node.Range,
		}
	case *parser.AggregateExpr:
		return &AggregateExpr{
			Op:       node.Op,
			Expr:     NewLogicalPlan(&node.Expr),
			Param:    NewLogicalPlan(&node.Param),
			Grouping: node.Grouping,
			Without:  node.Without,
		}
	case *parser.Call:
		var args []LogicalPlan
		for i := range node.Args {
			args = append(args, NewLogicalPlan(&node.Args[i]))
		}
		return &Call{
			Func: node.Func,
			Args: args,
		}
	case *parser.BinaryExpr:
		return &BinaryExpr{
			Op:             node.Op,
			LHS:            NewLogicalPlan(&node.LHS),
			RHS:            NewLogicalPlan(&node.RHS),
			VectorMatching: node.VectorMatching,
			ReturnBool:     node.ReturnBool,
		}
	case *parser.UnaryExpr:
		return &UnaryExpr{
			Op:   node.Op,
			Expr: NewLogicalPlan(&node.Expr),
		}
	case *parser.ParenExpr:
		return &ParenExpr{
			Expr: NewLogicalPlan(&node.Expr),
		}
	case *parser.SubqueryExpr:
		return &SubqueryExpr{
			Expr:           NewLogicalPlan(&node.Expr),
			Range:          node.Range,
			OriginalOffset: node.OriginalOffset,
			Offset:         node.Offset,
			Timestamp:      node.Timestamp,
			StartOrEnd:     node.StartOrEnd,
			Step:           node.Step,
		}
	// literal types
	case *parser.NumberLiteral:
		return &NumberLiteral{Val: node.Val}
	case *parser.StringLiteral:
		return &StringLiteral{Val: node.Val}
	}
	return nil // should never reach here
}
81 changes: 81 additions & 0 deletions planner/logicalplan/ast2plan_test.go
@@ -0,0 +1,81 @@
package logicalplan

import (
"github.com/stretchr/testify/require"
"github.com/thanos-io/promql-engine/parser"
"math"
"testing"
)

var ast2planTestCases = []struct {
	input    parser.Expr // The AST input.
	expected LogicalPlan // The expected logical plan.
}{
	{
		input:    &parser.NumberLiteral{Val: 1},
		expected: &NumberLiteral{Val: 1},
	},
	{
		input:    &parser.NumberLiteral{Val: math.Inf(1)},
		expected: &NumberLiteral{Val: math.Inf(1)},
	},
	{
		input:    &parser.NumberLiteral{Val: math.Inf(-1)},
		expected: &NumberLiteral{Val: math.Inf(-1)},
	},
	{
		input: &parser.BinaryExpr{
			Op:  parser.ADD,
			LHS: &parser.NumberLiteral{Val: 1},
			RHS: &parser.NumberLiteral{Val: 1},
		},
		expected: &BinaryExpr{
			Op:  parser.ADD,
			LHS: &NumberLiteral{Val: 1},
			RHS: &NumberLiteral{Val: 1},
		},
	},
	{
		input: &parser.BinaryExpr{
			Op:  parser.ADD,
			LHS: &parser.NumberLiteral{Val: 1},
			RHS: &parser.BinaryExpr{
				Op:  parser.DIV,
				LHS: &parser.NumberLiteral{Val: 2},
				RHS: &parser.ParenExpr{
					Expr: &parser.BinaryExpr{
						Op:  parser.MUL,
						LHS: &parser.NumberLiteral{Val: 3},
						RHS: &parser.NumberLiteral{Val: 1},
					},
				},
			},
		},
		expected: &BinaryExpr{
			Op:  parser.ADD,
			LHS: &NumberLiteral{Val: 1},
			RHS: &BinaryExpr{
				Op:  parser.DIV,
				LHS: &NumberLiteral{Val: 2},
				RHS: &ParenExpr{
					Expr: &BinaryExpr{
						Op:  parser.MUL,
						LHS: &NumberLiteral{Val: 3},
						RHS: &NumberLiteral{Val: 1},
					},
				},
			},
		},
	},
	// TODO add tests
}

func TestAST2Plan(t *testing.T) {
	for _, test := range ast2planTestCases {
		t.Run(test.input.String(), func(t *testing.T) {
			plan := NewLogicalPlan(&test.input)
			require.True(t, plan != nil, "could not convert AST to logical plan")
			require.Equal(t, test.expected, plan, "error on input '%s'", test.input.String())
		})
	}
}