Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement data source interfaces & registration in rego engine #4997

Merged
merged 4 commits into from
Nov 19, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
78 changes: 78 additions & 0 deletions internal/engine/eval/rego/datasources.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,78 @@
// SPDX-FileCopyrightText: Copyright 2023 The Minder Authors
// SPDX-License-Identifier: Apache-2.0

package rego

import (
"fmt"
"strings"

"github.com/open-policy-agent/opa/ast"
"github.com/open-policy-agent/opa/rego"
"github.com/open-policy-agent/opa/types"

v1datasources "github.com/mindersec/minder/pkg/datasources/v1"
)

// RegisterDataSources implements the Eval interface.
func (e *Evaluator) RegisterDataSources(dsr *v1datasources.DataSourceRegistry) {
for key, dsf := range dsr.GetFuncs() {
fmt.Printf("Registering data source %s\n", key)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean to leave in a Printf?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not

e.regoOpts = append(e.regoOpts, buildFromDataSource(key, dsf))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: does it make sense for dsf from dsr.GetFuncs() to actually have a func(*rego.Rego) method (i.e. to move the buildFromDataSource into a function on DataSourceFuncDef)?

It pushes the Rego types into a slighly larger area of the code, but it helps avoid having a big ball of code in eval/rego (and if we name the method Rego or RegoOpt, we would have a pattern for any other future evaluation engines).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to have data sources to be quite independent from the eval engine. The eval engine would then be in charge of interpreting the data source and making sense of it. It brings a lot more responsibility into the eval engine, but it allows for us to include more engines with ease in the future. e.g. if we'd introduce a WASM engine, we wouldn't need to mess around with rego structures.

}
}

// buildFromDataSource builds a rego function from a data source function.
// It takes a DataSourceFuncDef and returns a function that can be used to
// register the function with the rego engine.
func buildFromDataSource(key v1datasources.DataSourceFuncKey, dsf v1datasources.DataSourceFuncDef) func(*rego.Rego) {
k := normalizeKey(key)
return rego.Function1(
&rego.Function{
Name: k,
Decl: types.NewFunction(types.Args(types.A), types.A),
},
func(_ rego.BuiltinContext, obj *ast.Term) (*ast.Term, error) {
// Convert the AST value back to a Go interface{}
jsonObj, err := ast.JSON(obj.Value)
if err != nil {
return nil, err
}

if err := dsf.ValidateArgs(obj); err != nil {
return nil, err
}

// Call the data source function
ret, err := dsf.Call(jsonObj)
Comment on lines +42 to +47
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you move buildFromDataSource to DataSourceFuncDef, you might not need to expose ValidateArgs or Call in the interface.

if err != nil {
return nil, err
}

val, err := ast.InterfaceToValue(ret)
if err != nil {
return nil, err
}

return ast.NewTerm(val), nil
},
)
}

// This converts the data source function key into a format that can be used in the rego query.
// For example, if the key is "aws.ec2.instances", it will
// be converted to "minder.data.aws.ec2.instances".
// It also normalizes the key to lowercase (which should have already been done)
// and converts any "-" to "_", finally it removes any special characters.
func normalizeKey(key v1datasources.DataSourceFuncKey) string {
low := strings.ToLower(key.String())
underscore := strings.ReplaceAll(low, "-", "_")
// Remove any special characters
norm := strings.Map(func(r rune) rune {
if r >= 'a' && r <= 'z' || r >= '0' && r <= '9' || r == '_' || r == '.' {
return r
}
return -1
}, underscore)
return fmt.Sprintf("minder.datasource.%s", norm)
}
60 changes: 60 additions & 0 deletions internal/engine/eval/rego/rego_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -10,10 +10,14 @@ import (

"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
"go.uber.org/mock/gomock"

engerrors "github.com/mindersec/minder/internal/engine/errors"
"github.com/mindersec/minder/internal/engine/eval/rego"
"github.com/mindersec/minder/internal/engine/options"
minderv1 "github.com/mindersec/minder/pkg/api/protobuf/go/minder/v1"
v1datasources "github.com/mindersec/minder/pkg/datasources/v1"
v1mockds "github.com/mindersec/minder/pkg/datasources/v1/mock"
"github.com/mindersec/minder/pkg/engine/v1/interfaces"
)

Expand Down Expand Up @@ -454,3 +458,59 @@ violations[{"msg": msg}] {
&interfaces.Result{Object: map[string]any{}})
assert.Error(t, err, "should have failed to evaluate")
}

func TestCustomDatasourceRegister(t *testing.T) {
t.Parallel()

ctrl := gomock.NewController(t)

fds := v1mockds.NewMockDataSource(ctrl)
fdsf := v1mockds.NewMockDataSourceFuncDef(ctrl)

fds.EXPECT().GetFuncs().Return(map[v1datasources.DataSourceFuncKey]v1datasources.DataSourceFuncDef{
"source": fdsf,
}).AnyTimes()

fdsf.EXPECT().ValidateArgs(gomock.Any()).Return(nil).AnyTimes()

fdsr := v1datasources.NewDataSourceRegistry()

err := fdsr.RegisterDataSource("fake", fds)
require.NoError(t, err, "could not register data source")

e, err := rego.NewRegoEvaluator(
&minderv1.RuleType_Definition_Eval_Rego{
Type: rego.DenyByDefaultEvaluationType.String(),
Def: `
package minder

default allow = false

allow {
minder.datasource.fake.source({"datasourcetest": input.ingested.data}) == "foo"
}`,
},
options.WithDataSources(fdsr),
)
require.NoError(t, err, "could not create evaluator")

emptyPol := map[string]any{}

// Matches
fdsf.EXPECT().Call(gomock.Any()).Return("foo", nil)
err = e.Eval(context.Background(), emptyPol, nil, &interfaces.Result{
Object: map[string]any{
"data": "foo",
},
})
require.NoError(t, err, "could not evaluate")

// Doesn't match
fdsf.EXPECT().Call(gomock.Any()).Return("bar", nil)
err = e.Eval(context.Background(), emptyPol, nil, &interfaces.Result{
Object: map[string]any{
"data": "bar",
},
})
require.ErrorIs(t, err, engerrors.ErrEvaluationFailed, "should have failed the evaluation")
Comment on lines +510 to +515
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if it would make sense to use a real registry and DataSource here, to provide an example for future users (and to verify the lifecycle more deeply than Mocks).

(You'll also get a bit more test coverage of registry.go for "free")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have a good point, I'll add a testing implementation

}
21 changes: 21 additions & 0 deletions internal/engine/options/options.go
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@ package options
import (
"github.com/open-feature/go-sdk/openfeature"

v1datasources "github.com/mindersec/minder/pkg/datasources/v1"
"github.com/mindersec/minder/pkg/engine/v1/interfaces"
)

Expand All @@ -33,3 +34,23 @@ func WithFlagsClient(client openfeature.IClient) Option {
return inner.SetFlagsClient(client)
}
}

// SupportsDataSources interface advertises the fact that the implementer
// can register data sources with the evaluator.
type SupportsDataSources interface {
RegisterDataSources(ds *v1datasources.DataSourceRegistry)
}

// WithDataSources provides the evaluation engine with a list of data sources
// to register. In case the given evaluator does not support data sources,
// WithDataSources silently ignores the error.
func WithDataSources(ds *v1datasources.DataSourceRegistry) Option {
return func(e interfaces.Evaluator) error {
inner, ok := e.(SupportsDataSources)
if !ok {
return nil
}
inner.RegisterDataSources(ds)
return nil
}
}
44 changes: 44 additions & 0 deletions pkg/datasources/v1/datasources.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,44 @@
// SPDX-FileCopyrightText: Copyright 2024 The Minder Authors
// SPDX-License-Identifier: Apache-2.0

// Package v1 provides the interfaces and types for the data sources.
package v1

//go:generate go run go.uber.org/mock/mockgen -package mock_$GOPACKAGE -destination=./mock/$GOFILE -source=./$GOFILE

// DataSourceFuncKey is the key that uniquely identifies a data source function.
type DataSourceFuncKey string

// String returns the string representation of the data source function key.
func (k DataSourceFuncKey) String() string {
return string(k)
}

// DataSourceFuncDef is the definition of a data source function.
// It contains the key that uniquely identifies the function and the arguments
// that the function can take.
type DataSourceFuncDef interface {
// ValidateArgs validates the arguments of the function.
ValidateArgs(obj any) error
// ValidateUpdate validates the update to the data source.
// The data source implementation should respect the update and return an error
// if the update is invalid.
ValidateUpdate(obj any) error
// Call calls the function with the given arguments.
// It is the responsibility of the data source implementation to handle the call.
// It is also the responsibility of the caller to validate the arguments
// before calling the function.
Call(args any) (any, error)
}

// DataSource is the interface that a data source must implement.
// It implements several functions that will be used by the engine to
// interact with external systems. These get taken into used by the Evaluator.
// Moreover, a data source must be able to validate an update to itself.
type DataSource interface {
// Returns the registered name of the data source.
GetName() string

// GetFuncs returns the functions that the data source provides.
GetFuncs() map[DataSourceFuncKey]DataSourceFuncDef
}
136 changes: 136 additions & 0 deletions pkg/datasources/v1/mock/datasources.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Loading