Commit
Implement vector non-predicate search terms (#5521)
Specifically, this includes regular expression, glob, keyword, and literal search terms (i.e., dag.Search and dag.RegexpSearch).
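Both kinds of search term reduce to a predicate over raw bytes, which is the role of the `stringPred func([]byte) bool` field in the code added by this commit. The following is a minimal, standalone sketch of that idea using only the Go standard library. The names `containsFold`, `keywordPred`, and `regexpPred` are hypothetical and are not part of the repository, and `containsFold` only approximates case folding by lowercasing both sides; it is not `expr.StringContainsFold`.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// containsFold reports whether s contains substr, ignoring case.
// Assumption: lowercasing both sides is a simplified stand-in for
// proper Unicode case folding.
func containsFold(s, substr string) bool {
	return strings.Contains(strings.ToLower(s), strings.ToLower(substr))
}

// keywordPred (hypothetical) builds a byte-slice predicate for a
// keyword or literal search term.
func keywordPred(term string) func([]byte) bool {
	return func(b []byte) bool {
		return containsFold(string(b), term)
	}
}

// regexpPred (hypothetical) builds a byte-slice predicate for a
// regular expression search term. (*regexp.Regexp).Match already has
// the func([]byte) bool shape, so the method value is returned as is.
func regexpPred(pattern string) func([]byte) bool {
	return regexp.MustCompile(pattern).Match
}

func main() {
	kw := keywordPred("foo")
	re := regexpPred(`^foo.*$`)
	for _, s := range []string{"hello", "foox", "FOO"} {
		fmt.Printf("%q keyword=%v regexp=%v\n", s, kw([]byte(s)), re([]byte(s)))
	}
}
```

Giving both term kinds the same function shape means a single matching loop can serve either one, which appears to be why the commit stores one `stringPred` field regardless of how the search was constructed.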
Showing 25 changed files with 228 additions and 9 deletions.
@@ -1,5 +1,7 @@
zed: '? "foo\"bar"'

vector: true

input: |
  {s:"foo\"bar"}
  {s:"foobar"}
@@ -1,5 +1,7 @@
zed: '? *1'

vector: true

input: |
  "1"
  "a1"
@@ -2,6 +2,8 @@
# longer doing this
zed: '? x==1 inaction'

vector: true

input: |
  {x:1,text:"inaction"}
@@ -1,5 +1,7 @@
zed: '? \"foo'

vector: true

input: |
  {s:"foo"}
  {s:"\"foo"}
@@ -1,5 +1,7 @@
zed: '? grep("a") grep("b")'

vector: true

input: |
  {s1:"a",s2:"b"}
  {s1:"b",s2:"a"}
@@ -1,5 +1,7 @@
zed: grep("a")

vector: true

input: |
  {s1:"a",s2:"b"}
  {s1:"b",s2:"a"}
@@ -1,5 +1,7 @@
zed: '?bar'

vector: true

input: |
  {s1:"foo",s2:"bar"}
  {s1:"foo",s2:null(string)}
@@ -1,5 +1,7 @@
zed: '? bjørndal'

vector: true

input: '"bjørndal"'

output: |
@@ -1,5 +1,7 @@
zed: yield grep(*foo*)

vector: true

input: |
  "foo"
  1
@@ -1,6 +1,8 @@
zed: |
  ? /.*/
vector: true

input: &input |
  "a"
@@ -1,6 +1,8 @@
zed: |
  ? <{x:int64}>
vector: true

input: <int64> <string> <{x:int64}>

output: |
@@ -0,0 +1,142 @@
package expr

import (
	"net/netip"
	"regexp"
	"slices"
	"unsafe"

	"github.com/brimdata/super"
	"github.com/brimdata/super/runtime/sam/expr"
	"github.com/brimdata/super/vector"
)

type search struct {
	e          Evaluator
	vectorPred func(vector.Any) vector.Any
	stringPred func([]byte) bool
	fnm        *expr.FieldNameMatcher
}

func NewSearch(s string, val super.Value, e Evaluator) Evaluator {
	stringPred := func(b []byte) bool {
		return expr.StringContainsFold(string(b), s)
	}
	var net netip.Prefix
	if val.Type().ID() == super.IDNet {
		net = super.DecodeNet(val.Bytes())
	}
	eq := NewCompare(super.NewContext() /* XXX */, nil, nil, "==")
	vectorPred := func(vec vector.Any) vector.Any {
		if net.IsValid() && vector.KindOf(vec) == vector.KindIP {
			out := vector.NewBoolEmpty(vec.Len(), nil)
			for i := range vec.Len() {
				if ip, null := vector.IPValue(vec, i); !null && net.Contains(ip) {
					out.Set(i)
				}
			}
			return out
		}
		return eq.eval(vec, vector.NewConst(val, vec.Len(), nil))
	}
	return &search{e, vectorPred, stringPred, nil}
}

func NewSearchRegexp(re *regexp.Regexp, e Evaluator) Evaluator {
	return &search{e, nil, re.Match, expr.NewFieldNameMatcher(re.Match)}
}

func NewSearchString(s string, e Evaluator) Evaluator {
	pred := func(b []byte) bool {
		return expr.StringContainsFold(string(b), s)
	}
	return &search{e, nil, pred, expr.NewFieldNameMatcher(pred)}
}

func (s *search) Eval(this vector.Any) vector.Any {
	return vector.Apply(true, s.eval, s.e.Eval(this))
}

func (s *search) eval(vecs ...vector.Any) vector.Any {
	vec := vector.Under(vecs[0])
	typ := vec.Type()
	if s.fnm != nil && s.fnm.Match(typ) {
		return vector.NewConst(super.True, vec.Len(), nil)
	}
	if typ.Kind() == super.PrimitiveKind {
		return s.match(vec)
	}
	n := vec.Len()
	var index []uint32
	if view, ok := vec.(*vector.View); ok {
		vec = view.Any
		index = view.Index
	}
	switch vec := vec.(type) {
	case *vector.Record:
		out := vector.NewBoolEmpty(n, nil)
		for _, f := range vec.Fields {
			if index != nil {
				f = vector.NewView(f, index)
			}
			out = vector.Or(out, toBool(s.eval(f)))
		}
		return out
	case *vector.Array:
		return s.evalForList(vec.Values, vec.Offsets, index, n)
	case *vector.Set:
		return s.evalForList(vec.Values, vec.Offsets, index, n)
	case *vector.Map:
		return vector.Or(s.evalForList(vec.Keys, vec.Offsets, index, n),
			s.evalForList(vec.Values, vec.Offsets, index, n))
	case *vector.Union:
		return vector.Apply(true, s.eval, vec)
	case *vector.Error:
		return s.eval(vec.Vals)
	}
	panic(vec)
}

func (s *search) evalForList(vec vector.Any, offsets, index []uint32, length uint32) *vector.Bool {
	out := vector.NewBoolEmpty(length, nil)
	var index2 []uint32
	for j := range length {
		if index != nil {
			j = index[j]
		}
		start, end := offsets[j], offsets[j+1]
		if start == end {
			continue
		}
		n := end - start
		index2 = slices.Grow(index2[:0], int(n))[:n]
		for k := range n {
			index2[k] = k + start
		}
		view := vector.NewView(vec, index2)
		if toBool(s.eval(view)).TrueCount() > 0 {
			out.Set(j)
		}
	}
	return out
}

func (s *search) match(vec vector.Any) vector.Any {
	if vec.Type().ID() == super.IDString {
		out := vector.NewBoolEmpty(vec.Len(), nil)
		for i := range vec.Len() {
			str, null := vector.StringValue(vec, i)
			// Prevent compiler from copying str, which it thinks
			// escapes to the heap because stringPred is a pointer.
			bytes := unsafe.Slice(unsafe.StringData(str), len(str))
			if !null && s.stringPred(bytes) {
				out.Set(i)
			}
		}
		return out
	}
	if s.vectorPred != nil {
		return s.vectorPred(vec)
	}
	return vector.NewConst(super.False, vec.Len(), nil)
}
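The `evalForList` method above walks a flat values vector using an offsets slice, where list `j` occupies positions `offsets[j]` up to, but not including, `offsets[j+1]`, and marks a list as matching when any of its elements matches. The sketch below illustrates only that offsets technique with plain Go slices. It is an assumption-laden simplification: `anyPerList` is a hypothetical helper, it uses `[]string` and `[]bool` instead of `vector.Any` and `*vector.Bool`, and it does not model views or the `index` indirection.

```go
package main

import "fmt"

// anyPerList (hypothetical) reports, for each list, whether at least one
// of its elements satisfies pred. All elements live in the flat values
// slice; list j spans values[offsets[j]:offsets[j+1]].
func anyPerList(values []string, offsets []uint32, pred func(string) bool) []bool {
	if len(offsets) == 0 {
		return nil
	}
	out := make([]bool, len(offsets)-1)
	for j := range out {
		start, end := offsets[j], offsets[j+1]
		// An empty list (start == end) has no elements, so it stays false.
		for _, v := range values[start:end] {
			if pred(v) {
				out[j] = true
				break
			}
		}
	}
	return out
}

func main() {
	// Three lists: ["a","b"], [], ["c","d"].
	values := []string{"a", "b", "c", "d"}
	offsets := []uint32{0, 2, 2, 4}
	isD := func(s string) bool { return s == "d" }
	fmt.Println(anyPerList(values, offsets, isD)) // prints [false false true]
}
```

Storing one shared values slice plus offsets keeps every list's elements contiguous, so a per-list check only needs a subslice rather than a separate allocation per list.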
2 changes: 2 additions & 0 deletions
.../expr/ztests/filter-escaped-asterisk.yaml → .../ztests/expr/search-escaped-asterisk.yaml
@@ -1,5 +1,7 @@
zed: '? A\=\*'

vector: true

input: |
  {s:"A=B"}
  {s:"A=*"}
2 changes: 2 additions & 0 deletions
...xpr/ztests/filter-escaped-equal-sign.yaml → ...tests/expr/search-escaped-equal-sign.yaml
@@ -1,5 +1,7 @@
zed: '? A\=B'

vector: true

input: |
  {s:"A=B"}
  {s:"A=*"}
2 changes: 2 additions & 0 deletions
runtime/sam/expr/ztests/glob.yaml → runtime/ztests/expr/search-glob.yaml
@@ -1,5 +1,7 @@
zed: '? foo*'

vector: true

input: |
  {a:"hello",b:"there"}
  {a:"foox",b:"there"}