You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
After the changes in #5490, comparisons formerly expressed in the Zed language like san.ip==null are now expressed SQL style like san is null. However, I noticed that if it appears in a search rather than a where, the comparison using the SQL approach isn't treated as a binary expression, and this had some unexpected side effects for me.
Details
Repro is with super commit 95e746d, which is associated with the changes in #5490.
This repro is a simplification of an Autoperf query. As the original query was more complex and the data queried consisted of millions of large records across tens-of-GB of input data, as a user it was much more difficult to figure out what was going on than shown in the repro below.
Though of course I could have also been explicit and done either of these to get the same result:
$ zq 'where _path=="x509" and san.ip==null' data.json
{_path:"x509",id:"FnUlL92kZIhBUyfHZb",f:"This field",san:{ip:null}}
$ zq 'search _path=="x509" and san.ip==null' data.json
{_path:"x509",id:"FnUlL92kZIhBUyfHZb",f:"This field",san:{ip:null}}
That is, like I expect is true for many of our users, I've traditionally been lazy about noticing when implied operators turn my "searches" into where or search, since I've always been able to start with a binary expression like the one shown above (which turns into a where), tack on something like a regexp (which will turn it into a search), and the parser makes the necessary adjustments for me so it works as expected.
$ zq '/also/ or (_path=="x509" and san.ip==null)' data.json
{_path:"x509",id:"FnUlL92kZIhBUyfHZb",f:"This field",san:{ip:null}}
{_path:"x509",id:"FnUlL92kZIhBUyfHZb",f:"This is also field",san:{ip:"foo"}}
In the SuperDB era at commit 95e746d (which is associated with the changes in #5490), If we start from the same query, it no longer parses.
$ super -version
Version: v1.18.0-161-g95e746d3
$ super -c '_path=="x509" san.ip==null' data.json
parse error at line 1, column 15:
_path=="x509" san.ip==null
=== ^ ===
There's two things in particular that may need to be adjusted:
If we do just the first change, we get no output at all. FWIW, SQL still allows for the ==null comparison but it just doesn't behave the same as it did in Zed (blah!).
$ super -c '? _path=="x509" san.ip==null' data.json
[no output]
And if we do just the second change, it still doesn't parse.
$ super -c '_path=="x509" san.ip is null' data.json
parse error at line 1, column 15:
_path=="x509" san.ip is null
=== ^ ===
Therefore, acting as a fully educated user (i.e., let's assume I read some release notes for the Zed->SuperDB transition), I do both changes together. But the query result is not the same as in the Zed era.
$ super -c '? _path=="x509" san.ip is null' data.json
{_path:"x509",id:"FnUlL92kZIhBUyfHZb",f:"This field",san:{ip:null}}
{_path:"x509",id:"FnUlL92kZIhBUyfHZb",f:"This is also field",san:{ip:"foo"}}
What seems to be happening is that these are now being treated as separate search terms, e.g.,
The san.ip matches against the presence of the field key san.ip.
The is matches against strings like This and is in the contents of field f.
There's a case-insensitive string match on the nUlL in the id field.
If I was less lazy and hip to the fact that this all could be expressed as a where, I'd get my previous result (though I also had to now add the explicit and, per docs).
$ super -c 'where _path=="x509" and san.ip is null' data.json
{_path:"x509",id:"FnUlL92kZIhBUyfHZb",f:"This field",san:{ip:null}}
Overall, however, my guess is that we probably intended the is null (or is not null) to still be treated as a binary expression even in these searches and hence avoid the surprise here. I think this would allow us to give guidance to users to just add ? to the start of implied-searches-that-worked-in-Zed and still get the same expected results.
The text was updated successfully, but these errors were encountered:
tl;dr
After the changes in #5490, comparisons formerly expressed in the Zed language like
san.ip==null
are now expressed SQL style likesan is null
. However, I noticed that if it appears in asearch
rather than awhere
, the comparison using the SQL approach isn't treated as a binary expression, and this had some unexpected side effects for me.Details
Repro is with
super
commit 95e746d, which is associated with the changes in #5490.This repro is a simplification of an Autoperf query. As the original query was more complex and the data queried consisted of millions of large records across tens-of-GB of input data, as a user it was much more difficult to figure out what was going on than shown in the repro below.
Take this sample
data.json
:With the last GA Zed
v1.18.0
, isolating the record with thenull
value forsan.ip
looked like the following:Though of course I could have also been explicit and done either of these to get the same result:
That is, like I expect is true for many of our users, I've traditionally been lazy about noticing when implied operators turn my "searches" into
where
orsearch
, since I've always been able to start with a binary expression like the one shown above (which turns into awhere
), tack on something like a regexp (which will turn it into asearch
), and the parser makes the necessary adjustments for me so it works as expected.In the SuperDB era at commit 95e746d (which is associated with the changes in #5490), If we start from the same query, it no longer parses.
There's two things in particular that may need to be adjusted:
?
to express this as a searchis
rather than==
If we do just the first change, we get no output at all. FWIW, SQL still allows for the
==null
comparison but it just doesn't behave the same as it did in Zed (blah!).And if we do just the second change, it still doesn't parse.
Therefore, acting as a fully educated user (i.e., let's assume I read some release notes for the Zed->SuperDB transition), I do both changes together. But the query result is not the same as in the Zed era.
What seems to be happening is that these are now being treated as separate search terms, e.g.,
san.ip
matches against the presence of the field keysan.ip
.is
matches against strings likeThis
andis
in the contents of fieldf
.nUlL
in theid
field.If I was less lazy and hip to the fact that this all could be expressed as a
where
, I'd get my previous result (though I also had to now add the explicitand
, per docs).Overall, however, my guess is that we probably intended the
is null
(oris not null
) to still be treated as a binary expression even in these searches and hence avoid the surprise here. I think this would allow us to give guidance to users to just add?
to the start of implied-searches-that-worked-in-Zed and still get the same expected results.The text was updated successfully, but these errors were encountered: