Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix bug in lake scan when merging by non-ts pool key #2752

Merged
merged 1 commit into from
May 21, 2021
Merged

Conversation

mccanne
Copy link
Collaborator

@mccanne mccanne commented May 20, 2021

closes #2749

@mccanne mccanne requested a review from a team May 20, 2021 23:26
if len(pullers) == 1 {
merger = pullers[0]
} else if len(keys) > 0 && len(keys[0]) == 1 && keys[0][0] == "ts" {
merger = zbuf.MergeByTs(ctx, pullers, pool.Layout.Order)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're eventually going to have to address the fact that zbuf.MergeByTs doesn't do the right thing for records with a non-time ts field.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can easily be part of the JIT-CG. The scan scheduler can look at the metadata across a partition to see inspect the type signature(s) for the pool key.

@philrz
Copy link
Contributor

philrz commented May 20, 2021

FWIW, I just tested out the branch using the original repro steps and it looks good to me too.

$ zed -version
Version: v0.29.0-350-gce2003e6

$ rm -rf $ZED_LAKE_ROOT

$ zed lake init
using environment variable ZED_LAKE_ROOT
lake created: file:///Users/phil/logs

$ zed lake create -S 5MB -p logs -orderby id.resp_h:asc
pool created: logs

$ zed lake load -p logs ~/work/zed-sample-data/zeek-default/conn.log.gz
1sozz07rJ7a6VZ3SIRwdLALOJGj committed 17 segments

$ zed lake query -z 'from logs | cut id.resp_h'
{id:{resp_h:2.22.230.64}}
{id:{resp_h:2.22.230.64}}
{id:{resp_h:2.22.230.64}}
{id:{resp_h:4.53.160.75}}
{id:{resp_h:4.53.160.75}}
{id:{resp_h:4.53.160.75}}
{id:{resp_h:4.53.160.75}}
{id:{resp_h:5.9.78.71}}
{id:{resp_h:5.9.102.20}}
...

$ zed lake query -z 'from logs | cut id.resp_h' > /tmp/ip-query

$ zed lake query -z 'from logs | cut id.resp_h | sort' > /tmp/ip-query-sorted

$ diff /tmp/ip-query /tmp/ip-query-sorted
$ echo $?
0

@philrz philrz self-requested a review May 20, 2021 23:43
@mccanne mccanne merged commit eee3982 into main May 21, 2021
@mccanne mccanne deleted the fix-lake-merge branch May 21, 2021 01:04
brim-bot pushed a commit to brimdata/brimcap that referenced this pull request May 21, 2021
…key" by mccanne

This is an auto-generated commit with a Zed dependency update. The Zed PR
brimdata/super#2752, authored by @mccanne,
has been merged.

fix bug in lake scan when merging by non-ts pool key

closes brimdata/super#2749
brim-bot pushed a commit to brimdata/brimcap that referenced this pull request May 21, 2021
…key" by mccanne

This is an auto-generated commit with a Zed dependency update. The Zed PR
brimdata/super#2752, authored by @mccanne,
has been merged.

fix bug in lake scan when merging by non-ts pool key

closes brimdata/super#2749
brim-bot pushed a commit to brimdata/zui that referenced this pull request May 21, 2021
…key" by mccanne

This is an auto-generated commit with a Zed dependency update. The Zed PR
brimdata/super#2752, authored by @mccanne,
has been merged.

fix bug in lake scan when merging by non-ts pool key

closes brimdata/super#2749
@philrz philrz linked an issue May 21, 2021 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants