Operator

summarize — perform aggregations

Synopsis

summarize [<field>:=]<agg> [where <expr>][, [<field>:=]<agg> [where <expr>]] [by [<field>][:=<expr>] ...]

Description

The summarize operator consumes all of its input, applying each aggregate function to every input value, optionally grouped by the keys specified after the by keyword. At the end of input, it produces one or more aggregations for each unique set of group-by key values.
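As a rough illustration of the consume-then-emit behavior described above, here is a minimal Python sketch of a grouped sum. It is not how zq is implemented; the function name `summarize_sum` and the dict-based records are assumptions made for this example only.

```python
from collections import defaultdict

def summarize_sum(records, key_field, value_field):
    """Hypothetical model of 'sum(<value_field>) by <key_field>'."""
    totals = defaultdict(int)
    # Consume the entire input, updating one running total per key.
    for rec in records:
        totals[rec[key_field]] += rec[value_field]
    # Only after the input is exhausted, emit one record per unique key.
    return [{key_field: k, "sum": total} for k, total in totals.items()]

records = [
    {"k": "foo", "v": 1},
    {"k": "bar", "v": 2},
    {"k": "foo", "v": 3},
    {"k": "baz", "v": 4},
]
# Sort for a deterministic order, as the shell examples do with sort.
result = sorted(summarize_sum(records, "k", "v"), key=lambda r: r["k"])
print(result)
```

No output can be produced before the last input value is seen, because any later value might still change a group's total.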

Each aggregate function may optionally be followed by a where clause: a Boolean expression that determines, for each input value, whether that value is delivered to the aggregate.
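The per-aggregate filtering can be pictured with a small Python sketch of a union aggregate guarded by a predicate. This is an assumed model, not zq code: `summarize_union` and its `where` parameter are hypothetical names, and Python sets (printed as sorted lists) stand in for Zed sets.

```python
from collections import defaultdict

def summarize_union(records, key_field, value_field, where=None):
    """Hypothetical model of 'set:=union(v) where <expr> by key:=k'."""
    sets = defaultdict(set)
    for rec in records:
        # The predicate is evaluated per input value; only values that
        # pass are delivered to the aggregate.
        if where is None or where(rec):
            sets[rec[key_field]].add(rec[value_field])
    # Note: in this sketch a key whose values are all rejected never
    # appears; the text above does not specify that case.
    return [{"key": k, "set": sorted(s)} for k, s in sets.items()]

records = [
    {"k": "foo", "v": 1},
    {"k": "bar", "v": 2},
    {"k": "foo", "v": 3},
    {"k": "baz", "v": 4},
]
result = sorted(
    summarize_union(records, "k", "v", where=lambda r: r["v"] > 1),
    key=lambda r: r["key"],
)
print(result)
```

Here the value 1 for key "foo" is rejected by the predicate, so only 3 reaches that key's set.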

The output field names for each aggregate and each key are optional. If omitted, a field name is inferred from each right-hand side, e.g., the output field for the sum aggregate function is simply sum.

A key may be either an expression or a field. If the key's field name is omitted, it is inferred from the expression, e.g., the field name for by lower(s) is lower.

If the cardinality of group-by keys causes the memory footprint to exceed a limit, then each aggregate's partial results are spilled to temporary storage and the results merged into final results using an external merge sort. The same mechanism that spills to storage can also spill across the network to a cluster of workers in an adaptive shuffle, though this is not yet implemented.
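The spill-and-merge idea can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the operator's actual implementation: the limit is expressed as a number of keys (`max_keys`, a hypothetical parameter) rather than bytes, partial results are written as JSON lines to temporary files, and only a sum aggregate is modeled.

```python
import heapq
import json
import tempfile
from itertools import groupby
from operator import itemgetter

def spill(table):
    """Write one run of (key, partial_sum) pairs, sorted by key."""
    run = tempfile.TemporaryFile(mode="w+")
    for key in sorted(table):
        run.write(json.dumps([key, table[key]]) + "\n")
    run.seek(0)
    return run

def read_run(run):
    """Stream (key, partial_sum) pairs back from a spilled run."""
    for line in run:
        key, partial = json.loads(line)
        yield key, partial

def sum_by_key(pairs, max_keys):
    """Sum values per key, spilling partial sums past max_keys keys."""
    table, runs = {}, []
    for key, value in pairs:
        table[key] = table.get(key, 0) + value
        if len(table) > max_keys:  # assumed stand-in for a memory limit
            runs.append(spill(table))
            table = {}
    if table:
        runs.append(spill(table))
    # External merge: the runs are each sorted by key, so a k-way merge
    # brings equal keys together and their partials can be combined.
    merged = heapq.merge(*(read_run(r) for r in runs), key=itemgetter(0))
    for key, group in groupby(merged, key=itemgetter(0)):
        yield key, sum(partial for _, partial in group)
    for run in runs:
        run.close()

pairs = [("foo", 1), ("bar", 2), ("foo", 3), ("baz", 4), ("bar", 5)]
result = list(sum_by_key(pairs, max_keys=2))
print(result)
```

This works for sum because partial sums can themselves be summed; an aggregate used this way needs a partial result that can be combined with other partials for the same key.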

Examples

Sum the input sequence:

echo '1 2 3 4' | zq -z 'sum(this)' -

=>

{sum:10}

Create integer sets by key and sort the output to get a deterministic order:

echo '{k:"foo",v:1}{k:"bar",v:2}{k:"foo",v:3}{k:"baz",v:4}' | zq -z 'set:=union(v) by key:=k' - | sort

=>

{key:"bar",set:|[2]|}
{key:"baz",set:|[4]|}
{key:"foo",set:|[1,3]|}

Use a where clause:

echo '{k:"foo",v:1}{k:"bar",v:2}{k:"foo",v:3}{k:"baz",v:4}' | zq -z 'set:=union(v) where v > 1 by key:=k' - | sort

=>

{key:"bar",set:|[2]|}
{key:"baz",set:|[4]|}
{key:"foo",set:|[3]|}