Super-fy the Super Columnar format doc #5399

Merged 1 commit on Oct 31, 2024
10 changes: 5 additions & 5 deletions CHANGELOG.md
@@ -165,7 +165,7 @@
> [specific guidance for users of the Zed CLI tools](https://github.com/brimdata/zed-lake-migration#zed-cli-tools).

* Zed lake storage format is now at version 3 (#4386, #4415)
* Allow loading and responses in [VNG](docs/formats/vng.md) format over the lake API (#4345)
* Allow loading and responses in [VNG](docs/formats/csup.md) format over the lake API (#4345)
* Fix an issue where [record spread expressions](docs/language/expressions.md#record-expressions) could cause a crash (#4359)
* Fix an issue where the Zed service `/version` endpoint returned "unknown" if it had been built via `go install` (#4371)
* Branch-level [meta-queries](docs/commands/zed.md#meta-queries) on the `main` branch no longer require an explicit `@main` reference (#4377, #4394)
@@ -177,7 +177,7 @@

## v1.5.0
* Add `float16` primitive type (#4301)
* Add segment compression to the [VNG](docs/formats/vng.md) format (#4299)
* Add segment compression to the [VNG](docs/formats/csup.md) format (#4299)
* Add `-unbuffered` flag to `zed` and `zq` (#4320)
* Add `-csv.delim` flag to `zed` and `zq` for reading CSV with non-comma delimiter (#4325)
* Add `csv.delim` query parameter to lake API for reading CSV with non-comma delimiter (#4333)
@@ -186,7 +186,7 @@
* Fix an issue where type decorators of union values were leaking into CSV output (#4338)

## v1.4.0
* The ZST format is now called [VNG](docs/formats/vng.md) (#4256)
* The ZST format is now called [VNG](docs/formats/csup.md) (#4256)
* Allow loading of "line" format over the lake API (#4229)
* Allow loading of Parquet format over the lake API (#4235)
* Allow loading of Zeek TSV format over the lake API (#4246)
@@ -629,7 +629,7 @@ questions.
## v0.23.0
* zql: Add `week` as a unit for [time grouping with `every`](docs/language/functions/every.md) (#1374)
* zq: Fix an issue where a `null` value in a [JSON type definition](docs/integrations/zeek/README.md) caused a failure without an error message (#1377)
* zq: Add [`zst` format](docs/formats/vng.md) to `-i` and `-f` command-line help (#1384)
* zq: Add [`zst` format](docs/formats/csup.md) to `-i` and `-f` command-line help (#1384)
* zq: ZNG spec and `zq` updates to introduce the beta ZNG storage format (#1375, #1415, #1394, #1457, #1512, #1523, #1529), also addressing the following:
* New data type `bytes` for storing sequences of bytes encoded as base64 (#1315)
* Improvements to the `enum` data type (#1314)
@@ -693,7 +693,7 @@ questions.
* zqd: Fix an issue where starting `zqd listen` created excess error messages when subdirectories were present (#1303)
* zql: Add the [`fuse` operator](docs/language/operators/fuse.md) for unifying records under a single schema (#1310, #1319, #1324)
* zql: Fix broken links in documentation (#1321, #1339)
* zst: Introduce the [ZST format](docs/formats/vng.md) for columnar data based on ZNG (#1268, #1338)
* zst: Introduce the [ZST format](docs/formats/csup.md) for columnar data based on ZNG (#1268, #1338)
* pcap: Fix an issue where certain pcapng files could fail import with a `bad option length` error (#1341)
* zql: [Document the `**` operator](docs/language/README.md#search-syntax) for type-specific searches that look within nested records (#1337)
* zar: Change the archive data file layout to prepare for handing chunk files with overlapping ranges and improved S3 support (#1330)
2 changes: 1 addition & 1 deletion docs/README.md
@@ -41,7 +41,7 @@ that underlie the super-structured data formats.
* The [super data formats](formats/README.md) are a family of
[human-readable (Super JSON, JSUP)](formats/jsup.md),
[sequential (Super Binary, BSUP)](formats/bsup.md), and
[columnar (Super Columnar, CSUP)](formats/vng.md) formats that all adhere to the
[columnar (Super Columnar, CSUP)](formats/csup.md) formats that all adhere to the
same abstract super data model.
* The [SuperPipe language](language/README.md) is the system's pipeline language for performing
queries, searches, analytics, transformations, or any of the above combined together.
2 changes: 1 addition & 1 deletion docs/commands/zed.md
@@ -118,7 +118,7 @@ replication easy to support and deploy.
The cloud objects that comprise a lake, e.g., data objects,
commit history, transaction journals, partial aggregations, etc.,
are stored as Zed data, i.e., either as [row-based Super Binary](../formats/bsup.md)
Review comment (Collaborator): Zed?

or [columnar VNG](../formats/vng.md).
or [Super Columnar](../formats/csup.md).
This makes introspection of the lake structure straightforward as many key
lake data structures can be queried with metadata queries and presented
to a client as Zed data for further processing by downstream tooling.
6 changes: 3 additions & 3 deletions docs/commands/zq.md
@@ -100,7 +100,7 @@ Note here that the query `1+1` [implies](../language/pipeline-model.md#implied-o
| `line` | no | One string value per input line |
| `parquet` | yes | [Apache Parquet](https://github.com/apache/parquet-format) |
| `tsv` | yes | [TSV - Tab-Separated Values](https://en.wikipedia.org/wiki/Tab-separated_values) |
| `vng` | yes | [VNG - Binary Columnar Format](../formats/vng.md) |
| `csup` | yes | [Super Columnar](../formats/csup.md) |
| `zeek` | yes | [Zeek Logs](https://docs.zeek.org/en/master/logs/index.html) |
| `zjson` | yes | [ZJSON - Zed over JSON](../formats/zjson.md) |
| `bsup` | yes | [Super Binary](../formats/bsup.md) |
@@ -158,7 +158,7 @@ JSON any number that appears without a decimal point as an integer type.

:::tip note
The reason `zq` is not particularly performant for ZSON is that the ZNG or
[VNG](../formats/vng.md) formats are semantically equivalent to ZSON but much more efficient and
[Super Columnar](../formats/csup.md) formats are semantically equivalent to ZSON but much more efficient and
the design intent is that these efficient binary formats should be used in
use cases where performance matters. ZSON is typically used only when
data needs to be human-readable in interactive settings or in automated tests.
@@ -186,7 +186,7 @@ typically omit quotes around field names.
| `table` | (described [below](#simplified-text-outputs)) |
| `text` | (described [below](#simplified-text-outputs)) |
| `tsv` | [TSV - Tab-Separated Values](https://en.wikipedia.org/wiki/Tab-separated_values) |
| `vng` | [VNG - Binary Columnar Format](../formats/vng.md) |
| `csup` | [Super Columnar](../formats/csup.md) |
| `zeek` | [Zeek Logs](https://docs.zeek.org/en/master/logs/index.html) |
| `zjson` | [ZJSON - Zed over JSON](../formats/zjson.md) |
| `bsup` | [Super Binary](../formats/bsup.md) |
2 changes: 1 addition & 1 deletion docs/formats/README.md
@@ -271,7 +271,7 @@ documents are Super JSON values as the Super JSON format is a strict superset of
* [Super Binary](bsup.md) is a row-based, binary representation somewhat like
Avro but leveraging the super data model to represent a sequence of arbitrarily-typed
values.
* [Super Columnar](vng.md) is columnar like Parquet or ORC but also
* [Super Columnar](csup.md) is columnar like Parquet or ORC but also
embodies the super data model for heterogeneous and self-describing schemas.
* [Super JSON over JSON](zjson.md) defines a format for encapsulating Super JSON
inside plain JSON for easy decoding by JSON-based clients, e.g.,
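Every hunk above retargets a Markdown link from `vng.md` to `csup.md`, so a rename like this is only complete once no link to the old file name remains. The following is a minimal sketch of how one might check for leftovers. It is not part of this PR: the function names `find_stale_links` and `scan_tree` are hypothetical, and the regular expression covers only simple inline links of the form `[text](target)`.

```python
import re
from pathlib import Path

# Matches simple Markdown inline links and captures the target: [text](target)
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def find_stale_links(text, old_name="vng.md"):
    """Return (line_number, target) pairs for links whose file name is old_name.

    Hypothetical helper: any '#anchor' suffix is ignored when comparing names.
    """
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for target in LINK_RE.findall(line):
            path = target.split("#", 1)[0]          # drop the anchor, if any
            if path.rsplit("/", 1)[-1] == old_name:  # compare the file name only
                hits.append((lineno, target))
    return hits


def scan_tree(root, old_name="vng.md"):
    """Map each Markdown file under root to its stale links (hypothetical helper)."""
    results = {}
    for md in sorted(Path(root).rglob("*.md")):
        hits = find_stale_links(md.read_text(encoding="utf-8"), old_name)
        if hits:
            results[str(md)] = hits
    return results
```

Running `scan_tree(".")` from the repository root after applying the diff would be expected to return an empty dictionary, assuming no other files still link to `vng.md`.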