From ab89d90b0dba0ab45b2c01215106cdbd8e795331 Mon Sep 17 00:00:00 2001
From: Phil Rzewski
Date: Mon, 14 Oct 2024 14:51:12 -0700
Subject: [PATCH] wip

---
 docs/README.md | 29 ++++++++++++++++-------------
 1 file changed, 16 insertions(+), 13 deletions(-)

diff --git a/docs/README.md b/docs/README.md
index f7bc009764..d3fe29ebea 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -15,13 +15,12 @@ without giving up JSON's uncanny ability to represent eclectic data.
 Trying out SuperDB is easy: just [install](install.md) the command-line tool
 [`super`](commands/zq.md) and run through the [tutorial](tutorials/zq.md).
 
-`super` is a lot like [`jq`](https://stedolan.github.io/jq/)
-but is built from the ground up as a search and analytics engine based
-on the [super-structured data model](formats/zed.md). Since super-structured data is a
-proper superset of JSON, `super` also works natively with JSON.
-
-While `super` and its accompanying data formats are production quality, the project's
-[SuperDB data lake](commands/zed.md) is a bit [earlier in development](commands/zed.md#status).
+Compared to putting JSON data in a relational column, the
+[super-structured data model](formats/zed.md) makes it really easy to
+mash up JSON with your relational tables. The `super` command is a little
+like [DuckDB](https://duckdb.org/) and a little like
+[`jq`](https://stedolan.github.io/jq/) but super-structured data ties the
+two patterns together with strong typing of dynamic values.
 
 For a non-technical user, SuperDB is as easy to use as web search
 while for a technical user, SuperDB exposes its technical underpinnings
@@ -30,6 +29,9 @@ packaged up in the easy-to-understand
 [Super JSON data format](formats/zson.md) and
 [SuperPipe language](language/README.md).
 
+While `super` and its accompanying data formats are production quality, the project's
+[SuperDB data lake](commands/zed.md) is a bit [earlier in development](commands/zed.md#status).
+
 ## Terminology
 
 "Super" is an umbrella term that describes
@@ -37,12 +39,10 @@ a number of different elements of the system:
 * The [super-structured data model](formats/zed.md) is the abstract definition
 of the data types and semantics that underlie the super-structured data formats.
 * The [super-structured data formats](formats/README.md) are a family of
-[sequential (Super Buffers, SBUF)](formats/zng.md), [columnar (Super Parquet, SPAR)](formats/vng.md),
-and [human-readable (Super JSON, SUP)](formats/zson.md) formats that all adhere to the
+[human-readable (Super JSON, SUP)](formats/zson.md),
+[sequential (Binary Super JSON, SUPZ)](formats/zng.md), and
+[columnar (Super Parquet, SPAR)](formats/vng.md), formats that all adhere to the
 same abstract super-structured data model.
-* A [SuperDB data lake](commands/zed.md) is a collection of super-structured data stored
-across one or more [data pools](commands/zed.md#data-pools) with ACID commit semantics and
-accessed via a [Git](https://git-scm.com/)-like API.
 * The [SuperPipe language](language/README.md) is the system's pipeline language for performing
 queries, searches, analytics, transformations, or any of the above combined together.
 * A [SuperPipe query](language/overview.md) is a script that performs
@@ -52,6 +52,9 @@ data transformation to _shape_
 the input data into the desired set of organizing super-structured data types called "shapes",
 which are traditionally called _schemas_ in relational systems but are
 much more flexible in SuperDB.
+* A [SuperDB data lake](commands/zed.md) is a collection of super-structured data stored
+across one or more [data pools](commands/zed.md#data-pools) with ACID commit semantics and
+accessed via a [Git](https://git-scm.com/)-like API.
 
 ## Digging Deeper
 
@@ -92,7 +95,7 @@ at the copy of the lake.
 
 Functionality like [data compaction](commands/zed.md#manage) and retention are all API-driven.
 
-Bite-sized components are unified by the super-structured data, usually in the SBUF format:
+Bite-sized components are unified by the super-structured data, usually in the SUPZ format:
 * All lake meta-data is available via meta-queries.
 * All lake operations available through the service API are also available directly via the `super db` command.