From b80c76208f4b487c822f313c475708438994d45b Mon Sep 17 00:00:00 2001
From: Steven McCanne
Date: Sat, 26 Oct 2024 10:30:44 -0700
Subject: [PATCH] address PR feedback

---
 README.md                       | 18 +++++++++---------
 docs/formats/jsup.md            |  2 +-
 docs/language/pipeline-model.md |  4 ++--
 3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/README.md b/README.md
index 45e73f36a6..b26a378aad 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
 # SuperDB [![Tests][tests-img]][tests] [![GoPkg][gopkg-img]][gopkg]
-SuperDB is a new analytics database that supports relational tables and JSON
+SuperDB is a new analytics database that supports relational tables and JSON
 on an equal footing. It shines when it comes to data wrangling where you need to explore or process large eclectic data sets. It's also pretty decent at analytics and
@@ -23,7 +23,7 @@ system for semi-structured data, all data handled by SuperDB (e.g., JSON, CSV,
 Parquet files, Arrow streams, relational tables, etc) is automatically massaged into [super-structured data](https://zed.brimdata.io/docs/formats/#2-zed-a-super-structured-pattern) form. This super-structured data is then processed by a runtime that simultaneously
-supports the statically-typed relational model and the dynamically-typed
+supports the statically-typed relational model and the dynamically-typed
 JSON data model in a unified compute engine.
 ## SuperSQL
@@ -39,7 +39,7 @@ FROM 'https://data.gharchive.org/2015-01-01-15.json.gz'
 GROUP BY user ORDER BY len(repo) DESC LIMIT 5 |> FORK (
-=> FROM f"https://api.github.com/users/${user}"
+=> FROM f"https://api.github.com/users/${user}"
 |> SELECT VALUE {user:login,created_at:time(created_at)} => PASS )
@@ -48,10 +48,10 @@ FROM 'https://data.gharchive.org/2015-01-01-15.json.gz'
 ## Super JSON
-Super-structured data is strongly typed and "polymorphic": any value can take on any type
+Super-structured data is strongly typed and "polymorphic": any value can take on any type
 and sequences of data need not all conform to a predefined schema. To this end, SuperDB extends the JSON format to support super-structured data in a format called
-[Super JSON](https://zed.brimdata.io/docs/formats/jsup) where all JSON values
+[Super JSON](https://zed.brimdata.io/docs/formats/next/jsup) where all JSON values
 are also Super JSON values. Similarly, the [Super Binary](https://zed.brimdata.io/docs/formats/zng) format is an efficient binary representation of Super JSON (a bit like Avro) and the
@@ -78,7 +78,7 @@ using the `super db` sub-commands.
 ## Piped Query Syntax
-The long-term goal for SuperDB's SQL syntax (SuperSQL) is to be Postgres-compatible and interoperate
+The long-term goal for SuperDB's SQL syntax (SuperSQL) is to be Postgres-compatible and interoperate
 with BI tools though this is currently a roadmap item. At the same time, the project seeks to forge new ground on the usability of SQL for data exploration. To this end, SuperSQL supports the
@@ -86,15 +86,15 @@ SuperSQL supports the
 of GoogleSQL, recently described in their [VLDB 2024 paper](https://research.google/pubs/sql-has-problems-we-can-fix-them-pipe-syntax-in-sql/).
-In addition to the GoogleSQL syntax, SuperSQL includes additional pipeline
-operators to enhance usability, e.g., for search, for traversing
+In addition to the GoogleSQL syntax, SuperSQL includes additional pipeline
+operators to enhance usability, e.g., for search, for traversing
 highly nested JSON, for data shaping, etc. To facilitate real-time, data exploration use cases, SuperDB supports an abbreviated form of SuperSQL called [SuperPipe](https://zed.brimdata.io/docs/language).
-SuperPipe provides a large number of shortcuts when typing interactive
+SuperPipe provides a large number of shortcuts when typing interactive
 queries, e.g., implied group-by clauses, dropping keywords, implied keyword searches, and so forth. Even though SuperPipe is simply a short-hand form SuperSQL, it sort of looks like the pipeline-style
diff --git a/docs/formats/jsup.md b/docs/formats/jsup.md
index 5058b49263..9bc28b5baa 100644
--- a/docs/formats/jsup.md
+++ b/docs/formats/jsup.md
@@ -449,7 +449,7 @@ can be interpreted as a _table_, where record values form the _rows_ and the fields of the records form the _columns_. In this way, these three records form a relational table conforming to the schema `city_schema`.
-In contrast, a text representing a semi-structured sequence of log lines
+In contrast, text representing a semi-structured sequence of log lines
 might look like this:
 ```
 {
diff --git a/docs/language/pipeline-model.md b/docs/language/pipeline-model.md
index 05ec24fda0..96dcdcc362 100644
--- a/docs/language/pipeline-model.md
+++ b/docs/language/pipeline-model.md
@@ -107,7 +107,7 @@ a [`merge` operator](operators/merge.md) may be applied at the output of the switch specifying a sort key upon which to order the upstream data.
 Often such order does not matter (e.g., when the output of the switch hits an [aggregator](aggregates/README.md)), in which case it is typically more performant
-to omit the merge (though the super runtime will often delete such unnecessary
+to omit the merge (though the SuperDB runtime will often delete such unnecessary
 operations automatically as part optimizing queries when they are compiled). If no `merge` or `join` is indicated downstream of a `fork` or `switch`,
@@ -202,7 +202,7 @@ in later expressions.
 ## Implied Operators
-When SuperPipe is utilized in an application like [SuperDB desktop](https://zui.brimdata.io),
+When SuperPipe is utilized in an application like [SuperDB Desktop](https://zui.brimdata.io),
 queries are often composed interactively in a "search bar" experience. The language design here attempts to support both this "lean forward" pattern of usage along with a "coding style" of query writing where the queries might be large