Add docs related to zed/4758 changes #4811

Merged
merged 7 commits into from
Oct 25, 2023
6 changes: 3 additions & 3 deletions cmd/zed/use/command.go
@@ -14,15 +14,15 @@ import (
var Cmd = &charm.Spec{
Name: "use",
Usage: "use [pool][@branch]",
Short: "use a branch",
Short: "use a branch or print current branch and lake",
Long: `
The use command prints or sets the working pool and branch. Setting these
values allows commands like load, rebase, merge, etc. to function without
having to specify the working branch. The branch specifier may also be
a commit ID, in which case you enter a headless state and commands
like load that require a branch will report an error.

The use command is like "git checkuout" but there is no local copy of
The use command is like "git checkout" but there is no local copy of
the lake data. Rather, the local HEAD state influences commands as
they access the lake.

@@ -52,7 +52,7 @@ file ~/.zed_head. This file simply contains a pointer to the HEAD branch
and thus provides the default for the -use option. This way, multiple working
directories can contain different HEAD pointers (along with your local files)
and you can easily switch between windows without having to continually
re-specify a new HEAD. Unlike Git, all the commited pool data remains
re-specify a new HEAD. Unlike Git, all the committed pool data remains
in the lake and is not copied to this local directory.
`,
New: New,
37 changes: 27 additions & 10 deletions docs/commands/zed.md
@@ -74,7 +74,7 @@ components come together.
While the CLI-first approach provides these benefits,
all of the functionality is also exposed through [an API](../lake/api.md) to
a Zed service. Many use cases involve an application like
[Zui](https://github.com/brimdata/zui) or a
[Zui](https://zui.brimdata.io/) or a
programming environment like Python/Pandas interacting
with the service API in place of direct use with the `zed` command.

@@ -130,18 +130,17 @@ model onto a file system so that Zed lakes can also be deployed on standard file

The `zed` command provides a single command-line interface to Zed lakes, but
different personalities are taken on by `zed` depending on the particular
sub-command executed and the disposition of its `-lake` option
(which defaults to the value of `ZED_LAKE` environment variable or,
if `ZED_LAKE` is not set, to the client personality `https://localhost:9867`).
sub-command executed and the [lake location](#locating-the-lake).

To this end, `zed` can take on one of three personalities:

* _Direct Access_ - When the lake is a storage path (`file` or `s3` URI),
then the `zed` commands (except for `serve`) all operate directly on the
lake located at that path.
* _Client Personality_ - When the lake is an HTTP or HTTPS URL, then the
lake is presumed to be a Zed lake service endpoint and the client
commands are directed to the service managing the lake.
* _Server Personality_ - When the `zed serve` command is executed, then
* _Server Personality_ - When the [`zed serve`](#serve) command is executed, then
the personality is always the server personality and the lake must be
a storage path. This command initiates a continuous server process
that serves client requests for the lake at the configured storage path.
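
As a rough illustration of the three personalities listed above, the following Python sketch maps a sub-command and lake location to a personality. It is not taken from the `zed` source: the function name `personality` and its return strings are hypothetical, and treating a bare file system path as a storage path is an assumption.

```python
from urllib.parse import urlparse


def personality(lake: str, subcommand: str = "") -> str:
    """Return the personality implied by a sub-command and a lake location.

    Hypothetical helper for illustration only; not part of zed.
    """
    scheme = urlparse(lake).scheme.lower()
    is_service_url = scheme in ("http", "https")

    if subcommand == "serve":
        # The server personality requires a storage path, not a service URL.
        if is_service_url:
            raise ValueError("serve requires a storage path, not a URL")
        return "server"
    if is_service_url:
        # HTTP(S) lake locations are presumed to be lake service endpoints.
        return "client"
    # file/s3 URIs; bare paths are assumed to count as storage paths too.
    return "direct"
```

Under these assumptions, `personality("s3://bucket/lake")` returns `"direct"`, while `personality("https://localhost:9867")` returns `"client"`.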
@@ -167,6 +166,25 @@ all adhere to the consistency semantics of the Zed lake.
> a server instance of `zed` on the same file system and data consistency will
> be maintained.

### Locating the Lake

At times you may want the Zed CLI tools to access the same lake storage
used by other tools such as [Zui](https://zui.brimdata.io/). To help
enable this by default while allowing for separate lake storage when desired,
`zed` checks each of the following in order to attempt to locate an existing
lake.

1. The contents of the `-lake` option (if specified)
2. The contents of the `ZED_LAKE` environment variable (if defined)
3. A Zed lake service running locally at `http://localhost:9867` (if a socket
is listening at that port)
Comment on lines +179 to +180
Member

Maybe tighten this?

Suggested change
3. A Zed lake service running locally at `http://localhost:9867` (if a socket
is listening at that port)
3. A Zed lake service running at `http://localhost:9867`

Contributor Author

@nwt: I actually kept this one un-tight intentionally. During black box testing I noticed that if I run a simple program that just listens on 9867 and accepts the incoming connection, and the lake location logic makes it to this step, zed effectively hangs indefinitely. I guess it's a hair-split to think that the text as I wrote it communicates this subtlety. But I was trying to express that all it takes is a listening socket for it to commit to this step.

#!/usr/local/bin/python3
# Listen on a TCP port and accept connections without ever responding.
import socket
import sys

if len(sys.argv) != 2 or not sys.argv[1].isdigit():
    print('Usage: listen <port>')
    sys.exit(1)

port = int(sys.argv[1])
conns = []  # hold references so accepted sockets stay open
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(('', port))
s.listen(1)
while True:
    conn, addr = s.accept()
    conns.append(conn)
    print('%d: connection from %s' % (len(conns), addr))
Member

@philrz: That's a bug!

Contributor Author

@nwt: Cool! Glad we discussed. I've just opened #4824 to track this.
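
For readers following the thread, here is a minimal Python sketch of how a probe might tell a bare listening socket apart from a service that actually answers, by bounding both the connect and the read with a timeout. This is not how `zed` performs its check, nor the fix tracked in #4824. The function name `service_responds`, the request path `/`, and the one-second default are all assumptions made for illustration.

```python
import socket


def service_responds(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True only if host:port answers an HTTP request within timeout.

    Hypothetical helper for illustration only; not part of zed.
    """
    try:
        # The timeout given here also applies to the later send and recv.
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"GET / HTTP/1.0\r\nHost: localhost\r\n\r\n")
            # A silent listener never replies, so recv raises a timeout.
            return conn.recv(16).startswith(b"HTTP/")
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

Pointed at the listener script above, a probe like this would give up after the timeout and return `False` instead of waiting indefinitely.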

4. A `zed` subdirectory below a path in the
[`XDG_DATA_HOME`](https://specifications.freedesktop.org/basedir-spec/basedir-spec-latest.html)
environment variable (if defined)
Comment on lines +181 to +183
Member

Maybe tighten this?

Suggested change
4. A `zed` subdirectory below a path in the
[`XDG_DATA_HOME`](https://specifications.freedesktop.org/basedir-spec/basedir-spec-latest.html)
environment variable (if defined)
4. `$XDG_DATA_HOME/zed` (if the `XDG_DATA_HOME` environment variable is defined)

Contributor Author

@nwt: This one was also intentionally un-tight because even on Windows the XDG_DATA_HOME environment variable may be obeyed. Obviously users have their choice of shell on Windows, and some use $ to expand. But since the out-of-the-box Command Prompt uses dual % to expand environment variables, I was keeping it higher level. Maybe another futile hair-split, though.

5. A default file system location based on detected OS platform:
- `%LOCALAPPDATA%\zed` on Windows
- `$HOME/.local/share/zed` on Linux and macOS
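
The lookup order in the list above can be sketched in Python as follows. This is an illustrative model under stated assumptions, not the `zed` implementation: the names `locate_lake` and `is_listening` are hypothetical, an empty environment variable is treated as undefined, and the sketch returns the first applicable candidate without checking whether a lake actually exists there.

```python
import os
import socket
import sys


def is_listening(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def locate_lake(lake_flag=None, env=None, platform=None, port_open=None):
    """Return the first applicable lake location, in the documented order.

    Hypothetical helper; env, platform, and port_open exist so the logic
    can be exercised without touching the real environment or network.
    """
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform

    if lake_flag:                          # 1. the -lake option
        return lake_flag
    if env.get("ZED_LAKE"):                # 2. the ZED_LAKE variable
        return env["ZED_LAKE"]
    if port_open is None:
        port_open = is_listening("localhost", 9867)
    if port_open:                          # 3. a local lake service
        return "http://localhost:9867"
    if env.get("XDG_DATA_HOME"):           # 4. zed below XDG_DATA_HOME
        return os.path.join(env["XDG_DATA_HOME"], "zed")
    if platform.startswith("win"):         # 5. OS-specific default
        return env.get("LOCALAPPDATA", "") + "\\zed"
    return env.get("HOME", "") + "/.local/share/zed"
```

For example, with no `-lake` option, no relevant environment variables, and nothing listening on port 9867, the sketch falls through to the platform default in step 5.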

### Data Pools

A lake is made up of _data pools_, which are like "collections" in NoSQL
@@ -415,8 +433,8 @@ without confirmation.
zed init [path]
```
A new lake is initialized with the `init` command. The `path` argument
is a [storage path](#storage-layer) and is optional. If not present,
the path is taken from the `ZED_LAKE` environment variable, which must be defined.
is a [storage path](#storage-layer) and is optional. If not present, the path
is [determined automatically](#locating-the-lake).

If the lake already exists, `init` reports an error and does nothing.

@@ -712,7 +730,7 @@ pool `<existing>`, which may be referenced by its ID or its previous name.
zed serve [options]
```
The `serve` command implements Zed's server personality to service requests
from instances of Zed's client personality.
from instances of Zed's client [personality](#zed-command-personalities).
It listens for Zed lake API requests on the interface and port
specified by the `-l` option, executes the requests, and returns results.

@@ -727,8 +745,7 @@ is recommended.
zed use [<commitish>]
```
The `use` command sets the working branch to the indicated commitish.
When run without a commitish argument, it displays the current commitish
in use.
When run with no argument, it displays the working branch and [lake](#locating-the-lake).

For example,
```