Adds glossary and consolidates grammar #359

Merged
merged 7 commits into from
Oct 31, 2024
3 changes: 2 additions & 1 deletion _books/ion-1-1/src/SUMMARY.md
@@ -13,7 +13,6 @@
- [Shared modules](modules/shared_modules.md)
- [Inner modules](modules/inner_modules.md)
- [The system module](modules/system_module.md)
- [Grammar](modules/grammar.md)
- [Binary encoding](binary/encoding.md)
- [Encoding primitives](binary/primitives.md)
- [`FlexUInt`](binary/primitives/flex_uint.md)
@@ -39,6 +38,8 @@
- [E-expressions](binary/e_expressions.md)
- [Annotations](binary/annotations.md)
- [NOP](binary/nop.md)
- [Grammar](grammar.md)
- [Glossary](glossary.md)
<!--
The todo.md page is a placeholder target for links we haven't populated yet.
Only pages that are listed in `SUMMARY.md` will be shown to users; todo.md
159 changes: 159 additions & 0 deletions _books/ion-1-1/src/glossary.md
@@ -0,0 +1,159 @@
# Glossary

**active encoding module**<br/>
The _encoding module_ whose symbol table and macro table are available in the current _segment_ of an Ion _document_.
The active encoding module is set by a _directive_.
Comment on lines +3 to +5
Contributor


Thinking out loud, it would be really nice to incorporate something like this pre-processor to make phrases in the spec show their definition on hover.

Having a glossary is great, but not having to leave the page to define a term would be even better.

EDIT: I just saw that preprocessor's first and only issue. 😆


**argument**<br/>
The sub-expression(s) within a macro invocation, corresponding to exactly one of the macro's parameters.

**cardinality**<br/>
Describes the number of values that a parameter will accept when the macro is invoked.
One of zero-or-one, exactly-one, zero-or-more, or one-or-more.
Specified in a signature by one of the modifiers `?`, `!`, `*`, or `+`.

**declaration**<br/>
The association of a name with an entity (for example, a module or macro). See also _definition_.
Not all declarations are definitions: some introduce new names for existing entities.

**definition**<br/>
The specification of a new entity.

**directive**<br/>
A keyword or unit of data in an Ion document that affects the encoding environment, and thus the way the document's data is decoded.
In Ion 1.0 there are two directives: _Ion version markers_ and _symbol table directives_.
Ion 1.1 adds _encoding directives_.

**document**<br/>
A stream of octets conforming to either the Ion text or binary specification.
Can consist of multiple _segments_, perhaps using varying versions of the Ion specification.
A document does not necessarily exist as a file, and is not necessarily finite.

**E-expression**<br/>
See _encoding expression_.

**encoding directive**<br/>
In an Ion 1.1 segment, a top-level S-expression annotated with `$ion_encoding`.
Defines a new encoding module for the segment immediately following it.
At the end of the encoding directive, the new _encoding module_ is promoted to be the _active encoding module_.
The _symbol table directive_ is effectively a less capable alternative syntax.

**encoding environment**<br/>
The context-specific data maintained by an Ion implementation while encoding or decoding data. In
Ion 1.0 this consists of the current symbol table; in Ion 1.1 this is expanded to also include the Ion
spec version, the current macro table, and a collection of available modules.

**encoding expression**<br/>
The invocation of a macro in encoded data, also known as an _E-expression_.
Starts with a macro reference denoting the macro to invoke.
The Ion text format uses "smile syntax" `(:macro ...)` to denote E-expressions.
Ion binary devotes a large number of opcodes to E-expressions, so they can be compact.

**encoding module**<br/>
A _module_ whose symbol table and macro table can be used directly in the user data stream.

**expression**<br/>
A serialized syntax element that may produce values.
_Encoding expressions_ and values are both considered expressions, whereas NOP, comments, and IVMs, for example, are not.

**expression group**<br/>
A grouping of zero or more _expressions_ that together form one _argument_.
The concrete syntax for passing a stream of expressions to a macro parameter.
In a text _E-expression_, a group starts with the trigraph `(::` and ends with `)`, similar to an S-expression.
In _template definition language_, a group is written as an S-expression starting with `..` (two dots).

**inner module**<br/>
A _module_ that is defined inside another module and only visible inside the definition of that module.

**Ion version marker**<br/>
A keyword directive that denotes the start of a new segment encoded with a specific Ion version.
Also known as "IVM".

**macro**<br/>
A transformation function that accepts some number of streams of values, and produces a stream of values.

**macro definition**<br/>
Specifies a macro in terms of a _signature_ and a _template_.

**macro reference**<br/>
Identifies a macro for invocation or exporting. Must always be unambiguous. Lexically
scoped, and never a "forward reference" to a macro that is declared later in the document.

**module**<br/>
The data entity that defines and exports both symbols and macros.

**opcode**<br/>
A 1-byte, unsigned integer that tells the reader what the next expression represents
and how the bytes that follow should be interpreted.

**optional parameter**<br/>
A parameter that can have its corresponding subform(s) omitted when the macro is invoked.
A parameter is optional if it is _voidable_ and all following parameters are also voidable.

**parameter**<br/>
A named input to a macro, as defined by its signature.
At expansion time a parameter produces a stream of values.

**qualified macro reference**<br/>
A macro reference that consists of a module name and either a macro name exported by that module,
or a numeric address within the range of the module's exported macro table. In TDL, these look
like _module-name_::_name-or-address_.

**required parameter**<br/>
A macro parameter that is not _optional_ and therefore requires an argument at each invocation.

**rest parameter**<br/>
A macro parameter—always the final parameter—declared with `*` or `+` cardinality,
that accepts all remaining individual arguments to the macro as if they were in an implicit _expression group_.
Similar to "varargs" parameters in Java and other languages.

**segment**<br/>
A contiguous partition of a _document_ that uses the same _active encoding module_. Segment boundaries
are caused by directives: an IVM starts a new segment, while `$ion_symbol_table` and `$ion_encoding`
Contributor


An IVM ends the previous segment (if any) and starts a new one, correct?

directives end segments (with a new one starting immediately afterward).

**shared module**<br/>
A module that exists independently of the data stream of an Ion document. It is identified by a
name and version so that it can be imported by other modules.

**signature**<br/>
The part of a macro definition that specifies its "calling convention", in terms of the shape,
type, and cardinality of arguments it accepts, and the type and cardinality of the results it
produces.

**symbol table directive**<br/>
A top-level struct annotated with `$ion_symbol_table`. Defines a new encoding environment
without any macros. Valid in Ion 1.0 and 1.1.

**system E-expression**<br/>
An _E-expression_ that invokes a _macro_ from the _system module_ rather than from the _active encoding module_.

**system macro**<br/>
A macro provided by the Ion implementation via the system module `$ion`.
System macros are available at all points within Ion 1.1 segments.

**system module**<br/>
A standard module named `$ion` that is provided by the Ion implementation, implicitly installed so
that the system symbols and system macros are available at all points within a document.
Subsumes the functionality of the Ion 1.0 system symbol table.

**system symbol**<br/>
A symbol provided by the Ion implementation via the system module `$ion`.
System symbols are available at all points within an Ion document, though the selection of symbols
varies by segment according to its Ion version.

**TDL**<br/>
See _template definition language_.

**template**<br/>
The part of a macro definition that expresses its transformation of inputs to results.

**template definition language**<br/>
An Ion-based, domain-specific language that declaratively specifies the output produced by a _macro_.

**unqualified macro reference**<br/>
A macro reference that consists of either a macro name or numeric address, without a qualifying module name.
These are resolved using lexical scope and must always be unambiguous.

**variable expansion**<br/>
In _TDL_, a special form that causes the expanded _arguments_ for the given _parameter_ to be substituted into the _template_.
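
The _cardinality_ and _optional parameter_ entries above can be illustrated with a minimal Python sketch. It assumes that a parameter is _voidable_ exactly when its cardinality accepts zero values (`?` or `*`); the glossary does not define that term, so treat this as an assumption. The names `CARDINALITIES`, `is_voidable`, and `optional_flags` are hypothetical and not part of any Ion library.

```python
from typing import List

# Modifier -> (minimum values, maximum values or None for unbounded),
# following the four cardinalities listed in the glossary.
CARDINALITIES = {
    "?": (0, 1),     # zero-or-one
    "!": (1, 1),     # exactly-one
    "*": (0, None),  # zero-or-more
    "+": (1, None),  # one-or-more
}


def is_voidable(modifier: str) -> bool:
    """Assumption: a parameter is voidable when it may receive zero values."""
    minimum, _ = CARDINALITIES[modifier]
    return minimum == 0


def optional_flags(modifiers: List[str]) -> List[bool]:
    """For each parameter, report whether it is optional: it must be
    voidable, and every parameter after it must be voidable too."""
    flags = []
    all_later_voidable = True
    # Walk from the last parameter to the first so that each step already
    # knows whether everything to its right is voidable.
    for modifier in reversed(modifiers):
        all_later_voidable = all_later_voidable and is_voidable(modifier)
        flags.append(all_later_voidable)
    flags.reverse()
    return flags
```

For a signature whose parameters carry the modifiers `?`, `!`, `?`, only the last parameter is optional under this reading, because the first one is followed by a parameter that is not voidable.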
114 changes: 114 additions & 0 deletions _books/ion-1-1/src/grammar.md
@@ -0,0 +1,114 @@
# Grammar

This chapter presents Ion 1.1's _domain grammar_, by which we mean the grammar of the domain
of values that drive Ion's encoding features.

We use a BNF-like notation for describing various syntactic parts of a document, including Ion data structures.
In such cases, the BNF should be interpreted loosely to accommodate Ion-isms like commas and unconstrained ordering of struct fields.

### Documents
```bnf
document ::= ivm? segment*

ivm ::= '$ion_1_0' | '$ion_1_1'

segment ::= value* directive?

directive ::= ivm
| encoding-directive
| symtab-directive

symtab-directive ::= local-symbol-table ; As per the Ion 1.0 specification¹

encoding-directive ::= '$ion_encoding::(' module-body ')'
```

&nbsp;&nbsp;&nbsp;&nbsp;¹[Symbols – Local Symbol Tables](https://amazon-ion.github.io/ion-docs/docs/symbols.html#local-symbol-tables).
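
As a rough illustration of `segment ::= value* directive?`, the Python sketch below groups an already-tokenized stream of top-level items into segments. The `(kind, text)` tuples and the kind labels `"ivm"`, `"directive"`, and `"value"` are hypothetical stand-ins for what a real reader would produce. Because the grammar lets a leading IVM match either `ivm?` or a segment with no values, the sketch simply treats every directive, IVMs included, as the end of the segment it appears in.

```python
from typing import List, Tuple

Item = Tuple[str, str]  # (kind, text); kind is "ivm", "directive", or "value"


def split_segments(items: List[Item]) -> List[List[Item]]:
    """Group top-level items so that each group matches `value* directive?`."""
    segments: List[List[Item]] = []
    current: List[Item] = []
    for kind, text in items:
        current.append((kind, text))
        # Any non-value item is a directive, and a directive closes the
        # segment it appears in; the next item starts a new segment.
        if kind != "value":
            segments.append(current)
            current = []
    # Trailing values with no closing directive form the final segment.
    if current:
        segments.append(current)
    return segments
```

Applied to an IVM, two values, an encoding directive, and one more value, this yields three groups: the IVM alone, the two values with the directive that ends their segment, and the final value.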

### Modules
```bnf
module-body ::= import* inner-module* symbol-table? macro-table?

shared-module ::= '$ion_shared_module::' ivm '::(' catalog-key module-body ')'

import ::= '(import' module-name catalog-key ')'

catalog-key ::= catalog-name catalog-version?

catalog-name ::= string

catalog-version ::= unannotated-uint ; must be positive

inner-module ::= '(module' module-name module-body ')'

module-name ::= unannotated-identifier-symbol

symbol-table ::= '(symbol_table' symbol-table-entry* ')'

symbol-table-entry ::= module-name | symbol-list

symbol-list ::= '[' symbol-text* ']'

symbol-text ::= symbol | string

macro-table ::= '(macro_table' macro-table-entry* ')'

macro-table-entry ::= macro-definition
| macro-export
| module-name

macro-export ::= '(export' qualified-macro-ref macro-name-declaration? ')'
```
### Macro references
```bnf
qualified-macro-ref ::= module-name '::' macro-ref

macro-ref ::= macro-name | macro-addr

qualified-macro-name ::= module-name '::' macro-name

macro-name ::= unannotated-identifier-symbol

macro-addr ::= unannotated-uint
```
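
The macro reference productions can be exercised with a small Python sketch that splits a reference written as plain text into its optional module name and its name or address. This is only an illustration: `MacroRef` and `parse_macro_ref` are hypothetical names, and the identifier pattern is a simplification that does not reject symbol IDs such as `$3` or keywords.

```python
import re
from typing import NamedTuple, Optional, Union


class MacroRef(NamedTuple):
    module: Optional[str]    # None for an unqualified reference
    target: Union[str, int]  # macro name, or numeric address


# Simplified identifier shape (assumption): letters, digits, `$`, `_`,
# not starting with a digit.
_IDENT = r"[A-Za-z_$][A-Za-z0-9_$]*"
_REF = re.compile(rf"(?:({_IDENT})::)?(?:({_IDENT})|([0-9]+))")


def parse_macro_ref(text: str) -> MacroRef:
    """Parse `module-name '::' macro-ref` or a bare `macro-ref`."""
    match = _REF.fullmatch(text)
    if match is None:
        raise ValueError(f"not a macro reference: {text!r}")
    module, name, addr = match.groups()
    # Exactly one of `name` and `addr` is set when the pattern matches.
    return MacroRef(module, name if name is not None else int(addr))
```

A reference with a module prefix comes back with `module` set, while a bare name or address comes back with `module` as `None`, mirroring the qualified and unqualified cases in the grammar.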

### Macro definitions
```bnf
macro-definition ::= '(macro' macro-name-declaration signature tdl-expression ')'

macro-name-declaration ::= macro-name | 'null'

signature ::= '(' parameter* ')'

parameter ::= parameter-encoding? parameter-name parameter-cardinality?

parameter-encoding ::= (primitive-encoding-type | macro-name | qualified-macro-name)'::'

primitive-encoding-type ::= 'uint8' | 'uint16' | 'uint32' | 'uint64'
| 'int8' | 'int16' | 'int32' | 'int64'
| 'float16' | 'float32' | 'float64'
| 'flex_int' | 'flex_uint'
| 'flex_sym' | 'flex_string'

parameter-name ::= unannotated-identifier-symbol

parameter-cardinality ::= '!' | '*' | '?' | '+'

tdl-expression ::= operation | variable-expansion | ion-scalar | ion-container

operation ::= macro-invocation | special-form

variable-expansion ::= '(%' variable-name ')'

variable-name ::= unannotated-identifier-symbol

macro-invocation ::= '(.' macro-ref macro-arg* ')'

special-form ::= '(.' '$ion::'? special-form-name tdl-expression* ')'

special-form-name ::= 'for' | 'if_none' | 'if_some' | 'if_single' | 'if_multi'

macro-arg ::= tdl-expression | expression-group

expression-group ::= '(..' tdl-expression* ')'
```
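
To make the shape of the `tdl-expression` productions concrete, here is a hedged Python sketch that classifies an S-expression by its leading operator. It assumes the S-expression has already been read into nested Python lists whose operators (`.`, `%`, `..`) and names appear as strings; that representation, the `classify` function, and the macro name `greet` used in the example are hypothetical. The grammar alone does not say how a user macro named like a special form would be distinguished, so the sketch gives special form names priority.

```python
from typing import Any

SPECIAL_FORMS = {"for", "if_none", "if_some", "if_single", "if_multi"}
SYSTEM_PREFIX = "$ion::"


def classify(expr: Any) -> str:
    """Name the grammar production that a parsed expression matches."""
    if not isinstance(expr, list) or not expr:
        return "ion-scalar-or-container"
    head = expr[0]
    if head == "%" and len(expr) == 2:
        return "variable-expansion"    # '(%' variable-name ')'
    if head == "..":
        return "expression-group"      # '(..' tdl-expression* ')'
    if head == "." and len(expr) >= 2:
        target = expr[1]
        if isinstance(target, str):
            # special-form allows an optional '$ion::' qualifier.
            name = target
            if name.startswith(SYSTEM_PREFIX):
                name = name[len(SYSTEM_PREFIX):]
            if name in SPECIAL_FORMS:
                return "special-form"
        return "macro-invocation"      # '(.' macro-ref macro-arg* ')'
    # Anything else is ordinary Ion data inside the template.
    return "ion-scalar-or-container"
```

For example, `classify([".", "greet", ["..", "a", "b"]])` reports a macro invocation, and classifying its last element on its own reports an expression group. Per the grammar, an expression group is only valid as a `macro-arg`, which this sketch does not enforce.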
41 changes: 6 additions & 35 deletions _books/ion-1-1/src/macros/defining_macros.md
@@ -167,7 +167,7 @@ Syntactically, the signature is an s-expression of [parameter declarations](#mac

### Template definition language (TDL)

The macro's _template_ is a single Ion value that defines how a reader should expand invovations of the macro.
The macro's _template_ is a single Ion value that defines how a reader should expand invocations of the macro.
Ion 1.1 introduces a template definition language (TDL) to express this process in terms of the macro's parameters.
TDL is a small language with only a few constructs.

@@ -209,23 +209,24 @@ $ion_encoding::(
#### Macro invocations

Macro invocations call an existing macro.
The invoked macro could be a [system macro](system_macros.md), a macro imported from a [shared module](../todo.md), or a macro previously defined in the current scope.
The invoked macro could be a [system macro](system_macros.md), a macro imported from a
[shared module](../modules/shared_modules.md), or a macro previously defined in the current scope.

Syntactically, a macro invocation is an s-expression whose first value is the operator `.` and whose second value is a macro reference.

##### Grammar
```bnf
macro-invocation ::= '(.' macro-ref macro-arg* ')',
macro-invocation ::= '(.' macro-ref macro-arg* ')'

macro-ref ::= (module-name '::')? (macro-name | macro-address)

macro-arg ::= expression | arg-group
macro-arg ::= expression | expression-group

macro-name ::= ion-identifier

macro-address ::= unsigned-ion-integer

arg-group ::= '(::' expression* ')'
expression-group ::= '(..' expression* ')'
```

##### Invocation syntax illustration
@@ -393,33 +394,3 @@ Special forms are similar to macro invocations, but they have their own expansio
See [_Special forms_](special_forms.md) for the list of special forms and a description of each.

Note that unlike macro expansions, special forms cannot accept argument groups.

#### TDL Grammar
```bnf
expression ::= ion-scalar | ion-ql-container | operation | variable-expansion

ion-scalar ::= ; <Any Ion scalar value>

ion-ql-container ::= ; <An Ion container quasi-literal>

operation ::= macro-invocation | special-form

variable-expansion ::= '(%' variable-name ')'

variable-name ::= ion-identifier

macro-invocation ::= '(.' macro-ref macro-arg* ')'

special-form ::= '(.' ('$ion::')? special-form-name expression* ')'

macro-ref ::= (module-name '::')? (macro-name | macro-address)

macro-arg ::= expression | arg-group

macro-name ::= ion-identifier

macro-address ::= ion-unsigned-integer

arg-group ::= '(::' expression* ')'
```

15 changes: 2 additions & 13 deletions _books/ion-1-1/src/modules.md
@@ -94,16 +94,5 @@ $ion_encoding::(

Many of the grammatical elements used to define modules and macros are _identifiers_--symbols that do not require quotation marks.

More explicitly, an identifier is a sequence of one or more ASCII letters, digits, or the characters `$` (dollar sign) or `_` (underscore), not starting with a digit. It also cannot be of the form `$\d+`, which is the syntax for symbol IDs. (For example: `$3`, `$10`, `$458`, etc.)

```bnf
identifier ::= identifier-start identifier-char*

identifier-start ::= letter
| '_'
| '$' letter
| '$_'
| '$$'

identifier-char ::= letter | digit | '$' | '_'
```
More explicitly, an identifier is a sequence of one or more ASCII letters, digits, or the characters `$` (dollar sign) or `_` (underscore), not starting with a digit.
It also cannot be of the form `$\d+`, which is the syntax for symbol IDs (for example: `$3`, `$10`, `$458`, etc.), nor can it be a keyword (`true`, `false`, `null`, or `nan`).
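
The prose rule above can be written down as a short Python check. This is a sketch of the wording in this paragraph only, not of any Ion library; `is_identifier` and the module-level names are hypothetical.

```python
import re

# One or more ASCII letters, digits, `$`, or `_`, not starting with a digit.
_SHAPE = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")
# `$` followed only by digits is symbol ID syntax, not an identifier.
_SYMBOL_ID = re.compile(r"\$[0-9]+")
_KEYWORDS = {"true", "false", "null", "nan"}


def is_identifier(text: str) -> bool:
    """Return True if `text` satisfies the identifier rule as worded above."""
    if _SHAPE.fullmatch(text) is None:
        return False
    if _SYMBOL_ID.fullmatch(text) is not None:
        return False
    return text not in _KEYWORDS
```

Under this reading, `$ion` and `_x1` are identifiers, while `$3`, `3x`, and `null` are not.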