This repository has been archived by the owner on Nov 18, 2021. It is now read-only.

doc/ref: spec changes to comprehensions and identifiers
This CL contains two somewhat related proposals, although they can
be split:

1) Make fields with string labels non-referable and allow
backtick-quoted identifiers.
So
  a.`for`
would be a valid selector.
This simplifies the spec a bit as there is no explanation needed why interpolated strings cannot be referenced.
Using backticks has precedent in BCL and Swift.

It solves the issue of code generation, where it can be hard to
track referred values and it allows referencing keywords and other fields alike.

2) Now that we have embedding, there is a nicer way to write
comprehensions. The proposed change solves many
issues.

Change-Id: I8ba25bd3a6b9a9d790dcecbd3b5954a969440396
Reviewed-on: https://cue-review.googlesource.com/c/cue/+/2950
Reviewed-by: Marcel van Lohuizen <[email protected]>
mpvl committed Sep 9, 2019
1 parent 061bde1 commit 4017875
Showing 1 changed file with 86 additions and 62 deletions.
148 changes: 86 additions & 62 deletions doc/ref/spec.md
@@ -179,9 +179,16 @@ these rules.
### Identifiers

Identifiers name entities such as fields and aliases.
An identifier is a sequence of one or more letters and digits.
Identifiers may be simple or quoted.
A simple identifier is a sequence of one or more letters (which includes `_`) and digits.
It may not be `_`.
The first character in an identifier must be a letter.
Any sequence of letters, digits or `-` enclosed in
backticks "`" makes an identifier.
The backticks are not part of the identifier.
This allows one to refer to fields that are labeled
with keywords or other identifiers that would
otherwise not be legal.

<!--
TODO: allow identifiers as defined in Unicode UAX #31
@@ -191,8 +198,11 @@ Identifiers are normalized using the NFC normal form.
-->

```
identifier = letter { letter | unicode_digit } .
identifier = simple_identifier | quoted_identifier .
simple_identifier = letter { letter | unicode_digit } .
quoted_identifier = "`" { letter | unicode_digit | "-" } "`" .
```
<!-- TODO: relax to allow other punctuation -->
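
The two identifier forms in the grammar above can be checked mechanically. The following Go program is a minimal sketch, not the CUE implementation: the function names `isLetter`, `isSimpleIdentifier` and `quotedName` are hypothetical, and it assumes that `letter` means a Unicode letter or `_` and that `unicode_digit` means Unicode category Nd. It also rejects the empty quoted form, which is an assumption, since the grammar as written would permit it.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// isLetter reports whether r counts as a letter in this sketch:
// a Unicode letter or the underscore (assumption, see lead-in).
func isLetter(r rune) bool {
	return r == '_' || unicode.IsLetter(r)
}

// isSimpleIdentifier checks: letter { letter | unicode_digit },
// with the single underscore "_" excluded.
func isSimpleIdentifier(s string) bool {
	if s == "" || s == "_" {
		return false
	}
	for i, r := range s {
		if isLetter(r) {
			continue
		}
		if i > 0 && unicode.IsDigit(r) {
			continue
		}
		return false
	}
	return true
}

// quotedName returns the identifier named by a backtick-quoted form and
// whether s is such a form. The backticks are not part of the identifier.
func quotedName(s string) (string, bool) {
	if len(s) < 3 || !strings.HasPrefix(s, "`") || !strings.HasSuffix(s, "`") {
		return "", false
	}
	name := s[1 : len(s)-1]
	for _, r := range name {
		if !isLetter(r) && !unicode.IsDigit(r) && r != '-' {
			return "", false
		}
	}
	return name, true
}

func main() {
	for _, s := range []string{"a", "_x9", "_", "9a", "`for`", "`foo-bar`", "`a b`"} {
		name, quoted := quotedName(s)
		fmt.Printf("%-12q simple=%-5v quoted=%-5v name=%q\n",
			s, isSimpleIdentifier(s), quoted, name)
	}
}
```

Under these assumptions, `` `for` `` and `` `foo-bar` `` are accepted only in quoted form, which is the case the backtick syntax is meant to cover.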

```
a
@@ -209,6 +219,10 @@ Some identifiers are [predeclared](#predeclared-identifiers).
### Keywords

CUE has a limited set of keywords.
In addition, CUE reserves all identifiers starting with `__` (double underscores)
as keywords.
These are typically targets of pre-declared identifiers.

All keywords may be used as labels (field names).
They cannot, however, be used as identifiers to refer to the same name.

@@ -244,7 +258,7 @@ for in if let
```

The keywords `for`, `if` and `let` cannot be used as identifiers to
refer to fields. All others can.
refer to fields.

<!--
TODO:
@@ -269,13 +283,22 @@ These may be used as identifiers to refer to fields in all other contexts.
The following character sequences represent operators and punctuation:

```
+ div && == < . ( )
- mod || != > : { }
* quo & =~ <= = [ ]
/ rem | !~ >= <- ... ,
_|_ ! ;
+ div && == < = ( )
- mod || != > :: { }
* quo & =~ <= : [ ]
/ rem | !~ >= . ... ,
_|_ !
```
<!-- :: for "is-a" definitions -->
<!--
Free tokens: # ; ~ $ ^
// To be used:
@ at: associative lists.
// Idea: use # instead of @ for attributes and allow them at declaration level.
// This will open up the possibility of defining #! at the start of a file
// without requiring special syntax. Although probably not quite.
-->


### Integer literals
@@ -1029,6 +1052,8 @@ question mark `?`.
The question mark is not part of the field name.
Concrete field labels may be an identifier or string, the latter of which may be
interpolated.
Fields with identifier labels can be referred to within the scope in which they are
defined; fields with string labels cannot.
References within such interpolated strings are resolved within
the scope of the struct in which the label sequence is
defined and can reference concrete labels lexically preceding
@@ -1151,7 +1176,7 @@ AliasDecl = Label "=" Expression .
BindLabel = "<" identifier ">" .
ConcreteLabel = identifier | simple_string_lit .
ExpressionLabel = BindLabel
Label = ConcreteLabel [ "?" ] | ExpressionLabel "?".
Label = ConcreteLabel [ "?" ] | ExpressionLabel .
<!-- (jba) According to this grammar, I must write a "?" after a bind label, so
"<Name>: name" is illegal.
@@ -1214,15 +1239,10 @@ A1: A & {

A _closed struct_ `c` is a struct whose instances may not have regular fields
not defined in `c`.

Closing a struct is equivalent to adding an optional field with value `_|_`
For the purpose of unification,
closing a struct is equivalent to adding an optional field with value `_|_`
for all undefined fields.

Note that fields created with field comprehensions are not considered
defined fields.
Fields inserted by a field comprehension defined in a closed struct
are only permitted when defined explicitly by a required or optional field.

Syntactically, closed structs can be explicitly created with the `close` builtin
or implicitly by [definitions](#Definitions).
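
The effect of closing a struct on unification can be modeled in a few lines. The Go program below is a rough sketch under stated assumptions, not how CUE evaluates structs: `closedStruct` and `checkFields` are hypothetical names, the field names `field1` and `field2` are made up for illustration, and the model ignores bind labels such as `<_>: _`, which would admit any field.

```go
package main

import "fmt"

// closedStruct is a hypothetical model: the set of field names that a
// closed struct defines, whether required or optional.
type closedStruct map[string]bool

// checkFields reports the first field in other that the closed struct does
// not define. Unifying such a field with the implied optional field of
// value _|_ yields bottom, which is modeled here as an error.
func (c closedStruct) checkFields(other []string) error {
	for _, name := range other {
		if !c[name] {
			return fmt.Errorf("field %q not defined", name)
		}
	}
	return nil
}

func main() {
	a := closedStruct{"field1": true, "field2": true}
	fmt.Println(a.checkFields([]string{"field1"})) // <nil>
	fmt.Println(a.checkFields([]string{"feild1"})) // field "feild1" not defined
}
```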

@@ -1234,23 +1254,30 @@ A: close({
})
A1: A & {
feild1: string // _|_ feild1 not defined for A
}
feild1: string
} // _|_ feild1 not defined for A
A2: A & {
k: v for k,v in { feild1: string } // _|_ feild1 not defined for A
}
for k,v in { feild1: string } {
k: v
}
} // _|_ feild1 not defined for A
C: close({
<_>: _
})
C2: C & {
"\(k)": v for k,v in { thisIsFine: string }
for k,v in { thisIsFine: string } {
"\(k)": v
}
}
D: close({
"\(k)": v for k,v in { x: string } // _|_ field "x" not defined
// Values generated by comprehensions are treated as embeddings.
for k,v in { x: string } {
"\(k)": v
}
})
```

@@ -1584,13 +1611,14 @@ Blocks nest and influence [scoping].

### Declarations and scope

A _declaration_ binds an identifier to a field, alias, or package.
A _declaration_ may bind an identifier to a field, alias, or package.
Every identifier in a program must be declared.
Other than for fields,
no identifier may be declared twice within the same block.
For fields an identifier may be declared more than once within the same block,
resulting in a field with a value that is the result of unifying the values
of all fields with the same identifier.
String labels do not bind an identifier to the respective field.

```
TopLevelDecl = Declaration | Emit .
@@ -1622,6 +1650,10 @@ and to specify the default name for import declarations.

### Predeclared identifiers

CUE predefines a set of types and builtin functions.
For each of these there is a corresponding keyword which is the name
of the predefined identifier, prefixed with `__`.

```
Functions
len required close open
@@ -1660,23 +1692,11 @@ float64 >=-1.797693134862315708145274237317043567981e+308 &

An identifier of a package may be exported to permit access to it
from another package.
<!-- TODO: remove hidden fields by replacing the following with this text.
An identifier is exported if
the first character of the identifier's name is a Unicode upper case letter
(Unicode class "Lu"); and
the identifier is declared in the file block.
All other top-level identifiers used for fields are not exported.
-->
An identifier is exported if both:
the first character of the identifier's name is not a Unicode lower case letter
(Unicode class "Ll") or the underscore "_"; and
the identifier is declared in the file block.
All other identifiers are not exported.

An identifier that starts with the underscore "_" is not
emitted in any data output and treated as a definition for that purpose.
Quoted labels that start with an underscore are emitted, however.
<!-- END REPLACE -->

In addition, any definition declared anywhere within a package of which
the first character of the identifier's name is a Unicode upper case letter
@@ -1715,15 +1735,20 @@ Otherwise, they are the same.

### Field declarations

A field declaration binds a label (the name of the field) to an expression.
The name for a quoted string used as label is the string it represents.
The name for an identifier used as a label is the identifier itself.
<!-- TODO: replace the remainder of this paragraph with the following
Quoted strings and identifiers can be used interchangeably.
-->
Quoted strings and identifiers can be used interchangeably, with the
exception of identifiers starting with an underscore '_'.
The latter represent hidden fields and are treated in a different namespace.
A field associates the value of an expression to a label within a struct.
If this label is an identifier, it binds the field to that identifier,
so the field's value can be referenced by writing the identifier.
String labels are not bound to fields.
```
a: {
b: 2
"s": 3
c: b // 2
d: s // _|_ unresolved identifier "s"
e: a.s // 3
}
```
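
The difference between a bare reference and a selector in the example above can be modeled as two separate lookups. This Go program is a small sketch, not CUE's resolver: the types `field` and `structValue` and the methods `resolve` and `selectField` are hypothetical, and only integer values are modeled.

```go
package main

import "fmt"

// field is a hypothetical model of one struct field.
type field struct {
	value      int
	identLabel bool // true for `b: 2`, false for `"s": 3`
}

// structValue maps field names to fields, however the label was written.
type structValue map[string]field

// resolve models a bare reference such as `b` or `s` written inside the
// struct: only fields with identifier labels are bound to an identifier.
func (s structValue) resolve(name string) (int, error) {
	if f, ok := s[name]; ok && f.identLabel {
		return f.value, nil
	}
	return 0, fmt.Errorf("unresolved identifier %q", name)
}

// selectField models a selector such as `a.s`: the field is looked up by
// name regardless of how its label was written.
func (s structValue) selectField(name string) (int, error) {
	if f, ok := s[name]; ok {
		return f.value, nil
	}
	return 0, fmt.Errorf("undefined field %q", name)
}

func main() {
	a := structValue{
		"b": {value: 2, identLabel: true},
		"s": {value: 3, identLabel: false},
	}
	fmt.Println(a.resolve("b"))     // 2 <nil>
	fmt.Println(a.resolve("s"))     // 0 unresolved identifier "s"
	fmt.Println(a.selectField("s")) // 3 <nil>
}
```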

If an expression may result in a value associated with a default value
as described in [default values](#default-values), the field binds to this
@@ -1749,7 +1774,7 @@ An alias declaration binds an identifier to the given expression.

Within the scope of the identifier, it serves as an _alias_ for that
expression.
The expression is evaluated in the scope as it was declared.
The expression is evaluated in the scope in which it was declared.


## Expressions
@@ -2562,22 +2587,15 @@ sequence is reached.
_List comprehensions_ specify a single expression that is evaluated and included
in the list for each completed iteration.

_Field comprehensions_ follow a `Field` with a clause sequence, where the
label and value of the field are evaluated for each iteration.
The label must be an identifier or simple_string_lit, where the
latter may be a string interpolation that refers to the identifiers defined
in the clauses.
Values of iterations that map to the same label unify into a single field.
_Field comprehensions_ follow a clause sequence with a struct literal,
where the struct literal is evaluated and embedded at the point of
declaration of the comprehension for each complete iteration.
As usual, fields in the struct may evaluate to the same label,
resulting in the unification of their values.

<!--
TODO: consider allowing multiple labels for comprehensions
(current implementation). Generally it is better to define comprehensions
in the current scope, though, as it may prevent surprises given the
restrictions on comprehensions.
-->
```
ComprehensionDecl = Label ":" Expression [ "<-" ] Clauses .
ListComprehension = "[" Expression [ "<-" ] Clauses "]" .
ComprehensionDecl = Clauses StructLit .
ListComprehension = "[" Expression Clauses "]" .
Clauses = Clause { Clause } .
Clause = ForClause | GuardClause | LetClause .
@@ -2590,7 +2608,13 @@ LetClause = "let" identifier "=" Expression .
a: [1, 2, 3, 4]
b: [ x+1 for x in a if x > 1] // [3, 4, 5]
c: { "\(x)": x + y for x in a if x < 4 let y = 1 }
c: {
for x in a
if x < 4
let y = 1 {
"\(x)": x + y
}
}
d: { "1": 2, "2": 3, "3": 4 }
```
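
The clause sequence in the example above reads like nested loop control: `for` iterates, `if` filters, `let` binds, and the struct literal is embedded once per completed iteration. The Go program below is a sketch of that reading for the values shown, not CUE's evaluator. The helper names `embed`, `listComprehension` and `fieldComprehension` are hypothetical, and unification is reduced to the assumption that a concrete integer unifies only with an equal integer.

```go
package main

import (
	"fmt"
	"strconv"
)

// embed adds one field to the enclosing struct. In this sketch a concrete
// integer unifies only with itself, so a different value for an existing
// label is reported as a conflict.
func embed(dst map[string]int, label string, value int) error {
	if old, ok := dst[label]; ok && old != value {
		return fmt.Errorf("conflicting values %d and %d for field %q", old, value, label)
	}
	dst[label] = value
	return nil
}

// listComprehension mirrors: b: [ x+1 for x in a if x > 1 ]
func listComprehension(a []int) []int {
	var out []int
	for _, x := range a { // for x in a
		if x > 1 { // if x > 1
			out = append(out, x+1)
		}
	}
	return out
}

// fieldComprehension mirrors:
//   c: { for x in a if x < 4 let y = 1 { "\(x)": x + y } }
func fieldComprehension(a []int) (map[string]int, error) {
	c := map[string]int{}
	for _, x := range a { // for x in a
		if x >= 4 { // guard: if x < 4
			continue
		}
		y := 1 // let y = 1
		// Embed the struct literal { "\(x)": x + y } into c.
		if err := embed(c, strconv.Itoa(x), x+y); err != nil {
			return nil, err
		}
	}
	return c, nil
}

func main() {
	a := []int{1, 2, 3, 4}
	fmt.Println(listComprehension(a)) // [3 4 5]
	c, err := fieldComprehension(a)
	fmt.Println(c, err) // map[1:2 2:3 3:4] <nil>
}
```

The resulting map has the same fields as `d` in the example, which is the equivalence the example is meant to show.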

@@ -2904,7 +2928,7 @@ The import names an identifier (PackageName) to be used for access and an
ImportPath that specifies the package to be imported.

```
ImportDecl = "import" ( ImportSpec | "(" { ImportSpec ";" } ")" ) .
ImportDecl = "import" ( ImportSpec | "(" { ImportSpec "," } ")" ) .
ImportSpec = [ PackageName ] ImportPath .
ImportLocation = { unicode_value } .
ImportPath = `"` ImportLocation [ ":" identifier ] `"` .
