Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,233 @@ | ||
{% | ||
laika.title = "`combinator`" | ||
laika.excludeFromNavigation = true | ||
%} | ||
|
||
# Error Message Combinators | ||
Aside from the failures generated by character consumption, `parsley` has | ||
many combinators both for generating failures unconditionally and for | ||
augmenting existing errors with more information. These are found within | ||
the `parsley.errors.combinator` module. | ||
|
||
@:callout(info) | ||
*The Scaladoc for this page can be found at [`parsley.errors.combinator`](@:api(parsley.errors.combinator$)).* | ||
@:@ | ||
|
||
## Failure Combinators | ||
Normally, failures can be generated by `empty`, `satisfy`, `string`, and | ||
`notFollowedBy`, as well as their derivatives. However, those do not capture the | ||
full variety of "unexpected" parts of error messages. The table below summarises | ||
the options: `empty` corresponds to `empty(0)` (both are found in | ||
`parsley.Parsley`), the *named* items are produced by `unexpected` combinators, | ||
and wider carets of *empty* items can be obtained by passing wider values to `empty`. | ||
|
||
| Caret | *empty* | *raw/eof* | *named* | | ||
|-------|------------|-----------|--------------------| | ||
| `0` | `empty(0)` | n/a | `unexpected(0, _)` | | ||
| `1` | `empty(1)` | `satisfy` | `unexpected(1, _)` | | ||
| `n` | `empty(n)` | `string` | `unexpected(n, _)` | | ||
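
As a rough illustration of the table, the sketch below triggers one failure per column. It assumes a `parsley` version in which `empty` and `unexpected` accept a caret width, as this page describes; the value names and the `"thing"` label are purely illustrative.

```scala
import parsley.Parsley.empty
import parsley.character.{satisfy, string}
import parsley.errors.combinator.unexpected

// *empty* column: no unexpected item at all, caret two characters wide
val emptyWide = empty(2).parse("abc")

// *raw/eof* column: the unexpected item is taken from the input itself
val rawOne  = satisfy(_.isDigit).parse("abc") // caret of width 1
val rawMany = string("xyz").parse("abc")      // caret as wide as the string

// *named* column: the unexpected item is the supplied name (hypothetical here)
val named = unexpected(2, "thing").parse("abc")
```

All four results are failures; only the unexpected component and caret width of the rendered message differ.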
|
||
### The `unexpected` Combinator | ||
The `unexpected` combinator fails immediately, but produces a given name as | ||
the unexpected component of the error message with a caret as wide as the | ||
given integer. For instance: | ||
|
||
```scala mdoc:to-string | ||
import parsley.character.char | ||
import parsley.errors.combinator.unexpected | ||
|
||
unexpected(3, "foo").parse("abcd") | ||
(char('a') | unexpected("not an a")).parse("baa") | ||
``` | ||
|
||
There are a few things to note about the above examples: | ||
|
||
* Just using `unexpected` alone does not introduce any other components, like | ||
expected items, to the error | ||
* When the caret width is unspecified, it will adapt to whatever the | ||
caret would have been for the error message | ||
* The *named* items resulting from the combinator *dominate* other kinds of | ||
item, so that the raw unexpected `b` that `char('a')` would naturally report disappears | ||
|
||
### The `fail` Combinator | ||
In contrast to the `unexpected` combinator, which produces *vanilla* errors, the | ||
`fail` combinator produces *specialised* errors, which suppress all other | ||
components of an error in favour of some specific messages. | ||
|
||
```scala mdoc:to-string | ||
import parsley.character.string | ||
import parsley.errors.combinator.fail | ||
|
||
fail(2, "msg1", "msg2", "msg3").parse("abc") | ||
(fail(1, "msg1") | fail(2, "msg2") | fail("msg3")).parse("abc") | ||
(fail("msg") | string("abc")).parse("xyz") | ||
(fail(1, "msg") | string("abc")).parse("xyz") | ||
``` | ||
|
||
Notice that if a caret width is specified, it will override any other | ||
carets from other combinators, like `string`. When no caret width is | ||
specified, the caret is adaptive. The `fail` combinator also suppresses other | ||
error messages, and multiple `fail`s merge as if all of their messages were | ||
generated by one `fail`. | ||
|
||
## Error Enrichment | ||
Other than the freestanding combinators, some combinators are enabled | ||
by importing `parsley.errors.combinator.ErrorMethods`. Some of these | ||
are involved with augmenting error messages with additional information. | ||
These are discussed below. | ||
|
||
@:callout(info) | ||
None of the combinators in this section have any effect on `fail` or its | ||
derivatives. | ||
@:@ | ||
|
||
### The `label` Combinator | ||
When combinators that read characters fail, they produce "expected" components | ||
in error messages: | ||
|
||
```scala mdoc:to-string | ||
import parsley.character.{char, string, satisfy} | ||
|
||
char('a').parse("b") | ||
string("abc").parse("xyz") | ||
satisfy(_.isDigit).parse("a") | ||
``` | ||
|
||
Notice that the `satisfy` combinator cannot produce an expected item because | ||
nothing is known about the function passed in. The other two produce *raw* | ||
expected items. The `label` combinator can be used to replace these and generate | ||
*named* items. This is employed by `parsley.character` for its more specific | ||
parsers: | ||
|
||
```scala mdoc:to-string | ||
import parsley.errors.combinator.ErrorMethods | ||
|
||
val digit = satisfy(_.isDigit).label("digit") | ||
digit.parse("a") | ||
``` | ||
|
||
The `label` combinator above has added the label `digit` to the parser. If | ||
there was an existing label there, it would have been replaced. | ||
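
The replacement behaviour can be sketched as follows. This is a minimal example, not part of the original page; `number` and `relabelled` are illustrative names.

```scala
import parsley.character.satisfy
import parsley.errors.combinator.ErrorMethods

// The inner parser already carries the label "digit"
val digit = satisfy(_.isDigit).label("digit")

// Applying a second label replaces the first rather than adding to it,
// so a failure at this position should mention "number" instead of "digit"
val number = digit.label("number")

val relabelled = number.parse("a")
```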
|
||
@:callout(error) | ||
A `label` combinator cannot be provided with `""`. In other libraries, this may | ||
represent hiding; in `parsley`, however, the `hide` combinator is distinct. | ||
@:@ | ||
|
||
A `label` combinator, along with other combinators, only applies if the | ||
error message properly lines up with the point the input was at when it | ||
entered the combinator - otherwise, the label may be inaccurate. For example: | ||
|
||
```scala mdoc:to-string | ||
val twoDigits = (digit *> digit).label("two digits") | ||
twoDigits.parse("a") | ||
twoDigits.parse("1a") | ||
``` | ||
|
||
### The `explain` Combinator | ||
The `explain` combinator allows for the addition of further lines of error | ||
message, providing more high-level reasons for the error or explanations about | ||
a syntactic construct. It behaves similarly to `label` in that it will only | ||
apply when the position of the error message matches the offset that the combinator entered at. | ||
|
||
```scala mdoc:to-string | ||
import parsley.errors.combinator.ErrorMethods | ||
|
||
digit.explain("a digit is needed, for some reason").parse("a") | ||
``` | ||
|
||
@:callout(error) | ||
An `explain` combinator cannot be provided with `""`. | ||
@:@ | ||
|
||
### The `hide` Combinator | ||
Sometimes, a parser should not appear in an error message. A good example is | ||
whitespace, which is *almost* never the solution to any parsing problem, and | ||
would otherwise distract from the rest of the error content. The `hide` combinator | ||
can be used to suppress a parser from appearing in the rest of a message: | ||
|
||
```scala mdoc:to-string | ||
import parsley.errors.combinator.ErrorMethods | ||
|
||
(char('a') | digit.hide).parse("b") | ||
``` | ||
|
||
## Error Adjustment Combinators | ||
The previous combinators on this page have been geared towards adding | ||
richer information to parse errors. The combinators in this section, however, | ||
are used to adjust the existing information, mostly relating to position, to | ||
ensure the error remains specific. | ||
|
||
### The `amend` Combinator | ||
The `amend` combinator can adjust the position of an error message so that it | ||
occurs at an earlier position. This means that it can be affected by other | ||
combinators like `label` and `explain`. This is a precision tool, designed | ||
for fine-tuning error messages. | ||
|
||
```scala mdoc:to-string | ||
import parsley.errors.combinator.amend | ||
|
||
amend(digit *> char('a')).parse("9b") | ||
``` | ||
|
||
Notice that the above error makes no sense. This is why `amend` is a precision | ||
tool: it should ideally be used in conjunction with other combinators. For instance: | ||
|
||
```scala mdoc:silent | ||
import parsley.implicits.character.charLift | ||
import parsley.combinator.choice | ||
import parsley.character.{noneOf, stringOfMany} | ||
|
||
val escapeChar = choice('n'.as('\n'), 't'.as('\t'), '\"', '\\') | ||
val strLetter = noneOf('\"', '\\').label("string char") | ('\\' ~> escapeChar).label("escape char") | ||
val strLit = '\"' ~> stringOfMany(strLetter) <~ '\"' | ||
``` | ||
```scala mdoc:to-string | ||
strLit.parse("\"\\b\"") | ||
``` | ||
|
||
In the above error, it is not *entirely* clear why the presented characters | ||
are expected. Perhaps it would be better to highlight a correct escape | ||
character instead? The `amend` combinator can be used in this case to pull | ||
the error back and rectify it: | ||
|
||
```scala mdoc:silent:nest | ||
val strLetter = noneOf('\"', '\\').label("string char") | | ||
amend('\\' ~> escapeChar).label("escape char") | ||
``` | ||
```scala mdoc:invisible | ||
val strLit = '\"' ~> stringOfMany(strLetter) <~ '\"' | ||
``` | ||
```scala mdoc:to-string | ||
strLit.parse("\"\\b\"") | ||
``` | ||
|
||
While the `amend` has pulled the error back, and thanks to the `label` the | ||
error is still sensible, it could be improved by widening the caret and | ||
providing an explanation: | ||
|
||
```scala mdoc:silent:nest | ||
import parsley.Parsley.empty | ||
val escapeChar = choice('n'.as('\n'), 't'.as('\t'), '\"', '\\') | empty(2) | ||
val strLetter = noneOf('\"', '\\').label("string char") | | ||
amend('\\' ~> escapeChar).label("escape char") | ||
.explain("escape characters are \\n, \\t, \\\", or \\\\") | ||
``` | ||
```scala mdoc:invisible | ||
val strLit = '\"' ~> stringOfMany(strLetter) <~ '\"' | ||
``` | ||
```scala mdoc:to-string | ||
strLit.parse("\"\\b\"") | ||
``` | ||
|
||
Note, an `unexpected` could also have been used instead of `empty` to good effect. | ||
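
A sketch of that alternative is given below, written without the implicit character lifting so that it stands alone. The name `"invalid escape sequence"` is an illustrative choice, not something the library prescribes, and the two-argument `unexpected` is assumed to be available as described earlier on this page.

```scala
import parsley.Parsley
import parsley.character.{char, noneOf, stringOfMany}
import parsley.combinator.choice
import parsley.errors.combinator.{amend, unexpected, ErrorMethods}

// As before, but the fallback names the offending input instead of leaving
// the unexpected component empty; the caret is still two characters wide
val escapeChar: Parsley[Char] =
  choice(char('n').as('\n'), char('t').as('\t'), char('"'), char('\\')) |
    unexpected(2, "invalid escape sequence")

val strLetter: Parsley[Char] =
  noneOf('"', '\\').label("string char") |
    amend(char('\\') ~> escapeChar).label("escape char")
      .explain("escape characters are \\n, \\t, \\\", or \\\\")

val strLit: Parsley[String] = char('"') ~> stringOfMany(strLetter) <~ char('"')

val badEscape  = strLit.parse("\"\\b\"")  // the input "\b" fails
val goodEscape = strLit.parse("\"a\\n\"") // the input "a\n" succeeds
```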
|
||
### The `entrench` and `dislodge` Combinators | ||
The `amend` combinator will indiscriminately adjust error messages | ||
so that they occur earlier. However, sometimes only errors from some | ||
parts of a parser should be repositioned. The `entrench` combinator | ||
protects errors from within its scope from being amended, and | ||
`dislodge` undoes that protection. | ||
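
The interaction between the three combinators might look like the sketch below. It assumes a `parsley` version that provides both `entrench` and `dislodge` in `parsley.errors.combinator`; the parser names are illustrative.

```scala
import parsley.character.{char, digit}
import parsley.errors.combinator.{amend, dislodge, entrench}

// Errors raised by the digit are entrenched, so the surrounding amend leaves
// their position alone; errors raised by char('a') can still be pulled back
val protectedDigit = amend(char('x') *> entrench(digit) *> char('a'))

// dislodge removes the protection again, so the outer amend is free to
// reposition errors that originated inside the entrench as well
val unprotected = amend(dislodge(protectedDigit))

val insideEntrench  = protectedDigit.parse("xb")  // fails within entrench(digit)
val outsideEntrench = protectedDigit.parse("x9b") // fails at char('a')
val dislodged       = unprotected.parse("xb")
```

Comparing the reported columns of `insideEntrench` and `dislodged` is one way to observe the difference in a given version.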
|
||
### The `markAsToken` Combinator | ||
The `markAsToken` combinator will assign the "lexical" property to any error messages that happen within its scope at a *deeper* position than the combinator | ||
began at. This is fed forward onto the `unexpectedToken` method of the `ErrorBuilder`: more about this in [lexical extraction][Token Extraction in `ErrorBuilder`]. |
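
A minimal sketch of its use follows; `letterThenDigit` is a hypothetical token-like parser chosen only for illustration. How the lexical property affects the rendered message depends on the `ErrorBuilder` in use, so no particular output is shown.

```scala
import parsley.character.{digit, letter}
import parsley.errors.combinator.markAsToken

// A token-like parser: a letter followed by a digit
val letterThenDigit = markAsToken(letter *> digit)

// Fails at the very position the combinator began at, so per the description
// above the error is not marked as lexical
val shallow = letterThenDigit.parse("!")

// Fails one character deeper than where the combinator began, so this error
// is the kind that receives the lexical property
val deeper = letterThenDigit.parse("a!")
```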