From f5de6204f1be8adfa3f8813e25c6b4170b31eed3 Mon Sep 17 00:00:00 2001
From: Jamie Willis
Date: Sun, 24 Dec 2023 17:01:44 +0000
Subject: [PATCH] error combinator page done

---
 docs/api-guide/errors/combinator.md | 229 +++++++++++++++++++++++++++-
 1 file changed, 228 insertions(+), 1 deletion(-)

diff --git a/docs/api-guide/errors/combinator.md b/docs/api-guide/errors/combinator.md
index 55a53e561..9135e839a 100644
--- a/docs/api-guide/errors/combinator.md
+++ b/docs/api-guide/errors/combinator.md
@@ -1,6 +1,233 @@
 {%
 laika.title = "`combinator`"
-laika.excludeFromNavigation = true
 %}
 
 # Error Message Combinators
+Aside from the failures generated by character consumption, `parsley` has
+many combinators both for generating failures unconditionally and for
+augmenting existing errors with more information. These are found within
+the `parsley.errors.combinator` module.
+
+@:callout(info)
+*The Scaladoc for this page can be found at [`parsley.errors.combinator`](@:api(parsley.errors.combinator$)).*
+@:@
+
+## Failure Combinators
+Normally, failures can be generated by `empty`, `satisfy`, `string`, and
+`notFollowedBy`, as well as their derivatives. However, those do not capture the
+full variety of "unexpected" parts of error messages. In the table below, `empty`
+corresponds to `empty(0)` (these are both found in `parsley.Parsley`). The
+*named* items are produced by `unexpected` combinators, and wider carets of
+*empty* items can be obtained by passing wider values to `empty`.
+
+| Caret | *empty*    | *raw/eof* | *named*            |
+|-------|------------|-----------|--------------------|
+| `0`   | `empty(0)` | n/a       | `unexpected(0, _)` |
+| `1`   | `empty(1)` | `satisfy` | `unexpected(1, _)` |
+| `n`   | `empty(n)` | `string`  | `unexpected(n, _)` |
+
+### The `unexpected` Combinator
+The `unexpected` combinator fails immediately, but produces a given name as
+the unexpected component of the error message with a caret as wide as the
+given integer. For instance:
+
+```scala mdoc:to-string
+import parsley.character.char
+import parsley.errors.combinator.unexpected
+
+unexpected(3, "foo").parse("abcd")
+(char('a') | unexpected("not an a")).parse("baa")
+```
+
+There are a few things to note about the above examples:
+
+* Just using `unexpected` alone does not introduce any other components, like
+  expected items, to the error
+* When the caret width is unspecified, it will adapt to whatever the
+  caret would have been for the error message
+* The *named* items resulting from the combinator *dominate* other kinds of
+  item, so that the raw unexpected `b` that `char('a')` would naturally report disappears
+
+### The `fail` Combinator
+In contrast to the `unexpected` combinator, which produces *vanilla* errors, the
+`fail` combinator produces *specialised* errors, which suppress all other
+components of an error in favour of some specific messages.
+
+```scala mdoc:to-string
+import parsley.character.string
+import parsley.errors.combinator.fail
+
+fail(2, "msg1", "msg2", "msg3").parse("abc")
+(fail(1, "msg1") | fail(2, "msg2") | fail("msg3")).parse("abc")
+(fail("msg") | string("abc")).parse("xyz")
+(fail(1, "msg") | string("abc")).parse("xyz")
+```
+
+Notice that if a caret width is specified, it will override any other
+carets from other combinators, like `string`. When no caret width is specified,
+the caret is adaptive. The `fail` combinator also suppresses other error messages,
+and several `fail`s merge as if all the messages were generated by one
+`fail`.
+
+## Error Enrichment
+Other than the freestanding combinators, some combinators are enabled
+by importing `parsley.errors.combinator.ErrorMethods`. Some of these
+are involved with augmenting error messages with additional information.
+These are discussed below.
+
+@:callout(info)
+None of the combinators in this section have any effect on `fail` or its
+derivatives.
+@:@
+
+### The `label` Combinator
+When combinators that read characters fail, they produce "expected" components
+in error messages:
+
+```scala mdoc:to-string
+import parsley.character.{char, string, satisfy}
+
+char('a').parse("b")
+string("abc").parse("xyz")
+satisfy(_.isDigit).parse("a")
+```
+
+Notice that the `satisfy` combinator cannot produce an expected item because
+nothing is known about the function passed in. The other two produce *raw*
+expected items. The `label` combinator can be used to replace these and generate
+*named* items. This is employed by `parsley.character` for its more specific
+parsers:
+
+```scala mdoc:to-string
+import parsley.errors.combinator.ErrorMethods
+
+val digit = satisfy(_.isDigit).label("digit")
+digit.parse("a")
+```
+
+The `label` combinator above has added the label `digit` to the parser. If
+there had been an existing label, it would have been replaced.
+
+@:callout(error)
+A `label` combinator cannot be provided with `""`. In other libraries, this may
+represent hiding; in `parsley`, however, the `hide` combinator is distinct.
+@:@
+
+A `label` combinator, along with other combinators, only applies if the
+error message properly lines up with the point the input was at when it
+entered the combinator; otherwise, the label may be inaccurate. For example:
+
+```scala mdoc:to-string
+val twoDigits = (digit *> digit).label("two digits")
+twoDigits.parse("a")
+twoDigits.parse("1a")
+```
+
+### The `explain` Combinator
+The `explain` combinator allows for the addition of further lines of error
+message, providing more high-level reasons for the error or explanations about
+a syntactic construct. It behaves similarly to `label` in that it will only
+apply when the position of the error message matches the offset at which the combinator was entered.
+
+```scala mdoc:to-string
+import parsley.errors.combinator.ErrorMethods
+
+digit.explain("a digit is needed, for some reason").parse("a")
+```
+
+@:callout(error)
+An `explain` combinator cannot be provided with `""`.
+@:@
+
+### The `hide` Combinator
+Sometimes, a parser should not appear in an error message. A good example is
+whitespace, which is *almost* never the solution to any parsing problem, and
+would otherwise distract from the rest of the error content. The `hide` combinator
+can be used to suppress a parser from appearing in the rest of a message:
+
+```scala mdoc:to-string
+import parsley.errors.combinator.ErrorMethods
+
+(char('a') | digit.hide).parse("b")
+```
+
+## Error Adjustment Combinators
+The previous combinators in this page have been geared towards adding
+richer information to the parse errors. The combinators in this section, however, are used to
+adjust the existing information, mostly relating to position, to ensure the
+error remains specific.
+
+### The `amend` Combinator
+The `amend` combinator can adjust the position of an error message so that it
+occurs at an earlier position. This means that the error can then be affected by
+combinators like `label` and `explain`. This is a precision tool, designed
+for fine-tuning error messages.
+
+```scala mdoc:to-string
+import parsley.errors.combinator.amend
+
+amend(digit *> char('a')).parse("9b")
+```
+
+Notice that the above error makes no sense. This is why `amend` is a precision
+tool: it should ideally be used in conjunction with other combinators. For instance:
+
+```scala mdoc:silent
+import parsley.implicits.character.charLift
+import parsley.combinator.choice
+import parsley.character.{noneOf, stringOfMany}
+
+val escapeChar = choice('n'.as('\n'), 't'.as('\t'), '\"', '\\')
+val strLetter = noneOf('\"', '\\').label("string char") | ('\\' ~> escapeChar).label("escape char")
+val strLit = '\"' ~> stringOfMany(strLetter) <~ '\"'
+```
+```scala mdoc:to-string
+strLit.parse("\"\\b\"")
+```
+
+In the above error, it is not *entirely* clear why the presented characters
+are expected. Perhaps it would be better to highlight a correct escape
+character instead? The `amend` combinator can be used in this case to pull
+the error back and rectify it:
+
+```scala mdoc:silent:nest
+val strLetter = noneOf('\"', '\\').label("string char") |
+    amend('\\' ~> escapeChar).label("escape char")
+```
+```scala mdoc:invisible
+val strLit = '\"' ~> stringOfMany(strLetter) <~ '\"'
+```
+```scala mdoc:to-string
+strLit.parse("\"\\b\"")
+```
+
+While the `amend` has pulled the error back, and thanks to the `label` the
+error is still sensible, it could be improved by widening the caret and
+providing an explanation:
+
+```scala mdoc:silent:nest
+import parsley.Parsley.empty
+val escapeChar = choice('n'.as('\n'), 't'.as('\t'), '\"', '\\') | empty(2)
+val strLetter = noneOf('\"', '\\').label("string char") |
+    amend('\\' ~> escapeChar).label("escape char")
+        .explain("escape characters are \\n, \\t, \\\", or \\\\")
+```
+```scala mdoc:invisible
+val strLit = '\"' ~> stringOfMany(strLetter) <~ '\"'
+```
+```scala mdoc:to-string
+strLit.parse("\"\\b\"")
+```
+
+Note that `unexpected` could also have been used instead of `empty` to good effect.
+
+### The `entrench` and `dislodge` Combinators
+The `amend` combinator will indiscriminately adjust error messages
+so that they occur earlier. However, sometimes only errors from some
+parts of a parser should be repositioned. The `entrench` combinator
+protects errors from within its scope from being amended, and
+`dislodge` undoes that protection.
+
+### The `markAsToken` Combinator
+The `markAsToken` combinator assigns the "lexical" property to any error messages that occur within its scope at a *deeper* position than the one the combinator
+began at. This property is fed forward to the `unexpectedToken` method of the `ErrorBuilder`; more about this in [lexical extraction][Token Extraction in `ErrorBuilder`].
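
As a minimal sketch of the protection described for `entrench`, the two parsers below differ only in whether the second component is entrenched. This assumes `entrench` can be imported from `parsley.errors.combinator` as a freestanding function in the same way as `amend`, and that `parsley.character.digit` is available; the names `unprotected` and `guarded` are hypothetical and exist purely for illustration. The rendered errors are deliberately not shown.

```scala
import parsley.character.{char, digit}
import parsley.errors.combinator.{amend, entrench}

// Hypothetical parser names, for illustration only.
// Any error raised inside the amend is repositioned to where the amend was entered.
val unprotected = amend(char('a') *> digit)

// The digit's errors are entrenched, so the enclosing amend is expected
// to leave their position untouched.
val guarded = amend(char('a') *> entrench(digit))

// Both parsers fail on the second character of "ab";
// only the position reported in the error is expected to differ.
val amended = unprotected.parse("ab")
val entrenched = guarded.parse("ab")
```

Wrapping only `digit`, rather than the whole sequence, keeps errors from `char('a')` eligible for amendment while leaving the deeper error where it occurred.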