Parsley 5 API Consolidation #190

Open
32 of 34 tasks
j-mie6 opened this issue May 25, 2023 · 1 comment · May be fixed by #223
Labels
enhancement New feature or request major This change would affect break backwards compatibility

@j-mie6
Owner

j-mie6 commented May 25, 2023

Parsley 5 will provide an opportunity to address some of the pain points of the API once again. This list may evolve over time.

In addition to #172, #184, and #168:

Major Changes

  • Remove parsley.io._, and instead move the implicit class within object Parsley. This will go with an extends PlatformSpecific trick to allow for the io stuff to be enabled for jvm and native but not js, without requiring any imports. Rename parseFromFile to parse.
  • Flatten the Lexer hierarchy somewhat: the plain objects aren't really buying us much other than indirection: numeric, text etc can just go I think.
  • I want to rename attempt to atomic (after seeking feedback on this from other experts and users), since I think it makes much more sense as a name and clears up some misconceptions: this might even be done pre-5, but with the binary compatible change.
  • Remove >>=, getOrElse, join: >>= is provided by parsley-cats anyway, and is not something to encourage (same with join); and getOrElse is a misleading name, so this is going.
  • Remove attemptChoice, I don't want to encourage that in the API.
  • Remove TokenSpan from the API, and change LexToken to use offset instead of position, wayyyyyy simpler.
  • traverse to be curried, with function as second set of brackets
  • Change semantics of .hide so that it totally suppresses the errors underneath: Multiple labels with .label? #193 and Multiple labels in errors #198 have already laid the groundwork for this to happen, since label("") is no longer legal. Instead .hide should be its own combinator independent of label itself -- this is an outdated notion now. Additionally, clean up the various overloads: it should be a single label combinator requiring at least one label, all non-empty.
  • verifiedUnexpected -> verifiedExplain, with the argumentless variant remaining as is: new verifiedUnexpected variants to be introduced in 5.1?
  • genericbridges renamed to generic, we don't know what else might end up there, and the name is a bit nicer.
  • Americanise the Specialised error to Specialized
  • The LexToken can swap to the simpler definition from thesis, and TokenSpan can be removed.
  • many, some, and eof to move to Parsley -- they are definitely core, and I want to cut down on imports (this is in tandem with Generalising many and some #141)
  • remove more and between, they just aren't useful.
  • Minor adjustments to register extension classes.
  • If full result elimination is performed in dropped positions, we might actually want to consider removing the skip variants of the combinators, including skipMany, skipSome, skip, range_; but not forP, as this optimisation cannot recognise "dead registers".
  • Also, sequence should be given an extra argument, same with traverse. exactly should have an n > 1 invariant.
  • manyUntil and someUntil are unusual names, they should be called manyTill and someTill for consistency with all the other parsecs! (this change will affect parsley-cats). Rename count and count1 to countMany and countSome.
  • Rework implicits packages/object into syntax? Gives us better "parity" with the typelevel style.
  • rename ifP, guard, when, and whileP to all use an S postfix
  • Change Integer, Real, String, Character to all have a Parsers suffix: it'll be better for clashes with Scala and important for gigaparsec.
  • Remove multiMap and singleMap in the Lexer config, they can just be merged into a mapping instead?
  • Turn predicate into a package, as opposed to an object.
  • Add additional parameter to Ops et al
  • Simplify the error config hierarchy so that it actually documents properly: no more hidden classes with companion objects! (where possible)
  • Remove parsley.implicits.combinator and move parsley.extension as parsley.syntax.extension
  • Rework the lexer config (and error config?) so that they use the forwards compatible case class mechanism. This way we can evolve it as we want, and if we ever reach a final state, we can unprivate the members and release it in full.
  • remove atomicChoice.
  • reinstate Failure as a proper case class again
  • add line info for the context lines in ErrorBuilder
  • make softKeyword have the maximal munch behaviour, like softOperator.
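The "forwards compatible case class mechanism" mentioned for the lexer config can be read in more than one way. A minimal sketch of one common reading follows: a final class with a private constructor, a default value in the companion, and a copy method whose arguments are all defaulted. SpaceConfig, its fields, and plain are hypothetical names chosen for illustration and are not Parsley's actual configuration classes.

```scala
// Hypothetical config class: an illustration of the pattern, not Parsley's API.
final class SpaceConfig private (
    val lineComment: String,
    val nestedComments: Boolean
) {
  // Every argument is defaulted to the current value, so callers name only the
  // fields they change. Adding a field later (with a default here) keeps
  // existing call sites compiling.
  def copy(
      lineComment: String = this.lineComment,
      nestedComments: Boolean = this.nestedComments
  ): SpaceConfig = new SpaceConfig(lineComment, nestedComments)

  override def toString: String = s"SpaceConfig($lineComment, $nestedComments)"
}

object SpaceConfig {
  // The only public way to obtain a value: start from defaults, refine with copy.
  val plain: SpaceConfig = new SpaceConfig(lineComment = "", nestedComments = false)
}

object ConfigDemo {
  def main(args: Array[String]): Unit = {
    val cfg = SpaceConfig.plain.copy(lineComment = "//")
    println(cfg)
  }
}
```

Because there is no public constructor and no generated unapply, user code cannot depend on the number or order of fields, which is what leaves room for the config to evolve.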

Minor Changes

  • Restore the partial amend semantics of at least filters. I don't think we want them on verified and prevent though. Note that the deprecated constructors in token.errors will be reinstated
  • Consider making a Scala 3 parsley "prelude" that allows for import parsley.quickstart.* or something to cut down on the parsley.Parsley, Parsley.*, parsley.character.*, parsley.combinator.* import pattern.
  • We can also build in an .all object in there to export out the zipped and lift syntaxes, perhaps?
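One way a Scala 3 "prelude" could be put together is with export clauses. The sketch below is self-contained and does not depend on Parsley: core and chars are hypothetical stand-ins for Parsley.* and parsley.character.*, and quickstart is the name floated above rather than an existing object.

```scala
// Scala 3 only: export clauses do not exist in Scala 2.
// `core` and `chars` are hypothetical stand-ins for Parsley.* and parsley.character.*.
object core {
  def pure[A](x: A): Option[A] = Some(x)
}

object chars {
  def isDigitChar(c: Char): Boolean = c.isDigit
}

// The hypothetical prelude: a single object that re-exports the commonly used members.
object quickstart {
  export core.*
  export chars.*
}

@main def preludeDemo(): Unit = {
  // One import replaces the separate imports of `core` and `chars`.
  import quickstart.*
  println(pure(1).map(_ + 1))
  println(isDigitChar('7'))
}
```

An .all object for the zipped and lift syntaxes could presumably be built the same way, as a nested object with its own export clauses.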
@j-mie6 j-mie6 added enhancement New feature or request major This change would affect break backwards compatibility labels May 25, 2023
@j-mie6 j-mie6 added this to the Parsley 5 milestone May 25, 2023
@j-mie6
Owner Author

j-mie6 commented Jul 15, 2023

I also want to deal with the tab situation properly.

Here's a sketch of an implementation that might work better:

  • expose parsley.PositionConfig, with def startLine: Int, def startCol: Int, def tabCol(col: Int): Int.
    Then, this could also expose a def updatePos(c: Char, pos: PosState): Unit (and one for codepoints, for Better Unicode Support for Position Updates #129?) with a default implementation using tabCol, checking for newline etc. This could be overridden by a user who wants the extra control over what characters do what.
  • PosState is an abstract class at the parsley level, but is sealed within parsley (abstract class PosState private [parsley]): this way we control the sub-classes, and the user doesn't know what they are. This exposes def getCol, def setCol, def getLine, and def setLine methods: these are called by updatePos. The Context implements this itself.
  • To aid JIT, we need to keep the dispatches to updatePos hot. This may involve keeping a helper in Context that forwards to the config, which then bounces back. We absolutely want JIT to specialise this to whatever the user has come up with.

Hopefully, this design gives maximal control to the user, and minimal overhead to the internals. There will be a new implicit parameter added to parse, which is the PositionConfig.
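The shape of this design can be sketched in plain Scala. The member names (startLine, startCol, tabCol, updatePos, getCol, setCol, getLine, setLine) come from the list above; everything else is an assumption made for illustration: the enclosing sketch object stands in for the parsley package, DefaultConfig assumes 1-based positions with tab stops every 4 columns, and SimpleState and positionAfter are hypothetical helpers standing in for the Context.

```scala
object sketch {
  // The constructor is private to `sketch`, standing in for `private [parsley]`:
  // only code inside this object can define subclasses.
  abstract class PosState private[sketch] () {
    def getLine: Int
    def setLine(line: Int): Unit
    def getCol: Int
    def setCol(col: Int): Unit
  }

  trait PositionConfig {
    def startLine: Int
    def startCol: Int
    // Column reached after a tab is read at column `col`.
    def tabCol(col: Int): Int
    // Default update; a user wanting finer control could override this.
    def updatePos(c: Char, pos: PosState): Unit = c match {
      case '\n' => pos.setLine(pos.getLine + 1); pos.setCol(startCol)
      case '\t' => pos.setCol(tabCol(pos.getCol))
      case _    => pos.setCol(pos.getCol + 1)
    }
  }

  // Assumed defaults: 1-based positions, tab stops every 4 columns.
  object DefaultConfig extends PositionConfig {
    val startLine: Int = 1
    val startCol: Int = 1
    def tabCol(col: Int): Int = col + 4 - ((col - 1) % 4)
  }

  // Hypothetical stand-in for the Context, which would implement PosState itself.
  final class SimpleState(config: PositionConfig) extends PosState {
    private var line = config.startLine
    private var col = config.startCol
    def getLine: Int = line
    def setLine(l: Int): Unit = line = l
    def getCol: Int = col
    def setCol(c: Int): Unit = col = c
  }

  // Feeds every character through the config and reports the final (line, col).
  def positionAfter(input: String, config: PositionConfig): (Int, Int) = {
    val st = new SimpleState(config)
    input.foreach(c => config.updatePos(c, st))
    (st.getLine, st.getCol)
  }
}
```

With these assumed defaults, sketch.positionAfter("ab\tc\nd", sketch.DefaultConfig) ends on line 2, column 2. The sketch does not address the JIT concern: it only shows how the user-facing config and the internally controlled state could fit together.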

@j-mie6 j-mie6 linked a pull request Jan 12, 2024 that will close this issue
36 tasks