Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplifying parsing - eliminating EOI #162

Open
ZachJHansen opened this issue Nov 22, 2024 · 3 comments
Open

Simplifying parsing - eliminating EOI #162

ZachJHansen opened this issue Nov 22, 2024 · 3 comments
Assignees
Labels
A-parsing Area: Parsing C-enhancement Category: Enhancement E-medium Experience: Medium L-asp Language: Answer Set Programming L-fol Language: First-order logic P-low Priority: Low

Comments

@ZachJHansen
Copy link
Collaborator

Currently the grammar and parsing relies on "something_eoi" rules to parse "something." This could be simplified by adding "~ !ANY" to the end of "something" rules and eliminating the EOI rules. See the definition of keywords in the FOL grammar for an example.

@ZachJHansen ZachJHansen added P-low Priority: Low E-medium Experience: Medium C-enhancement Category: Enhancement A-parsing Area: Parsing L-asp Language: Answer Set Programming L-fol Language: First-order logic labels Nov 22, 2024
@janheuer
Copy link
Collaborator

janheuer commented Dec 5, 2024

I have tried this approach a bit now, only for the ASP grammar, see the branch jan/parsing_eoi. Here are my findings:

  • I have kept the EOI rules for now but just started with replacing EOI in them with !ANY
  • This causes all ASP parsing test to fail
  • I think the call pairs.next_back() in parsing/mod.rs line 51 needs to be removed then, as there is no EOI anymore and the comment in that line states that the purpose is to remove EOI
  • Then almost all parsing test for ASP pass, only parse_program fails
  • For parse_program it fails because of the test case "a.\n". Here the trailing newline is not parsed anymore.
  • If the program rule still uses EOI instead of !ANY this would work again if the pairs.next_back() is added again which of course causes all the component tests to fail again. I'm not sure if it would be possible to only have this call when parsing a program?

@teiesti
Copy link
Collaborator

teiesti commented Dec 6, 2024

Your remarks make me confident that we can actually implement this. Nice work.

I think the call pairs.next_back() in parsing/mod.rs line 51 needs to be removed then, as there is no EOI anymore and the comment in that line states that the purpose is to remove EOI

FYI, this was the last piece I was missing. Good catch.

Then almost all parsing test for ASP pass, only parse_program fails

Once you integrate

program = { rule* }
program_eoi = _{ program ~ !ANY }

into

program = { rule* ~ !ANY }

and adjust the rule used in the ProgramParser, the problem vanishes. I think the reason for this behavior is hidden somewhere in here. Maybe you can think about this. I have pushed my changes to tobias/parsing_eoi so that you have the details.

I have kept the EOI rules for now but just started with replacing EOI in them with !ANY

Can you finish all the chances for the ASP and FOL parsing and prepare a pull request? This includes removing all the EOI rules.

@janheuer
Copy link
Collaborator

janheuer commented Jan 9, 2025

I was able to remove the EOI rules for all the top-level rules, i.e. program, theory, specification and user guide. However I can not remove them for all the other rules. As it is right now we have e.g. a rule for formula and then the formula_eoi rule. The formula rule is the one that is then used to define theories and merging formula_eoi and formula breaks the test for program as now theory is

{ (formula ~ !ANY ~ ".")* ~ !ANY }

as opposed to what we have with the separate rules

{ (formula ~ ".")* ~ !ANY }

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-parsing Area: Parsing C-enhancement Category: Enhancement E-medium Experience: Medium L-asp Language: Answer Set Programming L-fol Language: First-order logic P-low Priority: Low
Projects
None yet
Development

No branches or pull requests

3 participants