Difference in LALR Grammar Parsing #1489

Open
Crayon112 opened this issue Nov 16, 2024 · 0 comments

Description

I am new to LALR parsing and am trying to implement the following grammar:

A: "a"
B: "b"
a: b A
b: (A | B)+

However, when I try to parse the input string aaaba, I encounter the following error:

lark.exceptions.UnexpectedToken: Unexpected token Token('$END', '') at line 1, column 5.
Expected one of:
        * A
        * B

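Rule b appears to match greedily here: it consumes every A and B token through the end of the input, so nothing is left for the trailing A that rule a requires. Setting a higher priority on rule a did not change this.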

Proposed Solution

Moving the repetition out of rule b and into rule a makes the grammar work:

A: "a"
B: "b"
a: b+ A
b: A | B

With this change, the input string aaaba parses without error.

Questions

What is the key difference between the two grammar definitions?

  • a: b+ A and b: A | B (modified, works)
  • a: b A and b: (A | B)+ (original, fails)

How can I avoid the greedy matching in the original grammar while keeping the repetition inside rule b? Any insights or guidance on this issue would be greatly appreciated. Thank you!
