Optional Semicolons #447

Open
ScottFreeCode opened this issue Jun 3, 2023 · 5 comments

Comments

@ScottFreeCode

Is it possible to create a syntax in LBNF in which, given a statement can have parts <one> and <two>, the following are all legal syntax?

<one>
<two>;
<one> <two>; <one> <two>;
<one> <two>
<one> <two>
<one>
<two>
<one>
<two>

Unlike with layout, I do not have keywords at which open/close brackets can be inserted, and I do not necessarily want tab/alignment to be required for correct interpretation.

I simply want the parser to say, when it encounters a new line, something like:

  1. Is this a complete statement?
  2. And is the following non-empty line a complete statement?
  3. If yes, treat them as two statements.
  4. If no, combine them into one as you normally would if the newline were whitespace, and evaluate from there.

(Similar to, say, JavaScript's "semicolon insertion" rule. Semicolons are generally required only where the beginning of a statement could also be interpreted as the continuation of the previous statement, or to write statements on one line. But Haskell's or Python's whitespace-based layout is not used. Just a rule that a line break can end a complete statement. Maybe similar to shell script?)
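
The four steps above can be sketched outside of BNFC as a small statement splitter. This is only an illustration under stated assumptions, not something LBNF is known to support: `split_statements`, `toy_is_complete`, and `toy_can_start` are hypothetical names, and step 2 is relaxed from "the next line is a complete statement" to "the next line could begin a statement", because in the examples a statement may itself span several lines.

```python
from typing import Callable, List

Tokens = List[str]


def tokenize_line(line: str) -> Tokens:
    # Assumption: tokens are whitespace-separated words and ";" is always its own token.
    return line.replace(";", " ; ").split()


def split_statements(
    source: str,
    is_complete: Callable[[Tokens], bool],
    can_start: Callable[[Tokens], bool],
) -> List[Tokens]:
    """Split source into statements using ';' or a statement-ending line break."""
    # Blank lines are dropped, so "next line" always means the next non-empty line.
    lines = [tokenize_line(ln) for ln in source.splitlines() if ln.strip()]
    statements: List[Tokens] = []
    current: Tokens = []
    for i, tokens in enumerate(lines):
        for tok in tokens:
            if tok == ";":
                # An explicit semicolon must close a complete statement.
                if not is_complete(current):
                    raise SyntaxError("';' after an incomplete statement")
                statements.append(current)
                current = []
            else:
                current.append(tok)
        # At the line break: end the statement only if it is complete (step 1)
        # and the next non-empty line could begin a new one (relaxed step 2).
        next_line = lines[i + 1] if i + 1 < len(lines) else None
        if current and is_complete(current) and (next_line is None or can_start(next_line)):
            statements.append(current)
            current = []
        # Otherwise the line break is treated as ordinary whitespace (step 4).
    if current:
        raise SyntaxError("incomplete statement at end of input")
    return statements


# Toy predicates for a grammar whose only statement is: hello world
def toy_is_complete(tokens: Tokens) -> bool:
    return tokens == ["hello", "world"]


def toy_can_start(tokens: Tokens) -> bool:
    return tokens[:1] == ["hello"]
```

With these toy predicates, `hello`, `world`, `hello`, `world` on four separate lines splits into two statements, while `hello world hello world` on a single line raises `SyntaxError`, because no semicolon or line break separates the two statements.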

I tried this grammar:

entrypoints [Statement];

--Statements . Statements ::= [Statement];

Statement . Statement ::= "hello" "world" OptionalSemicolon;

Semicolon . OptionalSemicolon ::= ";";

Blank . OptionalSemicolon ::= "\n";

terminator Statement "";

Which successfully makes semicolons optional! BUT it doesn't require a newline in the absence of a semicolon. It acts as though I had written "" instead of "\n". So e.g. this parses (which it should not, it should require a semicolon or a newline):

hello world hello world

And this:

hello
world
hello
world

…which should parse and then print back as this:

hello world
hello world

…does parse but instead prints back as this:

hello world hello world

(If I were to, say, add a semicolon at the end of either of these examples, it would still parse and either would print hello world hello world ; – The semicolon seems to work fine when it is present.)
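
The observed behavior is what one would expect from any lexer that discards all whitespace, newlines included, before the parser runs. This thread does not confirm that this is how BNFC's generated lexer treats the `"\n"` terminal, so the sketch below is only an illustration of the general effect; `skip_all_whitespace` and `keep_newlines` are hypothetical helper names, not BNFC functions.

```python
import re
from typing import List


def skip_all_whitespace(source: str) -> List[str]:
    # Every run of whitespace (spaces, tabs, newlines) is only a separator.
    return source.split()


def keep_newlines(source: str) -> List[str]:
    # A newline becomes a token of its own; other whitespace is still skipped.
    return re.findall(r"\n|;|[^\s;]+", source)


one_line = "hello world hello world"
four_lines = "hello\nworld\nhello\nworld"

# With whitespace skipped, the parser receives identical token streams,
# so it cannot accept one input and reject the other.
assert skip_all_whitespace(one_line) == skip_all_whitespace(four_lines)

# With newlines kept as tokens, the two inputs become distinguishable.
assert keep_newlines(one_line) != keep_newlines(four_lines)
```

A grammar could require a newline between statements only if the token stream looks like the second form.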

I tried this modification (replacing the grammar line beginning with Blank):

Blank . OptionalSemicolon ::= Newline;

token Newline '\n';

But the effect is that the semicolon is required.

@ScottFreeCode
Author

This is clearly not a dealbreaker, I could go with mandatory semicolons or figure out a way to use layout even if that seems opinionated.

But the fact that the "\n" seems to be getting translated into the same thing as "" (i.e. some whitespace separation is required, but it can be any whitespace and prints as a single space), rather than either:

  • actually requiring and printing the specified character, or else
  • choking and saying "Don't include whitespace in your tokens!!"

…seems like a bug. Having the character I explicitly specified be accepted but treated as other characters is definitely unexpected (or at least unintuitive) behavior.

@andreasabel
Member

There is work in progress by @beataburreau on a new implementation of BNFC where one has a newline special token.

@praduca

praduca commented Jul 29, 2023

I think there is a bug somewhere in how semicolons are handled... I'm trying to make a tinybasic grammar, but when I use a semicolon as a separator (like "PRINT A$;B$") it parses OK, but the pretty-printer puts every part on a different line. Changing the separator to a comma works fine...

@andreasabel
Member

@praduca: There is some hard-wiring in the render function of the generated printer that is biased towards "braces and semicolon" style languages. If you want some other rendering, you have to patch the generated printer.
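
As a rough illustration of that kind of hard-wiring (a hypothetical Python sketch, not the code BNFC generates), a renderer that always breaks the line after ";" produces the reported output, and making the set of line-breaking tokens configurable is the sort of patch meant here. The name `render` and its `newline_after` parameter are assumptions made for this example.

```python
from typing import Iterable, List


def render(tokens: List[str], newline_after: Iterable[str] = (";",)) -> str:
    """Join tokens with spaces, starting a new line after certain tokens."""
    breakers = set(newline_after)
    lines: List[str] = []
    current: List[str] = []
    for tok in tokens:
        current.append(tok)
        if tok in breakers:
            # Behavior suited to "braces and semicolon" languages: ";" ends the line.
            lines.append(" ".join(current))
            current = []
    if current:
        lines.append(" ".join(current))
    return "\n".join(lines)


tokens = ["PRINT", "A$", ";", "B$"]
default_output = render(tokens)                  # breaks the line after ";"
patched_output = render(tokens, newline_after=())  # keeps everything on one line
```

Here `default_output` spans two lines, while `patched_output` is the single line `PRINT A$ ; B$`.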

@praduca

praduca commented Jul 30, 2023

Ah good to know it is something simple. Thanks for commenting so quickly
