[Batch File] Refactor labels and label-like comments #4005

Merged
merged 4 commits into from
Aug 26, 2024


@deathaxe deathaxe commented Jul 9, 2024

This PR applies tokenization rules from https://stackoverflow.com/questions/4094699/how-does-the-windows-command-interpreter-cmd-exe-parse-scripts/4095133#4095133 to labels and common command arguments.

As a result, invalid labels are scoped "ignored" even without dedicated comment contexts, and with fewer special-casing patterns. Those contexts are nevertheless kept for now to maintain comment.line scopes.

Delimiter characters such as = are ignored, which fixes scoping of somewhat uncommon but valid labels like :=== label ===.
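
To make the delimiter rule above concrete, here is a minimal Python sketch. It is not part of this PR and not how the syntax patterns are written; the helper name `label_name` and the delimiter set (blanks plus `,`, `;`, `=`) are assumptions for illustration, and cmd.exe's real tokenizer applies more rules than this models.

```python
from typing import Optional

# Assumed delimiter set: blanks plus the characters named in this PR.
DELIMITERS = " \t,;="


def label_name(line: str) -> Optional[str]:
    """Return the label name of a ':label' line, or None if there is none."""
    stripped = line.lstrip(" \t")
    if not stripped.startswith(":"):
        return None
    # Delimiters directly after the colon are ignored,
    # e.g. the "=== " in ":=== label ===".
    rest = stripped[1:].lstrip(DELIMITERS)
    # A second colon ("::") is treated here as a label-like comment.
    if rest.startswith(":"):
        return None
    # The name runs up to the next delimiter character.
    name = ""
    for ch in rest:
        if ch in DELIMITERS:
            break
        name += ch
    return name or None


print(label_name(":=== label ==="))
print(label_name(":: just a comment"))
```

Under these assumptions, `:=== label ===` yields the name `label`, while a `::` line yields no label at all.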

Delimiter characters are scoped punctuation.separator unless they have another special meaning in the given context (e.g. = as keyword.operator.assignment).

The accuracy of various tokens, including some batch oddities, is improved so that highlighting better matches the interpreter's behavior.

deathaxe added 4 commits July 9, 2024 21:12
This commit starts implementing tokenization according to the description at
https://stackoverflow.com/questions/4094699/how-does-the-windows-command-interpreter-cmd-exe-parse-scripts/4095133#4095133
for labels, which are also used as a sort of line comment.

It also highlights the odd parts, including the fact that not all valid-looking
labels can be targeted by GOTO and CALL statements, because the dosbatch
interpreter applies different tokenization rules to them.
All sorts of words are terminated by common word boundaries (metachars),
including path strings.

It's obscure, but `mkdir d1=d2` actually creates the directories `d1` and `d2`.
This commit scopes `,`, `;` and `=` as punctuation.separator unless more
specific patterns already address them to indicate their primary meaning.
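
As a rough illustration of why `mkdir d1=d2` ends up with two directory names, the following Python sketch splits an argument string on the separator characters mentioned above. It is only a model for this discussion: the function name `split_arguments` is hypothetical, blanks are assumed to separate arguments as well, and quoting and escape handling are deliberately left out.

```python
import re
from typing import List

# Separators named in the commit message (",", ";", "="), plus blanks (assumption).
SEPARATORS = re.compile(r"[ \t,;=]+")


def split_arguments(argument_text: str) -> List[str]:
    """Split command arguments on runs of separator characters."""
    # re.split can produce empty strings at the edges; drop them.
    return [word for word in SEPARATORS.split(argument_text) if word]


print(split_arguments("d1=d2"))
```

With this model, `d1=d2` is seen as the two words `d1` and `d2`, which is what the separator scoping in the syntax is meant to reflect.
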
This commit refactors some variable values to avoid nested charsets in patterns.

Note: This commit does not change behavior.

@mataha mataha left a comment

This doesn't handle the "first character in a line can be anything" rule. Is that intentional?

@deathaxe

An example of what's not working as expected would be helpful. Labels are treated specially in some respects. Maybe I missed something.

mataha commented Jul 20, 2024

@echo off

for %%# in (1 2 3) do (call :label%%~#)
exit /b

@:label1 is a valid label unless in label context

$,=,= ,==, =,;;:,;=label2 see above, also 2>nul to suppress errors

*       :          label3 see above, also 2>nul to suppress errors

(echo(%0)
goto :EOF

deathaxe commented Jul 20, 2024

Well, I don't see any value in supporting all sorts of obscure expressions, and I doubt it is even possible. You'll find such edge cases in all syntaxes, but they are far from anything one would expect to see in actual use.


@FichteFoll FichteFoll left a comment

Batch parsing is really weird and I definitely did not read the referenced SO post, but these look like good changes to me.

@deathaxe deathaxe merged commit 80f1a61 into sublimehq:master Aug 26, 2024
2 checks passed
@deathaxe deathaxe deleted the pr/batch/refactor-labels branch August 26, 2024 06:44