[Batch File] Refactor labels and label-like comments #4005
This commit starts implementing tokenization according to the description at https://stackoverflow.com/questions/4094699/how-does-the-windows-command-interpreter-cmd-exe-parse-scripts/4095133#4095133 for labels, which are also used as a sort of line comment. It also highlights the odd parts, including the fact that not every valid-looking label can be targeted by GOTO and CALL statements, because the dosbatch interpreter applies different tokenization rules there.
All sorts of words are terminated by common word boundaries (metachars), including path strings. It's obscure but `mkdir d1=d2` actually creates directories `d1` and `d2`.
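A rough Python model of that splitting, for illustration only. It assumes that spaces, tabs, `,`, `;` and `=` all act as argument delimiters, and it ignores quoting, escaping and redirection entirely. The name `split_args` is made up for this sketch and is not part of the PR.

```python
import re
from typing import List

# Assumption: whitespace, ",", ";" and "=" separate arguments.
# Quotes, carets and redirections are deliberately not modeled.
DELIMITERS = re.compile(r"[ \t,;=]+")


def split_args(line: str) -> List[str]:
    """Split a simplified command line into the command and its arguments."""
    return [token for token in DELIMITERS.split(line) if token]


print(split_args("mkdir d1=d2"))  # ['mkdir', 'd1', 'd2']
```

Under this model `d1=d2` yields two separate arguments, which matches the observation that two directories get created.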
This commit scopes `,`, `;` and `=` as `punctuation.separator` to indicate their primary meaning, unless they are already addressed by more specific patterns.
This commit refactors some variable values to avoid nested charsets in patterns. Note: This commit does not change behavior.
This doesn't handle the "first character in a line can be anything" rule. Is that intentional?
Any example of what's not working as expected would be helpful. Labels are treated specially in some respects. Maybe I missed something.
@echo off
for %%# in (1 2 3) do (call :label%%~#)
exit /b
@:label1 is a valid label unless in label context
$,=,= ,==, =,;;:,;=label2 see above, also 2>nul to suppress errors
* : label3 see above, also 2>nul to suppress errors
(echo(%0)
goto :EOF
Well, I don't see any value in supporting all sorts of obscure expressions, and I doubt it is even possible. You'll find such edge cases in all syntaxes, but they are far from anything one would expect to see used.
Batch parsing is really weird and I definitely did not read the referenced SO post, but these look like good changes to me.
This PR applies tokenization rules from https://stackoverflow.com/questions/4094699/how-does-the-windows-command-interpreter-cmd-exe-parse-scripts/4095133#4095133 to labels and common command arguments.

As a result, invalid labels are scoped "ignored" even without dedicated comment contexts and with fewer special-casing patterns. The comment contexts are, however, kept to maintain `comment.line` scopes for now.

Delimiter characters such as `=` are ignored, which fixes scoping of somewhat uncommon but valid labels like `:=== label ===`.

Delimiter chars are scoped `punctuation.separator` if they do not have any other special meaning (e.g. `=` as `keyword.operator.assignment`), depending on context.

Accuracy of various tokens, including some oddities of batch, is increased so that highlighting better matches the interpreter's behavior.
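
The label handling described above can be modeled roughly as follows. This is a simplified sketch, not the pattern used in the syntax definition. It assumes that delimiters (space, tab, `,`, `;`, `=`) directly after the colon are skipped and that the label name ends at the next delimiter or colon. The helper name `label_name` is hypothetical.

```python
import re
from typing import Optional

# Assumption: optional leading blanks, a colon, any run of delimiters,
# then the label name up to the next delimiter or colon.
LABEL = re.compile(r"^[ \t]*:[ \t,;=]*([^ \t,;=:]+)")


def label_name(line: str) -> Optional[str]:
    """Return the label name, or None if the line holds no usable label."""
    match = LABEL.match(line)
    return match.group(1) if match else None


print(label_name(":=== label ==="))     # label
print(label_name(":: just a comment"))  # None
```

In this model a `::` line has no name after the colon, so it falls through as an invalid label, which is the case the PR scopes as ignored or as a comment.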