Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce parser size and compile time #85

Merged
merged 5 commits into from
Apr 14, 2024

Conversation

maxbrunsfeld
Copy link
Contributor

@maxbrunsfeld maxbrunsfeld commented Apr 12, 2024

This PR makes a few optimizations to the grammar, to reduce the size of the generated parser, and also regenerates the parser with the latest tree-sitter CLI, which contains tree-sitter/tree-sitter#3234, an optimization to the compile time of the generated C code, which affects this parser significantly.

This should unlock compiling this parser to WASM and packaging it in a Zed extension. Compiling to wasm is still slow (about 45 seconds on my M3 MacBook). Compiling to aarch64 only takes 1.5 seconds. I still don't understand why clang is so much slower when compiling to wasm than when compiling to native.

I believe that the reason that this parser compiles so much slower than many others is that its generated ts_lex function has a large number of states, which I think is due to some complex tokens in this grammar:

I feel like there is probably a lot of room to simplify some of this, but I didn't attempt it in this PR.

/cc @fdncred

* Use a single token for duration, filesize units
* Use precedence to avoid ambiguity with if and try statements
@fdncred
Copy link
Collaborator

fdncred commented Apr 13, 2024

Thanks @maxbrunsfeld! We appreciate your support. Wow, 60 files changed. I didn't expect that. Please excuse my ignorance but what are all these extra files used for? There seem to be a lot of files we didn't have before.

@maxbrunsfeld
Copy link
Contributor Author

A lot of the changes files are just moving the corpus test folder under ‘test’, because that’s where the new tree-sitter CLI looks for them. There are also some binding files generated, to make the parser usable from Go, Python, and Swift I think.

@fdncred
Copy link
Collaborator

fdncred commented Apr 13, 2024

Can you fix the prettier lints with npx prettier grammar.js --write?

@maxbrunsfeld
Copy link
Contributor Author

Yes. I will; I'm also going to regenerate the parser one more time; I found a small improvement I'd like to make on the Tree-sitter side.

@fdncred
Copy link
Collaborator

fdncred commented Apr 14, 2024

Thanks. Let's move forward with this. If we decide we don't like the extra files (makefile, editor config, extra bindings), we can always remove them later. It's just software. :)

@fdncred fdncred merged commit 463314d into nushell:main Apr 14, 2024
3 checks passed
@maxbrunsfeld maxbrunsfeld deleted the smaller-parser branch April 14, 2024 19:13
@maxbrunsfeld
Copy link
Contributor Author

Yeah, tree-sitter generates those now, in case Python/go users want to use the parser. I agree, it is a lot of files.

I kind of agree that the editor config seems unnecessary. I might remove that file from the generate code.

@fdncred
Copy link
Collaborator

fdncred commented Apr 14, 2024

We really appreciate your help. ❤️

I also created a new PR to the zed-extension/nu repo in order to update to this version.

mrdgo pushed a commit to mrdgo/tree-sitter-nu that referenced this pull request Sep 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants