Bash AST parser #61

Open
dthree opened this issue Mar 10, 2016 · 33 comments

@dthree

dthree commented Mar 10, 2016

I'm looking for someone knowledgeable in PEGs (DSL parser), and specifically PEG.js that would like to help finish a project.

js-shell-parse is an abandoned but almost complete project that parses bash into an AST. I am trying to fork and complete it, as it would be the foundation for implementing a full cross-platform bash interpreter in Vorpal and Cash.

Doing this would allow you to build cross-platform, interactive CLIs with full bash support (all redirections, substitutions, control words, etc). If this is accomplished, it is very likely that Cash (and the module being discussed) will ship with NPM in the future as the package script interpreter.

Any PEG masters out there interested?

@sindresorhus
Owner

// @wooorm @Qix-

@iiison

iiison commented Mar 10, 2016

I haven't worked with PEG.js, but I'm up for it. Can I help?

@dthree
Author

dthree commented Mar 10, 2016

Sure, as long as you can take the time to make sense of this:

https://github.com/grncdr/js-shell-parse/blob/master/grammar.pegjs

Parsing DSLs is not easy and takes a lot of study, but if you're up to it I would love your help.

@dstack

dstack commented Mar 10, 2016

Willing to take a look. I have worked with PEG.js in the past and parsed several languages with it. The real question is: what is missing?

@dthree
Author

dthree commented Mar 10, 2016

Will compile a list shortly.

@iiison

iiison commented Mar 10, 2016

@dthree will take a look.

@dthree
Author

dthree commented Mar 10, 2016

👍

@forivall

I'll also be taking a look; I worked with flex / bison back in uni, and I'd love to delve into PEG.js.

@parro-it

I was working on a project similar to cash one year ago.

I used Jacob for the grammar stuff. I started trying to make sense of an existing grammar, but later I decided to start from scratch because it was too much stuff to make sense of...

Later on I abandoned the project because it was too big to do it alone.
I'll be more than happy to help you 😺 (but I'm far from being an expert on grammars).

As far as I remember, the project was able to parse some basic commands and operators:

| || && > >>

If you think it could be useful, I can move the source to GH. It is actually on a private repo on bitbucket.

@dthree
Author

dthree commented Mar 10, 2016

@parro-it nice!

js-shell-parse is actually really close, so I think the best efforts would be to focus on wrapping it up.

@parro-it

I agree with you @dthree. I didn't understand whether you plan to contribute to the js-shell-parse repo or have created a new one from that code. Did you try to contact @grncdr?

@Qix-
Collaborator

Qix- commented Mar 10, 2016

I can help, but I'm packing for my move today.

Or should I say, packratting (I'll see myself out...)

@dthree
Author

dthree commented Mar 10, 2016

@parro-it I tried, and it's been radio silence. Issues have also been filed over time, with no response in two years. I will probably fork and keep him in the license.

@dthree
Author

dthree commented Mar 10, 2016

@Qix- nice - maybe after move? Where you going? :)

@Qix-
Collaborator

Qix- commented Mar 10, 2016

Sure. And I move to San Fran tomorrow.

@parro-it

@dthree I think that's fair, I also saw the issues... I'm looking at the grammar now; it's not simple, but it could be tackled.

@dthree
Author

dthree commented Mar 10, 2016

@Qix- sweet!

@parro-it 👍

@grncdr

grncdr commented Mar 11, 2016

Cool, so I'm super stoked if somebody wants to fork & finish js-shell-parse, but I should warn you that the operational semantics of POSIX shells (or indeed most streaming input) do not mesh super well with PEG. There's a hack for this in https://github.com/grncdr/js-shell-frontend where it looks at the PEG.js syntax errors and figures out if you can continue parsing with more input.
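
To make the "continue parsing with more input" idea concrete, here is a minimal sketch. It is not the js-shell-frontend code. It assumes the parser throws a syntax error whose `found` property is `null` when it ran out of input, which is how PEG.js-generated parsers report end of input as far as I know. `toyParse` and `createLineFeeder` are hypothetical names, and `toyParse` is only a stand-in that checks for balanced quotes.

```javascript
// Hypothetical stand-in for a PEG.js-generated parser. It only checks that
// quotes are balanced and throws an error shaped like a PEG.js SyntaxError,
// where `found === null` means the parser hit end of input.
function toyParse(input) {
  let quote = null;
  for (const ch of input) {
    if (quote === null && (ch === '"' || ch === "'")) quote = ch;
    else if (ch === quote) quote = null;
  }
  if (quote !== null) {
    const err = new Error('Expected ' + quote + ' but end of input found.');
    err.name = 'SyntaxError';
    err.found = null; // assumption: null marks "ran out of input"
    throw err;
  }
  return { type: 'command', source: input };
}

// Feed lines one at a time. If parsing fails only because the input ended
// too early, keep the buffer and wait for another line.
function createLineFeeder(parse) {
  let buffer = '';
  return function feed(line) {
    buffer += (buffer ? '\n' : '') + line;
    try {
      const ast = parse(buffer);
      buffer = '';
      return { status: 'complete', ast };
    } catch (err) {
      if (err.found === null) return { status: 'incomplete' };
      buffer = ''; // a real syntax error: drop the buffer and report it
      throw err;
    }
  };
}

const feed = createLineFeeder(toyParse);
const first = feed('echo "hello');  // open quote, so more input is needed
const second = feed('world"');      // quote closed, so the buffer parses
console.log(first.status, second.status);
```

Note that an error at end of input does not always mean the command is merely unfinished, so a real shell front end would need more checks than this.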

TBH if I were to start again, I would look at doing this with a streaming parser-combinator library. PEG is super nice for relatively simple languages where you can expect to parse all input at once, but this is not how shells work. The semantics and syntax of POSIX shells are deeply intertwined, so spend some time reading the POSIX specs before you commit to resurrecting js-shell-parse.

Some of these things might be non-issues if your goal is just to execute npm scripts. If that's all you want, js-shell-parse is probably good enough already. However, if you want to do interactive shells, you're also going to need to drop down into C for doing job control and properly managing the TTY. I had a PoC interactive shell based on shell-frontend at some point, but I can't seem to find the repo for it now. In any case, a lot of things didn't really work right, and I ran out of enthusiasm for the project. If you want to know more, just ask here.

@dthree
Author

dthree commented Mar 11, 2016

@grncdr you're alive! 🎉


Thanks for your advice and understood on all of it.

There's a lot right about js-shell-parse as well. The current priority on support is things like redirection, expansions, variables and basic flow control. You seem to have these things pretty taped.

From what I can tell, what you're having trouble with is more advanced flow control (functions, if / else, loops, etc.). Am I correct? These things are less a priority at the moment, as, like you said, the main public for this is a. package scripts, and b. an interactive shell (single liners).


It would be amazing if you could possibly do a little turn-over on the repo, perhaps by listing what types of things are implemented and what was giving you unsolvable trouble. You're loaded with experience and this would save a lot of time!

@grncdr

grncdr commented Mar 11, 2016

> From what I can tell, what you're having trouble with is more advanced flow control (functions, if / else, loops, etc.). Am I correct? These things are less a priority at the moment, as, like you said, the main public for this is a. package scripts, and b. an interactive shell (single liners).

Not exactly. Most of the parsing problems are relatively straightforward, and while I'm sure there are bugs or missing things, most of them should be pretty easy to add to the parser (e.g. the requested >| redirection). Where things really got stuck was implementing an interactive shell properly: most of the POSIX spec is written in terms of a streaming, character-based parser.

> It would be amazing if you could possibly do a little turn-over on the repo, perhaps by listing what types of things are implemented and what was giving you unsolvable trouble. You're loaded with experience and this would save a lot of time!

Would you be up for setting up & recording a google hangout or skype call this weekend? I'm free Sunday, and that would give me some time to look over the grammar & code again. Recording it means that you have something a bit more concrete to refer back to, but I won't need to spend quite so much time as I would if I were to write something.

@gabrielcsapo

Has anyone gotten to a point of generating an AST from a bash script and then being able to generate code from that AST?

@Qix-
Collaborator

Qix- commented Jun 10, 2017

Oh boy, this fell by the wayside. Really wishing I had @RemindMe right now.

@gabrielcsapo

@Qix- I built https://github.com/gabrielcsapo/shell-p to try to get introspection on the running time of various parts of a shell script, but js-shell-parse and bash-parser seem to have issues with 1. parsing functions and 2. generating code from an AST.

@parro-it

@gabrielcsapo I answered your issue on the bash-parser repo.

Regarding 1., if you need to parse bash-specific function syntax, there is a task list issue keeping track of the bash-specific syntax still to implement. I could prioritize the function syntax implementation if you need it.

@parro-it

parro-it commented Jun 10, 2017

@Qix- we discussed the status of the project with @dthree here if you are interested 😸
The project is really difficult and we need more help !!

@gabrielcsapo

@parro-it that would be awesome if you could prioritize that! I will take a look at the issue in the morning and work on generating code from the AST generated by bash-parser

@mvdan

mvdan commented Apr 9, 2018

I'm a bit late to the party, but I built this very thing a couple of years ago: https://github.com/mvdan/sh

It contains a parser (code to AST), a printer (AST to code), and even an experimental interpreter to run an AST. All of those, with full support for POSIX Shell, Bash and mksh. It's been battle-tested with lots of code and edge cases for over two years, so I'd be surprised if you could find any real code to make it fail :)

I know it's written in Go, but transpiling Go to JS has been done before: https://github.com/gopherjs/gopherjs

It also seems like the language will get wasm support soon, so that might make things even easier: golang/go#18892

If any of you would like to quickly play with it, see shfmt -tojson some-file.sh.

@stefanmaric

> I'm a bit late to the party, but I built this very thing a couple of years ago: https://github.com/mvdan/sh

Random, off-topic anecdote: I got a notification for this comment about 1 hour after I starred that repo while I was looking for a bash formatter for a Go version manager. Small world. 😄

@parro-it

As it happens, I gave you a star a long time ago... I guess I didn't consider using it because it's written in Go.
Anyway, I am really curious to see how you solved the problems I encountered developing bash-parser.
Did you follow the POSIX standard in a strict way?

@parro-it

BTW, for anyone interested, bash-parser is here: https://github.com/vorpaljs/bash-parser
But I fear you can find MANY bugs there...

@mvdan

mvdan commented Apr 10, 2018

> Did you follow the POSIX standard in a strict way?

Yes. See the caveats section of the README for the few tradeoffs I had to make. Otherwise, you can assume that POSIX and Bash are both fully supported.

I'll try to have a simple javascript module published by this weekend, with a subset of the API exposed to do the basic stuff.

@mvdan

mvdan commented Apr 14, 2018

I believe I have a working version of the parser in a JS module: https://www.npmjs.com/package/mvdan-sh

This is the glue code and the package.json: https://github.com/mvdan/sh/tree/master/_js

It should work, as the testmain.js file in there works with the index.js that is packaged in the module. However, I have no idea how to actually use a JS module, so I'm not 100% sure that the module will just work. Please give it a go and let me know.

I'll add more parts of the Go library to the JS module soon, as well as better docs :)

@mvdan

mvdan commented Apr 26, 2018

I have been working on this over the last couple of weeks. Now the JS package is at a stage where it's useful. You can get the parse tree, print it out for debugging purposes, walk it, modify it, and convert it into source code again.

See the README on the npm package link I pasted above; it contains a sample JS program that shows all of the above. Hopefully that is enough to get people started.
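
For readers who just want a feel for the parse / walk / modify / print cycle before opening that README, here is a toy sketch. It deliberately does not use the mvdan-sh API: `parse`, `walk` and `toSource` are hypothetical helpers written for this illustration, and the "parser" only understands `&&` and `|` with no quoting, so it is nowhere near a real shell parser.

```javascript
// Toy parser: splits on "&&" and "|" only and ignores quoting, "||",
// redirections and everything else a real shell grammar has to handle.
function parse(src) {
  const items = src.split('&&').map((part) => {
    const stages = part.split('|').map((cmd) => {
      const [name, ...args] = cmd.trim().split(/\s+/);
      return { type: 'Command', name, args };
    });
    return stages.length === 1 ? stages[0] : { type: 'Pipeline', stages };
  });
  return items.length === 1 ? items[0] : { type: 'AndList', items };
}

// Visit every node in the tree, parents before children.
function walk(node, visit) {
  visit(node);
  const children = node.stages || node.items || [];
  children.forEach((child) => walk(child, visit));
}

// Printer: turn the (possibly modified) AST back into source code.
function toSource(node) {
  switch (node.type) {
    case 'Command':
      return [node.name, ...node.args].join(' ');
    case 'Pipeline':
      return node.stages.map(toSource).join(' | ');
    case 'AndList':
      return node.items.map(toSource).join(' && ');
    default:
      throw new Error('unknown node type: ' + node.type);
  }
}

const ast = parse('cat log.txt | grep error && echo done');

// Modify the tree: make every grep call case-insensitive.
walk(ast, (node) => {
  if (node.type === 'Command' && node.name === 'grep') node.args.unshift('-i');
});

const output = toSource(ast);
console.log(output);
```

The point is only the shape of the workflow: the tree is plain data, so a walker can rewrite nodes in place and a printer can regenerate source from whatever the tree looks like afterwards.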

If you find any bugs, or any features are missing, please raise issues on the mvdan/sh repository. Thanks!
