-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Incremental parsing of visibly pushdown languages? #36
Comments
I'm not really familiar with incremental parsing techniques for context free languages, but one issue with owl's style of parsing is that nondeterminism can be resolved arbitrarily far in the future. For example, in this grammar:
whether a particular dot is part of the "even" rule or "odd" rule depends on the total number of dots in the sequence. Technically, the grammar is compiled into a deterministic automaton, but in order to make a parse tree, the path through the original nondeterministic automaton has to be recovered -- this is done in a backward pass over the recorded sequence of state transitions. (The parse tree construction actions are encoded as epsilon transitions of the nondeterministic automaton.) So actually, now that I'm thinking about it, I believe if you record the full sequence of states/transitions/parse tree construction actions for both for the forward and backward pass, you could do something like this:
The parse tree construction actions associated with the nondeterministic automaton transitions between these two sync points are the "new actions", replacing the "old actions" that were in the sequence between the two points before the edit. The parse tree construction actions before the first sync point and after the last sync point are unchanged. So the last thing to do is to incrementally update the tree itself based on the change to the sequence of construction actions. I'm not sure how to do this, but I'm sure someone has written about it somewhere. For single-character changes in the "dots" example, the sync points would always be the start and end of the input (so incrementality wouldn't really gain you anything), but for more realistic examples, hopefully the sync points would bracket the changes more precisely. As for getting owl itself to work incrementally, I'm not sure how hard that would be. At the very least, you'd need more API to handle the incremental changes. I could look into it more closely if you have a specific need for it. |
You said:
This comment is unrelated to incremental parsing, but I'm wondering: If I understood you correctly, you are doing a single forward pass during parsing to collect a sequence of transitions, and a backward pass to reconstruct the parse tree events from those transitions in reverse. Wouldn't it be possible to skip the "backward pass over the recorded sequence of state transitions" by storing a set of actions for each state transition of the deterministic automaton, and to emit them during the forward pass? That is, to merge both passes into one? |
It's not possible -- take the BTW, if you want to read more, this backward-pass technique for reconstructing the nondeterministic transitions from the deterministic ones comes from Danny Dubé's work on the SILex lexer generator. It's described more thoroughly in these two papers: http://www.iro.umontreal.ca/~feeley/papers/DubeFeeleyACTAINFORMATICA00.pdf http://www.schemeworkshop.org/2006/14-dube.pdf (these papers are where I learned about it). You run into the same issue parsing regular expressions (which makes sense, since |
Hello @ianh,
I'm wondering, do you have any insights into incremental parsing where VPLs could offer some kind of advantage over, e.g., parsing context free languages using, e.g., LR/LL parsing?
Or, specifically, how would you go about making owl work in an incremental fashion, that is, if owl has parsed some source once, and only small parts of it have changed, subsequent parses should not have to reparse everything from scratch.
I'd appreciate any thoughts you might have on this.
The text was updated successfully, but these errors were encountered: