how it works
I am still discovering things as I write this document; this relates my experience so far with what happens and when, and it is by no means an exhaustive explanation.
Still, it may help, so read on...
When you write a parser, you basically write a set of Rules, as in:
public class MyParser
    extends BaseParser<Object>
{
    // Matches the single character 'x'
    Rule rule1()
    {
        return ch('x');
    }

    // Matches 'x' followed by "y"
    Rule rule2()
    {
        return sequence('x', "y");
    }
}
So, first things first: you need to create a parser instance, like this:
final MyParser parser = Parboiled.createParser(MyParser.class);
This single line of code does quite a lot; and this "quite a lot" includes some black magic too.
Before we delve into this black magic, let us filter out what is not black magic. Starting with:
You can pass plain chars and Strings where a Rule is expected, as in sequence('x', "y") above. This is because some methods producing rules (sequence() is one of them, but so is, for instance, firstOf()) accept Objects as arguments; these arguments are, in turn, passed to the .toRule() or .toRules() methods. Both of these methods will yield the appropriate matchers depending on the runtime type of the objects.
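To make this more concrete, here is a minimal, self-contained sketch of that kind of dispatch on the runtime type of an argument. It does not use the library: the `Matcher` interface and the `ToRuleSketch` class are hypothetical stand-ins, and the real `.toRule()` handles more cases than the two shown here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a rule/matcher; NOT the library's Rule type.
interface Matcher
{
    // Returns the number of characters matched at 'index', or -1 on failure.
    int match(String input, int index);
}

final class ToRuleSketch
{
    private ToRuleSketch()
    {
    }

    // Picks a matcher according to the runtime type of the argument.
    static Matcher toRule(final Object obj)
    {
        if (obj instanceof Matcher)
            return (Matcher) obj;
        if (obj instanceof Character) {
            final char c = (Character) obj;
            return (input, index) ->
                index < input.length() && input.charAt(index) == c ? 1 : -1;
        }
        if (obj instanceof String) {
            final String s = (String) obj;
            return (input, index) ->
                input.startsWith(s, index) ? s.length() : -1;
        }
        throw new IllegalArgumentException("cannot convert " + obj
            + " to a matcher");
    }

    // Array version, analogous in spirit to toRules().
    static List<Matcher> toRules(final Object... objects)
    {
        final List<Matcher> list = new ArrayList<>();
        for (final Object obj: objects)
            list.add(toRule(obj));
        return list;
    }

    // A sequence accepting Objects, so that sequence('x', "y") works.
    static Matcher sequence(final Object... objects)
    {
        final List<Matcher> matchers = toRules(objects);
        return (input, index) -> {
            int current = index;
            for (final Matcher matcher: matchers) {
                final int matched = matcher.match(input, current);
                if (matched < 0)
                    return -1;
                current += matched;
            }
            return current - index;
        };
    }
}
```

With this sketch, `ToRuleSketch.sequence('x', "y")` builds a matcher from a boxed `Character` and a `String` without the caller ever wrapping them explicitly, which is the convenience the parser class relies on.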
The same applies to actions used as rule arguments: here again, it is the toRule{,s} methods which help. However, for expressions, methods, etc. returning booleans, there is a further treatment at parser build time; basically, all boolean expressions in your rules will be turned into Actions.
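As a rough illustration of what "turning a boolean expression into an action" means, the sketch below does the wrapping by hand. Everything in it is an assumption made for illustration: `SketchAction` is a hypothetical interface, not the library's `Action`, and the real transformation happens on bytecode rather than in source code.

```java
// Hypothetical stand-in for the library's Action type.
interface SketchAction
{
    boolean run();
}

final class BooleanToActionSketch
{
    private int depth = 0;

    // A plain boolean method, as you could write one in a parser class.
    boolean enter()
    {
        depth++;
        return depth <= 2;
    }

    // Wraps the boolean expression so that it is evaluated only when the
    // action is run, and not at the time the wrapper is created.
    SketchAction wrapped()
    {
        return () -> enter();
    }

    int depth()
    {
        return depth;
    }
}
```

The point of the wrapper is deferral: calling `wrapped()` does not touch `depth`; only each later call to `run()` does.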
This particular part of the process is part of the black magic mentioned earlier. So, now, let us delve into that...
What Parboiled.createParser() does is create a subclass of your parser class; for instance, if your parser class has a fully qualified name of com.mycompany.xx.MyParser, this method will create a class named com.mycompany.xx.MyParser$$parboiled. Note that this means your parser class cannot be final.
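The naming convention and the "not final" constraint can be expressed in a few lines of plain Java. This is only a sketch: `GeneratedNameSketch`, `OpenParser` and `FinalParser` are hypothetical names, and the checks the library actually performs may differ.

```java
import java.lang.reflect.Modifier;

final class GeneratedNameSketch
{
    private GeneratedNameSketch()
    {
    }

    // Name of the generated subclass, following the convention described
    // above: the parser class name with "$$parboiled" appended.
    static String generatedName(final Class<?> parserClass)
    {
        return parserClass.getName() + "$$parboiled";
    }

    // A final class cannot be subclassed, so no subclass can be generated
    // for it.
    static boolean canBeExtended(final Class<?> parserClass)
    {
        return !Modifier.isFinal(parserClass.getModifiers());
    }
}

// Hypothetical parser classes, used only to exercise the checks above.
class OpenParser
{
}

final class FinalParser
{
}
```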
The process by which this class is created is very, very low level: it consists, in its entirety, of bytecode inspections and transformations.
Yes, your eyes didn't deceive you. Now, how is this all done? Well, parboiled didn't document this! However, Grappa does...
Not that it explains much, does it? ;) So, here is a quick overview:
- ASM is used extensively; it "swallows" your parser class into a ParserClassNode;
- in turn, the ParserClassNode identifies a number of RuleMethods (basically, those are all methods in your grammar returning Rules);
- the bytecode transformation will occur on the rule methods which need to be modified...
- ... and finally a new instance of your parser is created.
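The second step, identifying rule methods, can be approximated with plain reflection. Note that this swaps the technique: the library inspects bytecode with ASM, whereas the sketch below uses `java.lang.reflect`. `SketchRule`, `SampleParser` and `RuleMethodFinder` are hypothetical names introduced for the example.

```java
import java.lang.reflect.Method;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical marker type standing in for the library's Rule.
interface SketchRule
{
}

// Hypothetical parser class with two rule methods and one helper method.
class SampleParser
{
    SketchRule rule1()
    {
        return new SketchRule() {};
    }

    SketchRule rule2()
    {
        return new SketchRule() {};
    }

    boolean helper()
    {
        return true;
    }
}

final class RuleMethodFinder
{
    private RuleMethodFinder()
    {
    }

    // Collects the names of all methods declared by the class whose return
    // type is a rule; other methods are ignored.
    static Set<String> ruleMethodNames(final Class<?> parserClass)
    {
        final Set<String> names = new TreeSet<>();
        for (final Method method: parserClass.getDeclaredMethods())
            if (SketchRule.class.isAssignableFrom(method.getReturnType()))
                names.add(method.getName());
        return names;
    }
}
```

Calling `RuleMethodFinder.ruleMethodNames(SampleParser.class)` selects `rule1` and `rule2` and leaves `helper` out, which mirrors the "all methods returning Rules" criterion from the list above.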