Parser basics
This page introduces the concepts you should know about parsers. You could of course jump directly to the examples, but you may then not understand why parsers are written the way they are; this page explains why.
There are three elements to a parser:
- the parser class itself;
- the rule methods (or rules; however, "rule methods" is more appropriate, see below);
- the context.
The parser class is the class (or classes; more on this later) you create. It must extend either BaseParser or (recommended) EventBusParser. The difference between the two is that the latter provides an event bus to dispatch parsing events to external classes.
A parser class is first and foremost a Java class; therefore, like any other class, it can have several constructors, instance variables, static variables, inner classes (static or not), and so on. Typically:
// See below for the type parameter
public class MyParser
extends EventBusParser<Object>
{
// Constructors, methods, instance variables, rule methods etc
}
A rule method is a method in your parser class returning a Rule. EventBusParser and BaseParser both provide a set of builtin rule methods which you will use for building your own rules (for instance, digit() to match an ASCII digit, unicodeRange() to match a range of Unicode code points, etc.).
Note that your parser class is not limited to rule methods; it can contain any other methods as well. However, any method that returns a Rule is a rule method.
Note that some rules accepting more than one argument can take boolean expressions as arguments; however, such an expression must not appear as the first argument. For instance:
Rule withBooleanTest()
{
return sequence(oneOrMore(alpha()), match().length() <= 10);
}
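To make the effect of the boolean expression concrete, here is a rough plain-Java equivalent of the rule above. It is only a sketch and does not use the library: the class and method names are hypothetical, and it assumes that alpha() matches ASCII letters only, that oneOrMore() is greedy, and that a boolean expression evaluating to false makes the enclosing sequence fail.

```java
// Hypothetical, library-free sketch of what withBooleanTest() checks.
final class BooleanTestSketch {
    // Returns the number of characters consumed, or -1 if the rule fails.
    static int withBooleanTest(String input, int start) {
        int end = start;
        // oneOrMore(alpha()): greedily consume ASCII letters (assumption)
        while (end < input.length() && isAsciiLetter(input.charAt(end))) {
            end++;
        }
        if (end == start) {
            return -1; // no letter at all: the first element of the sequence fails
        }
        String match = input.substring(start, end);
        // match().length() <= 10: a false result fails the whole sequence
        if (!(match.length() <= 10)) {
            return -1;
        }
        return end - start;
    }

    static boolean isAsciiLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    public static void main(String[] args) {
        System.out.println(withBooleanTest("hello world", 0));  // prints 5
        System.out.println(withBooleanTest("abcdefghijkl", 0)); // prints -1 (12 letters)
    }
}
```

Note that the length test runs after the letters have been matched, which is why it refers to the text of the last match.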
The context is what your parser will see at runtime; this context consists of three elements:
- the input text, plus the current offset in the text etc;
- information about the last match;
- and, finally, the value stack.
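The three elements of the context can be pictured with the following minimal sketch. It is not the library's context class: SketchContext, matchDigits() and every other name below are hypothetical, chosen only to show how the input and offset, the last match and a typed value stack fit together.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a parsing context; V is the value stack's type.
final class SketchContext<V> {
    private final String input;        // the input text
    private int offset = 0;            // current offset in the input
    private String lastMatch = "";     // text consumed by the last successful match
    private final Deque<V> valueStack = new ArrayDeque<>(); // the value stack

    SketchContext(String input) {
        this.input = input;
    }

    // Tries to consume one or more ASCII digits at the current offset.
    boolean matchDigits() {
        int end = offset;
        while (end < input.length()
            && input.charAt(end) >= '0' && input.charAt(end) <= '9') {
            end++;
        }
        if (end == offset) {
            return false; // nothing matched: offset and last match are unchanged
        }
        lastMatch = input.substring(offset, end);
        offset = end;
        return true;
    }

    String match() { return lastMatch; }
    int offset() { return offset; }
    void push(V value) { valueStack.push(value); }
    V pop() { return valueStack.pop(); }
}

final class ContextDemo {
    public static void main(String[] args) {
        SketchContext<Integer> ctx = new SketchContext<>("42abc");
        if (ctx.matchDigits()) {
            ctx.push(Integer.parseInt(ctx.match())); // store the matched number
        }
        System.out.println(ctx.offset() + " " + ctx.pop()); // prints "2 42"
    }
}
```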
The value stack has a type parameter, and this type parameter is the same as that of your parser class. You are therefore limited as to what you can store into, or retrieve from, the stack. However, this also means you can do things like this:
// Base parser...
public abstract class BaseValueParser<T extends MyBaseClass>
extends EventBusParser<T>
// Parser for one implementation...
public class ConcreteValueParser
extends BaseValueParser<ConcreteClass>
// etc
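The benefit of the bounded type parameter can be shown without the library. In the sketch below, StackHolder is a hypothetical stand-in for the parser base class and its value stack, and the bodies of MyBaseClass and ConcreteClass are invented for illustration; only the class names BaseValueParser and ConcreteValueParser come from the outline above.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for the library's parser base class: it only owns a typed stack.
abstract class StackHolder<V> {
    private final Deque<V> stack = new ArrayDeque<>();
    void push(V value) { stack.push(value); }
    V pop() { return stack.pop(); }
}

// Invented base type for stack values.
abstract class MyBaseClass {
    abstract String describe();
}

// Invented concrete value type with a method of its own.
final class ConcreteClass extends MyBaseClass {
    private final String text;
    ConcreteClass(String text) { this.text = text; }
    @Override String describe() { return "concrete:" + text; }
    int length() { return text.length(); }
}

// Shared logic may rely on anything MyBaseClass declares.
abstract class BaseValueParser<T extends MyBaseClass> extends StackHolder<T> {
    String describeTop() {
        T top = pop();
        push(top); // put the value back so the stack is unchanged
        return top.describe();
    }
}

// Here the stack holds ConcreteClass values, so no cast is needed.
final class ConcreteValueParser extends BaseValueParser<ConcreteClass> {
    int topLength() {
        ConcreteClass top = pop();
        push(top);
        return top.length();
    }
}

final class BoundedDemo {
    public static void main(String[] args) {
        ConcreteValueParser parser = new ConcreteValueParser();
        parser.push(new ConcreteClass("abc"));
        System.out.println(parser.describeTop()); // prints "concrete:abc"
        System.out.println(parser.topLength());   // prints 3
    }
}
```

The compiler rejects any attempt to push a value that is not a ConcreteClass onto the stack of ConcreteValueParser, which is the restriction, and the safety, that the type parameter buys.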