
Extracting parsing results

Daus Salar edited this page Jul 7, 2015 · 3 revisions

Introduction

Being able to parse very complex inputs is fine, but if you parse such inputs in the first place, it is not only to ensure their validity! You also want to extract results from them.

And grappa makes this very easy; not only are raw parsing results directly available, but helper classes and mechanisms allow you to build Java objects directly out of those results, in pure Java.

This page will start with the basics: the parsing context. It will then explain how grappa can help you get values out of the parser and into real Java objects.

The parsing context and the value stack

When your rules run, they have access, at all times, to information about the parsing context.

The available information includes:

  • the input buffer, in its entirety;
  • the current offset into this buffer;
  • information about the last match.

And there is also...

The value stack

A value stack is attached to every context. The value stack has a type parameter, which is the same as the type parameter of your parser class. You are therefore limited in what you can store in, or retrieve from, the stack.

The stack offers all the classical stack operations, except that the pure manipulation operations all return boolean true; you can therefore use them directly in a rule:

Rule stackOp()
{
    return sequence("xx", swap(3));
}
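
To make the "operations return true" idea concrete without relying on grappa itself, here is a minimal, self-contained sketch. SketchStack, SketchStackDemo and match() are hypothetical names invented for this illustration; they are not grappa's classes. The chain of && stands in for a rule sequence, where each step must succeed for the next one to run.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a typed value stack; NOT grappa's own class.
final class SketchStack<V> {
    private final Deque<V> values = new ArrayDeque<>();

    // Manipulation operations return true so that they can sit in a chain
    // of boolean steps, the way actions sit inside a rule sequence.
    boolean push(V value) {
        values.push(value);
        return true;
    }

    // Swaps the two topmost values.
    boolean swap() {
        V first = values.pop();
        V second = values.pop();
        values.push(first);
        values.push(second);
        return true;
    }

    // Retrieval operations return the value itself.
    V pop() {
        return values.pop();
    }
}

final class SketchStackDemo {
    // Stand-in for a text match: succeeds if the input starts with the prefix.
    static boolean match(String input, String prefix) {
        return input.startsWith(prefix);
    }

    static String run(String input) {
        // The type parameter limits what can be stored: only Strings here.
        SketchStack<String> stack = new SketchStack<>();
        boolean ok = stack.push("a")
            && stack.push("b")
            && match(input, "xx")   // if this fails, swap() never runs
            && stack.swap();
        return ok ? stack.pop() + stack.pop() : "no match";
    }

    public static void main(String[] args) {
        System.out.println(run("xxyz"));
        System.out.println(run("zz"));
    }
}
```

Because every manipulation returns true, it never interrupts the chain on its own; only a failed match does.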

In conclusion...

The parsing context is the "assembly of parser productions". The value stack happens to be useful sometimes, but ultimately its use is very limited.

Which is why there are other mechanisms. We will go from the lowest-level mechanisms up to the highest-level ones.

Vars

Vars are the second mechanism available to you. They are parameterized, and they are basically mutable references.

In order for them to be compatible with rule actions, the basic Var operations (setting a value, clearing it, etc.) all return a boolean.

But Vars themselves are limited: you can only store in them values matching the type parameter you declared.
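
As a rough illustration of what "a mutable reference whose operations return a boolean" means, here is a small self-contained sketch. SketchVar and SketchVarDemo are hypothetical names made up for this page; they only mimic the idea and are not grappa's Var class.

```java
// Hypothetical mutable reference; illustrates the idea only, NOT grappa's Var.
final class SketchVar<T> {
    private T value;

    // Returns true so that it can be used as a step in a boolean chain.
    boolean set(T newValue) {
        value = newValue;
        return true;
    }

    // Also returns true, for the same reason.
    boolean clear() {
        value = null;
        return true;
    }

    T get() {
        return value;
    }
}

final class SketchVarDemo {
    static int run(String digits) {
        // The declared type parameter is Integer: set("text") would not compile.
        SketchVar<Integer> total = new SketchVar<>();
        boolean ok = total.set(0)
            && !digits.isEmpty()    // stand-in for a text match
            && total.set(total.get() + Integer.parseInt(digits));
        return ok ? total.get() : -1;
    }

    public static void main(String[] args) {
        System.out.println(run("42"));
        System.out.println(run(""));
    }
}
```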

Which leads us to the following...

Event based parsing and value builders

And this is really the most efficient way to extract parser values. This is based on four concepts:

  • your rules are here to match text, and to ensure the correct flow of matches;
  • you have (generic) ValueBuilders swallowing matches and validating them on the go;
  • you post the result of these ValueBuilders on an event bus;
  • the event bus dispatches the built values.

To demonstrate the power of this concept, let us take a simple example. Say you have a file filled with geographic coordinates; you want to pull them all out into a Java object, for instance a List<Geo>.

What you do is the following:

  • your grammar validates that the file is well formed, but "only" well formed; that is, each line consists of two numbers, separated by a comma, and newlines indicate a new coordinate;
  • you create a ValueBuilder<Geo>; on a line, it will swallow the longitude and latitude -- and on each "swallow" it will validate correctness;
  • when a line is fully done, you post this ValueBuilder<Geo> to the bus, which will build the Geo object and send it to the listening class.

And you're done!
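
The steps above can be sketched in plain Java as follows. This is only an illustration of the flow, under several assumptions: Geo, GeoBuilder, SketchBus and GeoDemo are hypothetical names, the "latitude,longitude" order and the range checks are choices made for this example, and the line splitting stands in for the grammar. None of this is grappa's actual API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// All names here are hypothetical; this mirrors the idea, not grappa's API.
final class Geo {
    final double latitude;
    final double longitude;

    Geo(double latitude, double longitude) {
        this.latitude = latitude;
        this.longitude = longitude;
    }
}

// Swallows matched text and validates it on the go.
final class GeoBuilder {
    private Double latitude;
    private Double longitude;

    boolean setLatitude(String match) {
        latitude = checked(match, 90.0);
        return true;
    }

    boolean setLongitude(String match) {
        longitude = checked(match, 180.0);
        return true;
    }

    private static double checked(String match, double limit) {
        double value = Double.parseDouble(match);
        if (value < -limit || value > limit) {
            throw new IllegalArgumentException("out of range: " + match);
        }
        return value;
    }

    Geo build() {
        if (latitude == null || longitude == null) {
            throw new IllegalStateException("incomplete coordinate");
        }
        return new Geo(latitude, longitude);
    }

    boolean reset() {
        latitude = null;
        longitude = null;
        return true;
    }
}

// Builds the value from the posted builder and dispatches it to listeners.
final class SketchBus {
    private final List<Consumer<Geo>> listeners = new ArrayList<>();

    void register(Consumer<Geo> listener) {
        listeners.add(listener);
    }

    boolean post(GeoBuilder builder) {
        Geo geo = builder.build();
        listeners.forEach(listener -> listener.accept(geo));
        return builder.reset();
    }
}

final class GeoDemo {
    static List<Geo> parse(String input) {
        List<Geo> result = new ArrayList<>();
        SketchBus bus = new SketchBus();
        bus.register(result::add);            // the "listening class"
        GeoBuilder builder = new GeoBuilder();
        for (String line : input.split("\n")) {
            String[] parts = line.split(","); // stand-in for the grammar
            if (parts.length != 2) {
                throw new IllegalArgumentException("malformed line: " + line);
            }
            boolean posted = builder.setLatitude(parts[0].trim())
                && builder.setLongitude(parts[1].trim())
                && bus.post(builder);
            if (!posted) {
                throw new IllegalStateException("could not post: " + line);
            }
        }
        return result;
    }
}
```

Calling GeoDemo.parse("48.85,2.35\n-33.87,151.21") would return a list of two Geo objects, while a line such as "95,0" is rejected by the builder rather than by the "grammar".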

All you have to do, and this will always be a compromise, is decide where to draw the line between what your parser validates and what the ValueBuilder validates.

Note that given the power of parsers, your ValueBuilder may, in fact, do nothing at all -- especially if its arguments are string values. The frontier is ultimately up to you to decide.