Beyond the top-level grammar rule, which has to be supplied to a parsing run, and an action class template, which can be supplied to a parsing run, a third customisation point within the PEGTL allows the user to provide a control class to a parsing run.
(Strictly speaking, the control class is a class template that takes a parsing rule as template argument; however, in many cases there will be no specialisations, so we drop the distinction here and pretend that it is simply a class.)
Functions from the control class are called in strategic places during a parsing run and can be used to customise internal behaviour of the PEGTL and/or as debug aids. More precisely, the control class has static member functions to
- trace which rules are attempted to match where, and whether they succeed or fail,
- customise which exceptions are thrown in case of errors,
- customise how an action's apply() or apply0() is called,
- customise how a rule's match() is called.
The normal control class template included with the PEGTL is used by default and shows which hook functions there are.
template< typename Rule >
struct normal
{
   static constexpr bool enable; // See Meta-Data-and-Visit.md
   template< typename ParseInput,
             typename... States >
   static void start( const ParseInput&, States&&... );
   template< typename ParseInput,
             typename... States >
   static void success( const ParseInput&, States&&... );
   template< typename ParseInput,
             typename... States >
   static void failure( const ParseInput&, States&&... );
   template< typename ParseInput,
             typename... States >
   static void raise( const ParseInput&, States&&... );
   template< typename Ambient,
             typename... States >
   static void raise_nested( const Ambient&, States&&... );
   template< template< typename... > class Action,
             typename Iterator,
             typename ParseInput,
             typename... States >
   static auto apply( const Iterator& begin, const ParseInput& in, States&&... st )
      -> decltype( ... );
   template< template< typename... > class Action,
             typename ParseInput,
             typename... States >
   static auto apply0( const ParseInput&, States&&... st )
      -> decltype( ... );
   template< apply_mode A,
             rewind_mode M,
             template< typename... > class Action,
             template< typename... > class Control,
             typename ParseInput,
             typename... States >
   static bool match( ParseInput& in, States&&... st );
};
The static member functions start(), success() and failure() can be used to debug a grammar by using them to provide insight into what exactly is going on during a parsing run, or to construct a parse tree, etc.
There is one more, optional hook function: unwind().
It is called when a rule throws an exception, e.g. on global error.
Its signature is identical to start()/success()/failure().
It is not included in the default control template normal, as the existence of an unwind() method requires an additional try/catch block.
This can increase the binary size, so one should only add an unwind() method if necessary.
Several other control classes utilize the unwind() method to track the execution even in the presence of global errors.
The static member function raise() is used to create a global error, and any replacement must again throw an exception (or abort the application).
The static member function raise_nested() is used to create a global error in cases where a previous error was caught and is to be included in the newly thrown exception as nested exception.
The static member functions apply() and apply0() can customise how actions with, and without, receiving the matched input are called, respectively.
Note that these functions should only exist or be visible when an appropriate apply() or apply0() exists in the action class template.
This can be achieved via SFINAE, e.g. with a trailing return type as shown above.
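The SFINAE technique mentioned here can be illustrated with a small standalone sketch. It does not use the PEGTL: toy_control, the three action structs, rank, dispatch and run are hypothetical names, and the signatures are simplified (a std::string stands in for the matched input, and the action is a plain type rather than a class template). The trailing return type removes each hook from overload resolution whenever the action lacks the corresponding function.

```cpp
#include <string>

// Hypothetical actions: one offers apply(), one offers apply0(), one neither.
struct with_apply
{
   static void apply( const std::string& matched, std::string& log )
   {
      log += "apply(" + matched + ")";
   }
};

struct with_apply0
{
   static void apply0( std::string& log )
   {
      log += "apply0()";
   }
};

struct without_any
{};

// Toy control: each hook only takes part in overload resolution when the
// expression inside decltype() is well-formed for the given Action.
struct toy_control
{
   template< typename Action, typename... States >
   static auto apply( const std::string& matched, States&&... st )
      -> decltype( Action::apply( matched, st... ) )
   {
      return Action::apply( matched, st... );
   }

   template< typename Action, typename... States >
   static auto apply0( States&&... st )
      -> decltype( Action::apply0( st... ) )
   {
      return Action::apply0( st... );
   }
};

// Overload ranking helper: rank< 2 > converts to rank< 1 > and rank< 0 >.
template< int N >
struct rank : rank< N - 1 >
{};

template<>
struct rank< 0 >
{};

// Tried first: only viable when toy_control::apply() is visible.
template< typename Action >
auto dispatch( rank< 2 >, const std::string& matched, std::string& log )
   -> decltype( toy_control::apply< Action >( matched, log ) )
{
   return toy_control::apply< Action >( matched, log );
}

// Tried second: only viable when toy_control::apply0() is visible.
template< typename Action >
auto dispatch( rank< 1 >, const std::string&, std::string& log )
   -> decltype( toy_control::apply0< Action >( log ) )
{
   return toy_control::apply0< Action >( log );
}

// Fallback when the action provides neither function.
template< typename Action >
void dispatch( rank< 0 >, const std::string&, std::string& log )
{
   log += "no action";
}

// Runs the dispatch for one action and returns what was recorded.
template< typename Action >
std::string run( const std::string& matched )
{
   std::string log;
   dispatch< Action >( rank< 2 >(), matched, log );
   return log;
}
```

In this sketch, run< with_apply >( "abc" ) goes through the apply() hook, run< with_apply0 >( "abc" ) through apply0(), and run< without_any >( "abc" ) reaches the fallback without a compile error.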
The static member function match() by default checks if there exists a suitable match() in the action class template for the current rule. If so, it is called, otherwise it calls the main tao::pegtl::match() function.
For debugging a grammar and tracing exactly what happens during a parsing run, the control class' start(), success() and failure() can be used.
In addition, apply() and apply0() can be used to see which actions are invoked.
Before attempting to match a rule R, the PEGTL calls C< R >::start() where C is the current control class template.
Depending on what happens during the attempt to match R, one of the other three functions might be called.
- If R succeeds, then C< R >::success() is called; compared to the call to C< R >::start(), the input will have consumed whatever the successful match of R consumed.
- If R finishes with a local failure, i.e. a return value of false from match(), then C< R >::failure() is called; a failed match must not consume input.
- If R finishes with a global failure, i.e. an exception being thrown from match(), then, if present, C< R >::unwind() is called.
- If R is wrapped in must< R >, a global failure is generated by calling C< R >::raise() to throw some exception as is expected by the PEGTL in this case.
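The order of these hook calls can be modelled with a small standalone sketch. It is not PEGTL code: recording_control, the three stand-in rules, toy_match and record are hypothetical names, the input is reduced to a position counter, and unwind() is assumed to be always present, whereas the PEGTL only calls it when the control provides it.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Toy model only: the hooks append their name to a list of events.
using events = std::vector< std::string >;

// Hypothetical control with the hooks discussed above.
struct recording_control
{
   static void start( events& ev ) { ev.push_back( "start" ); }
   static void success( events& ev ) { ev.push_back( "success" ); }
   static void failure( events& ev ) { ev.push_back( "failure" ); }
   static void unwind( events& ev ) { ev.push_back( "unwind" ); }
};

// Stand-in rules: a match, a local failure and a global failure.
struct good_rule
{
   static bool match( std::size_t& pos ) { pos += 3; return true; }
};

struct bad_rule
{
   static bool match( std::size_t& ) { return false; }
};

struct fatal_rule
{
   static bool match( std::size_t& ) { throw std::runtime_error( "global error" ); }
};

// Simplified driver: start() first, then exactly one of the other hooks.
template< typename Rule, typename Control >
bool toy_match( std::size_t& pos, events& ev )
{
   const std::size_t saved = pos;
   Control::start( ev );
   try {
      if( Rule::match( pos ) ) {
         Control::success( ev );
         return true;
      }
   }
   catch( ... ) {
      Control::unwind( ev );  // global failure: notify, then keep unwinding
      throw;
   }
   pos = saved;  // a failed match must not consume input
   Control::failure( ev );
   return false;
}

// Runs one rule and returns the recorded hook sequence.
template< typename Rule >
events record()
{
   events ev;
   std::size_t pos = 0;
   try {
      toy_match< Rule, recording_control >( pos, ev );
   }
   catch( const std::runtime_error& ) {
      ev.push_back( "caught" );
   }
   return ev;
}
```

The try/catch block inside toy_match is the extra cost that an unwind() hook implies, which is why the default control omits it.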
Additionally, if matching R was successful and actions are enabled:
- If C< R >::apply() exists, then C< R >::apply() is called with the matched input and the current state arguments.
- If C< R >::apply0() exists, then C< R >::apply0() is called with the current state arguments.
It is an error when both C< R >::apply() and C< R >::apply0() exist.
Note that the default C< R >::apply() is SFINAE-enabled if A< R >::apply() exists (where A is the current action class template) and calls the latter. A control class might modify state, etc.
In case of actions that return bool, i.e. actions where apply() or apply0() return bool, C< R >::success() is only called when both the rule and the action succeed.
If either produces a (local) failure then C< R >::failure() is called.
In all cases where an action is called, the success or failure hooks are invoked after the action returns.
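The interaction between bool-returning actions and the success/failure hooks can be sketched as follows. This is a standalone model, not PEGTL code: short_only, hook_log, logging_control, finish_match and outcome are hypothetical names, and the rule result is passed in as a plain bool instead of coming from a real match.

```cpp
#include <string>

// Hypothetical action that vetoes matches longer than three characters.
struct short_only
{
   static bool apply( const std::string& matched )
   {
      return matched.size() <= 3;
   }
};

// Records the order in which the hooks were called.
struct hook_log
{
   std::string order;
};

// Hypothetical control that logs each hook and forwards to the action.
struct logging_control
{
   static void success( hook_log& h ) { h.order += "success;"; }
   static void failure( hook_log& h ) { h.order += "failure;"; }

   template< typename Action >
   static bool apply( const std::string& matched, hook_log& h )
   {
      h.order += "apply;";
      return Action::apply( matched );
   }
};

// Simplified tail of a match: the action runs only after the rule matched,
// and success()/failure() is chosen after the action has returned.
template< typename Action, typename Control >
bool finish_match( const bool rule_matched, const std::string& matched, hook_log& h )
{
   bool result = rule_matched;
   if( result ) {
      result = Control::template apply< Action >( matched, h );
   }
   if( result ) {
      Control::success( h );
   }
   else {
      Control::failure( h );
   }
   return result;
}

// Returns the hook sequence for a given rule result and matched text.
inline std::string outcome( const bool rule_matched, const std::string& matched )
{
   hook_log h;
   finish_match< short_only, logging_control >( rule_matched, matched, h );
   return h.order;
}
```

With these definitions, a matched "abc" yields apply followed by success, a matched "abcdef" yields apply followed by failure because the action returns false, and a rule that did not match yields failure without any action call.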
The included <tao/pegtl/contrib/trace.hpp> gives a practical example that shows how the control class can be used to debug grammars.
The control hooks for exceptions, the raise() and raise_nested() static member functions, must both throw an exception.
For most parts of the PEGTL the exception class is irrelevant and any user-defined data type can be thrown by a user-defined control.
The try_catch_raise_nested and try_catch_return_false rules only catch exceptions of type tao::pegtl::parse_error_base (or derived)!
When other exception types need to be caught then other members of the try_catch_* family of rules need to be used.
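The requirement that raise() must throw, and that it may throw a user-defined type, can be illustrated with a standalone sketch. Nothing here is PEGTL API: my_error, toy_input, throwing_control, toy_must and failing_offset are hypothetical names, and toy_must only imitates the idea of turning a local failure into a global one.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical user-defined exception that remembers the input offset.
struct my_error : std::runtime_error
{
   std::size_t offset;

   my_error( const std::string& what, const std::size_t at )
      : std::runtime_error( what ), offset( at )
   {}
};

// Minimal stand-in for a parse input.
struct toy_input
{
   std::string data;
   std::size_t pos;
};

// Hypothetical control: raise() never returns normally.
struct throwing_control
{
   [[noreturn]] static void raise( const toy_input& in )
   {
      throw my_error( "required rule failed", in.pos );
   }
};

// must-like wrapper: a local failure becomes a global one via raise().
template< typename Control, typename Matcher >
bool toy_must( toy_input& in, Matcher matcher )
{
   if( !matcher( in ) ) {
      Control::raise( in );
   }
   return true;
}

// Returns the offset carried by the exception, or -1 if nothing was thrown.
inline long failing_offset( const std::string& text )
{
   toy_input in{ text, 0 };
   try {
      toy_must< throwing_control >( in, []( toy_input& i ) {
         if( ( i.pos < i.data.size() ) && ( i.data[ i.pos ] == 'a' ) ) {
            ++i.pos;
            return true;
         }
         return false;
      } );
   }
   catch( const my_error& e ) {
      return static_cast< long >( e.offset );
   }
   return -1;
}
```

Because my_error is unrelated to tao::pegtl::parse_error_base, an exception of this kind would be one of the "other exception types" that the two rules named above do not catch.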
The control's match() is the first, outer-most function in the call-chain that eventually calls the rule's match().
For advanced use cases, it is possible to create a custom control class with a custom match() that can change "everything" before calling the rule's match().
Similarly, the control's apply() and apply0() can customise action invocation; in particular apply() can change how the matched portion of the input is passed to the action.
Just like the action class template, a custom control class template can be used (or changed) by either
- supplying it as explicit template argument to the parse() functions, or
- setting it as control class with the tao::pegtl::control combinator, or
- setting it as control class with the change_control action.
The latter requires the use of a custom action.
Deriving the specialisation of the custom action for my_rule from tao::pegtl::change_control< my_control > will switch the current control to my_control before attempting to match my_rule.
This document is part of the PEGTL.
Copyright (c) 2014-2023 Dr. Colin Hirsch and Daniel Frey
Distributed under the Boost Software License, Version 1.0
See accompanying file LICENSE_1_0.txt or copy at https://www.boost.org/LICENSE_1_0.txt