Jsleri is an easy-to-use language parser for JavaScript.
Using npm:
$ npm i jsleri
In your project:
import * as jsleri from 'jsleri';
// Exposes:
// - jsleri.version
// - jsleri.noop
// - jsleri.Keyword
// - jsleri.Regex
// - jsleri.Token
// - jsleri.Tokens
// - jsleri.Sequence
// - jsleri.Choice
// - jsleri.Repeat
// - jsleri.List
// - jsleri.Optional
// - jsleri.Ref
// - jsleri.Prio
// - jsleri.THIS
// - jsleri.Grammar
// - jsleri.EOS
Or... download the latest release from here and load the file inside your project. For example:
<!-- Add this line to the <head> section to expose window.jsleri -->
<script src="jsleri-1.1.14.min.js"></script>
- pyleri: Python parser (can export grammar to pyleri, cleri and jsleri)
- libcleri: C parser
- goleri: Go parser
- jleri: Java parser
import { Regex, Keyword, Sequence, Grammar } from 'jsleri';
// create your grammar
class MyGrammar extends Grammar {
static START = Sequence(
Keyword('hi'),
Regex('(?:"(?:[^"]*)")+')
);
}
// create an instance of your grammar
const myGrammar = new MyGrammar();
// do something with the grammar
alert(myGrammar.parse('hi "Iris"').isValid); // alerts true
alert(myGrammar.parse('hello "Iris"').isValid); // alerts false
When writing a grammar you should subclass Grammar. A Grammar expects at least a START property so the parser knows where to start parsing. Grammar has a parse method: parse().
syntax:
myGrammar.parse(string)
The parse() method returns a result object which has the following properties that are further explained in Result:
expecting
isValid
pos
tree
The result of the parse() method contains 4 properties that will be explained next.
isValid returns a boolean value: true when the given string is valid according to the given grammar, false when not valid.
Let us take the example from Quick usage.
alert(myGrammar.parse('hello "Iris"').isValid); // alerts false
pos returns the position where the parser had to stop. (When isValid is true, this value will be equal to the length of the given string with trailing whitespace removed.)
Let us take the example from Quick usage.
alert(myGrammar.parse('hello "Iris"').pos); // alerts 0
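The pos value can be turned into a readable error message. The following is a minimal sketch, assuming only a result-like object with the isValid and pos properties described above. formatParseError is a hypothetical helper, not part of jsleri, and the result object is written by hand to mirror the values from the example above rather than produced by a real parse() call.

```javascript
// Hypothetical helper (not part of jsleri): point a caret at result.pos.
function formatParseError(source, result) {
    if (result.isValid) {
        return null; // nothing to report
    }
    const caret = ' '.repeat(result.pos) + '^';
    return `Parse error at position ${result.pos}:\n${source}\n${caret}`;
}

// Hand-written stand-in for myGrammar.parse('hello "Iris"'),
// using the isValid and pos values shown in the examples above.
const fakeResult = { isValid: false, pos: 0 };

console.log(formatParseError('hello "Iris"', fakeResult));
```

In a real project the second argument would be the object returned by parse().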
tree contains the parse tree. Even when isValid is false the parse tree is returned, but it will only contain results as far as parsing has succeeded. The tree is the root node, which can include several children nodes. The structure is further clarified in the example found in the "example" folder, which shows a way of visualizing the parse tree.
The nodes in the example contain 5 properties:
- start: returns the start of the node object.
- end: returns the end of the node object.
- element: returns the type of Element (e.g. Repeat, Sequence, Keyword, etc.).
- string: returns the string that is parsed.
- children: can return a node object containing deeper layered nodes, provided that there are any.
In our example the root node has an element type Repeat(), starts at 0 and ends at 24, and it has two children. These children are node objects that both have an element type Sequence, start at 0 and 12 respectively, and so on.
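The node structure described above lends itself to a simple recursive walk. The sketch below assumes nodes expose exactly the five properties listed (start, end, element, string, children). It uses a hand-built tree with made-up strings in place of a real parse result, and element is a plain string here purely for illustration. dumpTree is a hypothetical helper, not part of jsleri.

```javascript
// Hypothetical helper: returns one indented line per node, depth-first.
function dumpTree(node, depth = 0, lines = []) {
    const indent = '  '.repeat(depth);
    lines.push(`${indent}${node.element} ${node.start}..${node.end}: ${node.string}`);
    for (const child of node.children || []) {
        dumpTree(child, depth + 1, lines);
    }
    return lines;
}

// Hand-built stand-in tree (assumption): a Repeat root from 0 to 24
// with two Sequence children starting at 0 and 12.
const root = {
    element: 'Repeat', start: 0, end: 24, string: 'hi "Pebble" hi "Cameron"',
    children: [
        {
            element: 'Sequence', start: 0, end: 11, string: 'hi "Pebble"',
            children: [
                { element: 'Keyword', start: 0, end: 2, string: 'hi', children: [] },
                { element: 'Regex', start: 3, end: 11, string: '"Pebble"', children: [] }
            ]
        },
        {
            element: 'Sequence', start: 12, end: 24, string: 'hi "Cameron"',
            children: [
                { element: 'Keyword', start: 12, end: 14, string: 'hi', children: [] },
                { element: 'Regex', start: 15, end: 24, string: '"Cameron"', children: [] }
            ]
        }
    ]
};

console.log(dumpTree(root).join('\n'));
```

With a real result, the tree property of the object returned by parse() would take the place of the hand-built root.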
expecting returns an array containing elements which jsleri expects at pos. Even if isValid is true there might be elements in this array, for example when an Optional() element could be added to the string. Expecting is useful if you want to implement things like auto-completion, syntax error handling, auto-syntax-correction etc. In the "example" folder you will find an example. Open the HTML file in a browser. You will see that expecting is used to help you create a valid query string for SiriDB. SiriDB is an open source time series database with its own grammar class. Start writing something, click one of the options that appear and see what happens.
Jsleri has several Elements which can be used to create a grammar.
Keyword(keyword, ignCase)
The parser needs to match the keyword, which is just a string. When matching keywords we need to tell the parser what characters are allowed in keywords. By default Jsleri uses ^\w+, which is equivalent to ^[A-Za-z0-9_]+. Keyword() accepts one more argument, ignCase, to tell the parser whether to match case-insensitively.
Example:
const grammar = new Grammar(
Keyword('tic-tac-toe', true), // case insensitive
'[A-Za-z-]+' // alternative keyword matching
);
console.log(grammar.parse('Tic-Tac-Toe').isValid); // true
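To see why the alternative keyword expression is needed in the example above, the two patterns can be compared using plain JavaScript regular expressions. This sketch only uses the built-in RegExp and does not call jsleri; it illustrates the patterns themselves, not how jsleri applies them internally. The alternative pattern is anchored with ^ here as an assumption, for a like-for-like comparison with the default.

```javascript
const defaultKeywordRe = /^\w+/;      // the default pattern mentioned above
const alternativeRe = /^[A-Za-z-]+/;  // the alternative from the example, anchored

const input = 'tic-tac-toe';

// \w does not include '-', so the default pattern stops at the first hyphen.
const defaultMatch = input.match(defaultKeywordRe)[0];
// The alternative pattern allows hyphens and covers the whole keyword.
const alternativeMatch = input.match(alternativeRe)[0];

console.log(defaultMatch);     // tic
console.log(alternativeMatch); // tic-tac-toe
```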
Regex(pattern, ignCase)
The parser compiles a regular expression. Argument ignCase is set to false by default but can be set to true if you want the regular expression to be case insensitive. Note that ignore case is the only re flag from pyleri which will be compiled and accepted by jsleri.
See the Quick Usage example for how to use Regex.
Token(token)
A token can be one or more characters and is usually used to match operators like +, -, // and so on. When we pass a string where jsleri expects an element, it will automatically be converted to a Token() object.
Example:
// We could just write '-' instead of Token('-')
// because any string will be converted to Token()
const grammar = new Grammar(List(Keyword('ni'), Token('-')));
console.log(grammar.parse('ni-ni-ni-ni-ni').isValid); // true
Tokens(tokens)
Can be used to register multiple tokens at once. The tokens argument should be a string with tokens separated by spaces. If given tokens are different in size the parser will try to match the longest tokens first.
Example:
const grammar = new Grammar(List(Keyword('ni'), Tokens('+ - !=')));
console.log(grammar.parse('ni + ni != ni - ni').isValid); // true
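The longest-first rule matters when one token is a prefix of another, such as ! and !=. The following plain-JavaScript sketch illustrates the idea only; matchToken is a hypothetical helper and is not necessarily how jsleri implements token matching.

```javascript
// Hypothetical helper: split a space-separated token string and try the
// longest tokens first at the given position in the input.
function matchToken(tokens, input, pos = 0) {
    const candidates = tokens
        .split(' ')
        .filter((t) => t.length > 0)
        .sort((a, b) => b.length - a.length); // longest first
    for (const token of candidates) {
        if (input.startsWith(token, pos)) {
            return token;
        }
    }
    return null; // no token matches at this position
}

console.log(matchToken('! != + -', '!= ni')); // !=
console.log(matchToken('! != + -', '! ni'));  // !
console.log(matchToken('! != + -', 'ni'));    // null
```

Without the sort, the shorter token ! would match first and leave a stray = behind.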
Sequence(element, element, ...)
The parser needs to match each element in a sequence.
Example:
const grammar = new Grammar(Sequence(
Keyword('Tic'),
Keyword('Tac'),
Keyword('Toe')
));
console.log(grammar.parse('Tic Tac Toe').isValid); // true
Repeat(element, mi, ma)
The parser needs at least mi elements and at most ma elements. When ma is set to undefined we allow an unlimited number of elements. mi can be any integer value equal to or higher than 0 but not larger than ma. The default value for mi is 0 and undefined for ma.
Example:
const grammar = new Grammar(Repeat(Keyword('ni')));
console.log(grammar.parse('ni ni ni ni').isValid); // true
Avoid binding a name to the same element twice; Repeat(element, 1, 1) is a common solution for binding the element a second (or further) time.
For example consider the following:
const r_name = Regex('(?:"(?:[^"]*)")+');
// Do NOT do this
// const r_address = r_name; // WRONG
// Instead use Repeat
const r_address = Repeat(r_name, 1, 1); // Correct
List(element, delimiter, mi, ma, opt)
List is like Repeat but with a delimiter. A comma is used as the default delimiter but any element is allowed. When a string is used as delimiter it will be converted to a Token element. mi and ma work exactly like with Repeat. opt can be set to true to allow the list to end with a delimiter. By default this is set to false, which means the list has to end with an element.
Example:
const grammar = new Grammar(List(Keyword('ni')));
console.log(grammar.parse('ni, ni, ni, ni, ni').isValid); // true
Optional(element)
The parser looks for an optional element. It is like using Repeat(element, 0, 1), but we encourage using Optional since it is more readable (and slightly faster).
Example:
const grammar = new Grammar(Sequence(
Keyword('hi'),
Optional(Regex('(?:"(?:[^"]*)")+'))
));
console.log(grammar.parse('hi "Iris"').isValid); // true
console.log(grammar.parse('hi').isValid); // true
Ref(Constructor)
The grammar can make a forward reference to make recursion possible. In the example below we create a forward reference to START but note that a reference to any element can be made.
Warning: A reference is not protected against testing the same position in a string. This could potentially lead to an infinite loop. For example:
let r = Ref(Optional); r.set(Optional(r)); // DON'T DO THIS
Use Prio if such a recursive construction is required.
Example:
// make a forward reference START to a Sequence.
let START = Ref(Sequence);
// we can now use START
const ni_item = Choice(Keyword('ni'), START);
// here we actually set START
START.set(Sequence('[', List(ni_item), ']'));
// create and test the grammar
const grammar = new Grammar(START);
console.log(grammar.parse('[ni, [ni, [], [ni, ni]]]').isValid); // true
Prio(element, element, ...)
Choose the first match from the prio elements and allow THIS for recursive operations. With THIS we point to the Prio element. The example below shows how Prio and THIS can be used.
Note: Use a Ref when possible. A Prio element is required when the same position in a string is potentially checked more than once.
Example:
const grammar = new Grammar(Prio(
Keyword('ni'),
Sequence('(', THIS, ')'),
Sequence(THIS, Keyword('or'), THIS),
Sequence(THIS, Keyword('and'), THIS)
));
console.log(grammar.parse('(ni or ni) and (ni or ni)').isValid); // true