Documentation for special table lookups (subnet or pattern index types) #221

Merged (2 commits) on Nov 21, 2023
65 changes: 65 additions & 0 deletions script-reference/types.rst
@@ -1113,6 +1113,66 @@ Membership can be tested with ``in`` or ``!in``:
See the :zeek:keyword:`for` statement for information on how to iterate over
the elements in a table.

.. _table-special-lookups:

Special lookups
^^^^^^^^^^^^^^^

Zeek supports two forms of special table lookups. The first applies to tables
with an index type of :zeek:type:`subnet`. When indexed with an
:zeek:type:`addr` value, these tables produce the yield associated with
the narrowest (longest-prefix) subnet that contains the address. For example:

.. code-block:: zeek

    global st: table[subnet] of count;
    st[1.2.3.4/24] = 5;
    st[1.2.3.4/29] = 9;
    print st[1.2.3.4], st[1.2.3.251];

will print ``9, 5``. Attempting to look up an address that doesn't match
any of the subnet indices results in a run-time error.
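
To avoid that error, a lookup can be guarded with a membership test first.
The following is a minimal sketch, assuming that ``in`` applied to an
:zeek:type:`addr` and a ``table[subnet]`` uses the same containment matching
as the lookup itself; the address ``10.0.0.1`` is an arbitrary illustration:

.. code-block:: zeek

    global st: table[subnet] of count;

    event zeek_init()
        {
        st[1.2.3.4/24] = 5;

        # Arbitrary address that none of the subnets in st contains.
        local a = 10.0.0.1;

        # Only index the table once we know some subnet covers the address.
        if ( a in st )
            print st[a];
        else
            print fmt("%s is not covered by any subnet in st", a);
        }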

.. versionadded:: 6.2

In addition, :zeek:type:`string` lookups for tables that have an index type of
:zeek:type:`pattern` return a (possibly empty)
:zeek:type:`vector` containing the values corresponding to each of the
patterns matching the given string. The order of entries in the resulting
vector is non-deterministic. For example:

.. code-block:: zeek

    global pt: table[pattern] of count;
    pt[/foo/] = 1;
    pt[/bar/] = 2;
    pt[/(foo|bletch)/] = 3;
    print pt["foo"];

will print either ``[1, 3]`` or ``[3, 1]``.
Indexing with a string that matches only one pattern returns a
one-element :zeek:type:`vector`, and indexing with a string that no
pattern matches returns an empty :zeek:type:`vector`.

Note that these pattern matches are all *exact*: the pattern must match
the entire string. To match a pattern *anywhere* in the string, wrap it
with the usual regular expression operators, as in ``/.*foo.*/``.
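
The difference between exact and embedded matching can be sketched as
follows. This is an illustrative example rather than output captured from a
Zeek run; the table name ``exact_pt`` and the lookup strings are hypothetical:

.. code-block:: zeek

    global exact_pt: table[pattern] of count;

    event zeek_init()
        {
        exact_pt[/foo/] = 1;      # matches only the exact string "foo"
        exact_pt[/.*foo.*/] = 2;  # matches any string containing "foo"

        # "foo" matches both patterns, so the result should hold two
        # values (in no particular order).
        print |exact_pt["foo"]|;

        # "xfoox" is not matched in its entirety by /foo/, so only the
        # value for /.*foo.*/ should be returned.
        print exact_pt["xfoox"];

        # No pattern matches "bar", so the resulting vector should be empty.
        print |exact_pt["bar"]|;
        }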

.. note::

   Internally, the individual patterns are matched in parallel using a lazily
   constructed deterministic finite automaton (DFA). Depending on the nature
   of the patterns used within the table *and* the input data used for
   lookups, this may result in difficult-to-predict memory consumption over
   time.

   Users are advised to test scripts using this feature with realistic and
   adversarial input data, with a focus on memory growth. The DFA's state can
   be reset by removing or adding a single pattern. For observability, use
   the function :zeek:see:`table_pattern_matcher_stats`, which returns a
   :zeek:see:`MatcherStats` record with details about the DFA's state.
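
One way to observe the matcher and to reset its state is sketched below.
The sketch assumes that :zeek:see:`table_pattern_matcher_stats` accepts the
table itself as its only argument, and it prints the whole record instead of
relying on particular field names. The placeholder pattern is hypothetical
and exists only to trigger the reset described in the note above:

.. code-block:: zeek

    global pt: table[pattern] of count;

    event zeek_init()
        {
        pt[/foo/] = 1;
        pt[/(foo|bletch)/] = 3;

        # Perform a few lookups so the lazily built DFA has some state.
        print |pt["foo"]|;
        print |pt["bletch"]|;

        # Inspect the matcher after the lookups.
        print table_pattern_matcher_stats(pt);

        # Add and remove a placeholder pattern; per the note above, a
        # single addition or removal resets the DFA's state.
        pt[/placeholder-never-matched/] = 0;
        delete pt[/placeholder-never-matched/];

        # Inspect the matcher again to compare against the earlier output.
        print table_pattern_matcher_stats(pt);
        }

In a long-running deployment, the same call could be made periodically and
its output logged to watch for memory growth.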


Additional operations
^^^^^^^^^^^^^^^^^^^^^
@@ -1250,6 +1310,11 @@ identifier between vertical pipe characters:
|s|


The :ref:`table's special lookups <table-special-lookups>` extend to the
set ``in`` operator: using ``in`` with an ``addr`` and a ``set[subnet]``,
or with a ``string`` and a ``set[pattern]``, yields ``T`` if any subnet in
the set contains the address or any pattern in the set matches the string.
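
A short sketch of both membership tests follows. The set names and values
are hypothetical, and the comments describe the results expected from the
rules above, including the exact-match behavior of patterns:

.. code-block:: zeek

    event zeek_init()
        {
        local nets: set[subnet] = set(10.0.0.0/8, 192.168.0.0/16);
        local pats: set[pattern] = set(/foo/, /.*bar.*/);

        # 10.1.2.3 lies inside 10.0.0.0/8, so this should yield T.
        print 10.1.2.3 in nets;

        # 8.8.8.8 lies in neither subnet, so this should yield F.
        print 8.8.8.8 in nets;

        # "foobar" is matched in its entirety by /.*bar.*/, so this should
        # yield T.
        print "foobar" in pats;

        # "foo!" is matched in its entirety by neither pattern, so this
        # should yield F.
        print "foo!" in pats;
        }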

.. zeek:native-type:: vector

vector