From b5a94e7f0f97271f13fff6719f5525f1a87fac7b Mon Sep 17 00:00:00 2001
From: Christian Kreibich
Date: Fri, 26 Jul 2024 00:40:31 -0700
Subject: [PATCH] Add a chapter on script optimization

The goal here is to make regular users aware of the feature at this time,
but leave most of the tunables and flags out of the picture for the time
being.
---
 scripting/index.rst        |  1 +
 scripting/optimization.rst | 89 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 90 insertions(+)
 create mode 100644 scripting/optimization.rst

diff --git a/scripting/index.rst b/scripting/index.rst
index 32b74d9c6..26f500bad 100644
--- a/scripting/index.rst
+++ b/scripting/index.rst
@@ -9,4 +9,5 @@ Introduction to Scripting
    basics
    usage
    event-groups
+   optimization
    javascript
diff --git a/scripting/optimization.rst b/scripting/optimization.rst
new file mode 100644
index 000000000..41e94311a
--- /dev/null
+++ b/scripting/optimization.rst
@@ -0,0 +1,89 @@
+.. _zam:
+
+===================
+Script Optimization
+===================
+
+.. versionadded:: 7.0
+
+.. note::
+
+   ZAM has been available in Zeek for a number of releases, but as of Zeek 7
+   it has matured to a point where we encourage regular users to explore it.
+
+Introduction
+============
+
+The `Zeek Abstract Machine`, "ZAM", is an optional script optimization engine
+built into Zeek. Using ZAM changes the basic execution model for Zeek scripts in
+an effort to gain higher performance. Normally, Zeek parses scripts into
+abstract syntax trees that it then executes by recursively interpreting each
+node in a given tree. With ZAM's script optimization, Zeek first compiles the
+trees into a low-level form that it can then generally execute more efficiently.
+
+To enable this feature, include ``-O ZAM`` on the command line.
+
+How much faster will your scripts run? There's no simple answer to that.
+It depends heavily on several factors:
+
+* What proportion of the processing during execution is spent in the Zeek core's
+  event engine, rather than executing scripts. ZAM optimization doesn't help
+  with event engine execution.
+
+* What proportion of the script's processing is spent executing built-in
+  functions (BiFs), i.e., functions callable from the script layer but
+  implemented in native code. ZAM optimization improves execution for some
+  select, simple BiFs, like :zeek:id:`network_time`, but it doesn't help for
+  complex ones. It might well be that most of your script processing actually
+  occurs in the underpinnings of the :ref:`logging framework
+  `, for example, and thus you won't see much improvement.
+
+* Those two factors add up to gains very often on the order of only 10-15%,
+  rather than something a lot more dramatic.
+
+.. note::
+
+   At startup, ZAM takes a few seconds to generate the low-level code for the
+   loaded set of scripts, unless you're using Zeek's `bare mode` (via the
+   ``-b`` command-line option), which loads only a minimal set of scripts. Keep
+   this in mind when comparing Zeek runtimes, to ensure you're comparing only
+   actual script execution time.
+
+To factor out the code-generation phase, you can for example measure the time
+between :zeek:id:`zeek_init` and :zeek:id:`zeek_done` event handlers:
+
+.. code-block:: zeek
+   :caption: runtime.zeek
+
+   global t0: time;
+
+   event zeek_init()
+       {
+       t0 = current_time();
+       }
+
+   event zeek_done()
+       {
+       print current_time() - t0;
+       }
+
+Here's a quick example of ZAM's effect on Zeek's typical processing of a larger
+packet capture, from one of our testsuites:
+
+.. code-block:: sh
+
+   $ zcat 2009-M57-day11-18.trace.gz | zeek -r - runtime.zeek
+   14.0 secs 252.0 msecs 107.858658 usecs
+   $ zcat 2009-M57-day11-18.trace.gz | zeek -O ZAM -r - runtime.zeek
+   12.0 secs 345.0 msecs 857.990265 usecs
+
+A roughly 13% improvement in runtime.
+
+Other Optimization Features
+===========================
+
+You can tune various features of ZAM via additional options to ``-O``, see the
+output of ``zeek -O help`` for details. For example, you can study the script
+transformations ZAM applies, and use ZAM selectively in certain files (via
+``--optimize-files``) or functions (via ``--optimize-funcs``). Most users
+won't need to use these.