Update "Custom Brimcap Config" wiki article #340

Merged · 3 commits · Apr 16, 2024
Changes from 1 commit
113 changes: 43 additions & 70 deletions docs/Custom-Brimcap-Config.md
@@ -28,9 +28,8 @@ choosing. This article includes examples of both configurations along with

The goal in our first example customization will be to run Zui with the latest
GA binary releases of [Zeek](https://github.com/zeek/zeek/wiki/Binary-Packages)
-and [Suricata](https://suricata.readthedocs.io/en/latest/install.html#install-binary-packages),
-as these are newer than the versions that currently ship with Brimcap. We'll
-use Linux Ubuntu 20.04 as our OS platform. On such a host, the following
+and [Suricata](https://suricata.readthedocs.io/en/latest/install.html#install-binary-packages).
+We'll use Linux Ubuntu 20.04 as our OS platform. On such a host, the following
> **Contributor Author (philrz) commented:** Strictly speaking, we can't say that GA releases of both Zeek and Suricata are newer than the ones with Brimcap, since we're now up-to-date on Zeek. Therefore I dropped that part of the text. *(philrz marked this conversation as resolved.)*
commands install these from common repositories.

(install commands collapsed in the diff view)

@@ -53,12 +52,17 @@ additional customizations.

1. The Brimcap-bundled Zeek includes the additional packages
[geoip-conn](https://github.com/brimdata/geoip-conn),
-[zeek-community-id](https://github.com/corelight/zeek-community-id),
[HASSH](https://github.com/salesforce/hassh),
and [JA3](https://github.com/salesforce/ja3). These would typically be
installed via [Zeek Package Manager](https://docs.zeek.org/projects/package-manager/en/stable/quickstart.html).

-2. The Brimcap-bundled Suricata includes a
+2. Other changes are made to the default configuration of the Brimcap-bundled
+Zeek, such as enabling
+[Community ID Flow Hashing](https://docs.zeek.org/en/master/customizations.html#community-id).
+See the [build-zeek release automation](https://github.com/brimdata/build-zeek/blob/main/.github/workflows/release.yml)
+for details on how this and other customizations are handled.
> **Contributor Author (philrz) commented:** We used to use a Zeek package for Community ID, but now it's included out-of-the-box with Zeek, so I've been taking advantage of that in the Zeek builds that come out of the new build-zeek repo. This gave me a nice excuse to link to the build-zeek repo, since it's a better starting place than what we had before for users that want to try their hand at making their own custom Zeek builds, including on Windows.
+3. The Brimcap-bundled Suricata includes a
[YAML configuration](https://github.com/brimdata/build-suricata/blob/master/brim-conf.yaml)
that (among other things) enables the `community_id` field, which is essential
for joining to the `community_id` field in Zeek events to give context
@@ -67,7 +71,7 @@ configuration by setting
[`community-id: yes`](https://github.com/brimdata/build-suricata/blob/853fab6d7c21325f57e113645004b1107b78d840/brim-conf.yaml#L51-L52)
for the `eve-log` output.
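
For orientation, the relevant knob sits under Suricata's `eve-log` output. The fragment below is an illustrative sketch, not an excerpt from `brim-conf.yaml`; keys other than `community-id` are abbreviated:

```
outputs:
  - eve-log:
      enabled: yes
      filetype: regular
      filename: eve.json
      community-id: yes   # emits the community_id field on each EVE record
```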

-3. To ensure [rules](https://suricata.readthedocs.io/en/latest/rules/)
+4. To ensure [rules](https://suricata.readthedocs.io/en/latest/rules/)
are kept current, the Zui app invokes the bundled "Suricata Updater" once
each time it is launched. However, in a custom configuration, no attempt is made
to trigger updates on your behalf. You may choose to periodically run your
@@ -102,11 +106,11 @@ logs and open flows from the pcap via the **Packets** button.
The same combination of `brimcap` and `zed` commands can be used to
incrementally add more logs to the same pool and index for additional pcaps.

-The setting in the Zui **Settings** for the **Brimcap YAML Config File**
+The Zui **Settings** for the **Brimcap YAML Config File**
can also be pointed at the path to this configuration file, which will cause it
to be invoked when you open or drag pcap files into Zui.

-![Zui YAML Config File Preference](media/Zui-Pref-YAML-Config-File.png)
+![Zui YAML Config File Setting](media/Zui-Settings-YAML-Config-File.png)
> **Contributor Author (philrz) commented:** The Zui menu used to be called "Preferences" on some OSes and "Settings" on others, but thankfully we've standardized on "Settings" across the board now.
In examining the example Brimcap YAML, we see at the top that we've defined two
`analyzers`.
@@ -137,12 +141,12 @@ performed on them.
In our example configuration, the first analyzer invoked by Brimcap is a wrapper
script as referenced in the YAML. In addition to reading from its standard input, it also
tells Zeek to ignore checksums (since these are often set incorrectly on pcaps)
-and disables a couple of the excess log outputs.
+and disables a few of the excess log outputs.

```
$ cat zeek-wrapper.sh
#!/bin/bash
-exec /opt/zeek/bin/zeek -C -r - --exec "event zeek_init() { Log::disable_stream(PacketFilter::LOG); Log::disable_stream(LoadedScripts::LOG); }" local
+exec /opt/zeek/bin/zeek -C -r - --exec "event zeek_init() { Log::disable_stream(PacketFilter::LOG); Log::disable_stream(LoadedScripts::LOG); Log::disable_stream(Telemetry::LOG); }" local
```

> **Contributor Author (philrz) commented:** The newer Zeek releases have these telemetry logs that are suited primarily to gathering perf info about running deployments (e.g., live capture environments), but they still get generated when processing pcaps. Their volume dwarfs the amount of actual analyzed events when small pcaps are processed, and it seems doubtful their contents would be essential viewing for the pcap use case. I've been excluding them via the Zeek Runners that are bundled with the builds from the new build-zeek repo, so I do the same here.

> **Note:** If you intend to point to your custom Brimcap YAML config from
@@ -180,10 +184,10 @@ analyzers:
```
...
```
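
For orientation, the overall shape of such a two-analyzer config is roughly as follows. This is a sketch, not a copy of the article's file: the Suricata wrapper name and glob pattern are illustrative assumptions.

```
analyzers:
  - cmd: /usr/local/bin/zeek-wrapper.sh
    name: zeek
  - cmd: /usr/local/bin/suricata-wrapper.sh   # hypothetical wrapper name
    name: suricata
    globs: ["eve.json"]                       # only shape the EVE output
    shaper: |
      ...
```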

-What follows below the `globs:` setting is a Zed shaper script. Whereas the
+What follows below the `globs:` setting is a Zed [shaper](https://zed.brimdata.io/docs/language/shaping) script. Whereas the
Zeek TSV logs contain Zed-compatible rich data types (timestamps, IP
addresses, etc.), since Suricata's EVE logs are NDJSON, here we use this shaper
-to assign better data types as the NDJSON is being converted for storage
+to assign richer data types as the NDJSON is being converted for storage
into the Zed lake. Out-of-the-box, Brimcap automatically applies this same
[shaper script](https://github.com/brimdata/brimcap/blob/main/suricata.zed)
on the EVE output generated from its bundled Suricata.
@@ -242,8 +246,7 @@ further modify it to suit your needs.
```
where event_type=="alert" | yield shape(alert) | rename ts := timestamp
```

-A full description of all that's possible with
-[shapers](https://zed.brimdata.io/docs/language/shaping) is beyond
+A full description of all that's possible with shapers is beyond
the scope of this article. However, this script is quite simple and can
be described in brief.

@@ -258,14 +261,14 @@ alerts. If you want to let through more Suricata data besides just alerts, you
could remove this part of the pipeline. If so, you'll likely want to explore
the additional data and create shapers to apply proper data types to them,
since this will be a prerequisite for doing certain Zed queries with the data
-(e.g., a successful CIDR match requires IP addresses to be stored as `ip` type,
+(e.g., a successful [CIDR match](https://zed.brimdata.io/docs/language/functions/cidr_match) requires IP addresses to be stored as `ip` type,
not the `string` type in which they'd appear in unshaped NDJSON).

3. The `yield shape(alert)` applies the shape of the `alert` type to each
input record. With what's shown here, additional fields that appear beyond
those specified in the shaper (e.g. as the result of new Suricata features or
your own customizations) will still be let through this pipeline and stored in
-the Zed lake. If this is undesirable, add `| yield crop(alert)` downstream
+the Zed lake with an inferred data type. If this is undesirable, add `| yield crop(alert)` downstream
of the first `yield`, which will trim these additional fields.
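
With that change, the tail of the shaper pipeline shown earlier would read (a sketch):

```
where event_type=="alert" | yield shape(alert) | yield crop(alert) | rename ts := timestamp
```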

4. The `rename ts := timestamp` changes the name of Suricata's `timestamp`
@@ -306,12 +309,12 @@ export LD_LIBRARY_PATH="/usr/local/lib"

As we did with Zeek and Suricata, we create a [wrapper script](https://github.com/brimdata/brimcap/blob/main/examples/nfdump-wrapper.sh) to act as our
Brimcap analyzer. It works in two phases, first creating binary NetFlow records
-and then converting them to CSV. `nfpcapd` only accepts a true pcap file input
+and then converting them to NDJSON. `nfpcapd` only accepts a true pcap file input
> **Contributor Author (philrz) commented:** When I ran the latest nfdump today for the first time in a while, it seems they recently mucked up their CSV support (they now output one of their fields onto a separate line, which made Zed's CSV reader choke), but the good news is that they added an NDJSON output format that I'm more than happy to recommend instead. I'd only been using nfdump's CSV output because their JSON output was always a giant array in the past, which meant bumping into brimdata/super#3865. Now that they can output NDJSON (which they call json-log since it's more compatible with log-centric tools like Splunk and Logstash) we can take advantage of that.
(not a device like `/dev/stdin`), so we first store the incoming pcap in a
temporary file.

```
$ cat nfdump-wrapper.sh
#!/bin/bash
export LD_LIBRARY_PATH="/usr/local/lib"
TMPFILE=$(mktemp)
# … (lines collapsed in the diff view) …
cat - > "$TMPFILE"
rm "$TMPFILE"
for file in nfcapd.*
do
-/usr/local/bin/nfdump -r $file -o csv | head -n -3 > ${file}.csv
+/usr/local/bin/nfdump -r $file -o json-log > ${file}.ndjson
done
```

This script is called from our Brimcap config YAML, which includes a `globs:`
-setting to apply a Zed shaper to only the CSV files that were output from
+setting to apply a Zed shaper to only the NDJSON files that were output from
nfdump.

```
$ cat nfdump.yml
analyzers:
- cmd: /usr/local/bin/nfdump-wrapper.sh
name: nfdump
-    globs: ["*.csv"]
+    globs: ["*.ndjson"]
shaper: |
type netflow = {
-ts: time,
-te: time,
-td: duration,
-sa: ip,
-da: ip,
-sp: uint16,
-dp: uint16,
-pr: string,
-flg: string,
-fwd: bytes,
-stos: bytes,
-ipkt: uint64,
-ibyt: uint64,
-opkt: uint64,
-obyt: uint64,
-in: uint64,
-out: uint64,
-sas: uint64,
-das: uint64,
-smk: uint8,
-dmk: uint8,
-dtos: bytes,
-dir: uint8,
-nh: ip,
-nhb: ip,
-svln: uint16,
-dvln: uint16,
-ismc: string,
-odmc: string,
-idmc: string,
-osmc: string,
-mpls1: string,
-mpls2: string,
-mpls3: string,
-mpls4: string,
-mpls5: string,
-mpls6: string,
-mpls7: string,
-mpls8: string,
-mpls9: string,
-mpls10: string,
-cl: float64,
-sl: float64,
-al: float64,
-ra: ip,
-eng: string,
-exid: bytes,
-tr: time
+type: string,
+export_sysid: int64,
+first: time,
+last: time,
+received: time,
+in_packets: int64,
+in_bytes: int64,
+proto: int64,
+tcp_flags: string,
+src_port: uint16,
+dst_port: uint16,
+fwd_status: int64,
+src_tos: int64,
+src4_addr: ip,
+dst4_addr: ip,
+src4_geo: string,
+dst4_geo: string,
+sampled: int64
}
yield shape(netflow)
```
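
With this file saved as `nfdump.yml`, an end-to-end run could look something like the sketch below. The pool name `netflow` and the pcap filename are our own illustrative assumptions, and flag spellings may differ across Brimcap/Zed versions, so check the current docs before relying on them:

```
zed create netflow
brimcap analyze -config nfdump.yml capture.pcap | zed load -use netflow -
```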
Binary file modified docs/media/Custom-Zeek-Suricata-Pool.png
Binary file modified docs/media/NetFlow-Pool.png
Binary file removed docs/media/Zui-Pref-YAML-Config-File.png
Binary file added docs/media/Zui-Settings-YAML-Config-File.png
2 changes: 1 addition & 1 deletion examples/nfdump-wrapper.sh
@@ -6,5 +6,5 @@ cat - > "$TMPFILE"
rm "$TMPFILE"
for file in nfcapd.*
do
-/usr/local/bin/nfdump -r $file -o csv | head -n -3 > ${file}.csv
+/usr/local/bin/nfdump -r $file -o json-log > ${file}.ndjson
done
68 changes: 19 additions & 49 deletions examples/nfdump.yml
@@ -1,56 +1,26 @@
analyzers:
- cmd: /usr/local/bin/nfdump-wrapper.sh
name: nfdump
-    globs: ["*.csv"]
+    globs: ["*.ndjson"]
shaper: |
type netflow = {
-ts: time,
-te: time,
-td: duration,
-sa: ip,
-da: ip,
-sp: uint16,
-dp: uint16,
-pr: string,
-flg: string,
-fwd: bytes,
-stos: bytes,
-ipkt: uint64,
-ibyt: uint64,
-opkt: uint64,
-obyt: uint64,
-in: uint64,
-out: uint64,
-sas: uint64,
-das: uint64,
-smk: uint8,
-dmk: uint8,
-dtos: bytes,
-dir: uint8,
-nh: ip,
-nhb: ip,
-svln: uint16,
-dvln: uint16,
-ismc: string,
-odmc: string,
-idmc: string,
-osmc: string,
-mpls1: string,
-mpls2: string,
-mpls3: string,
-mpls4: string,
-mpls5: string,
-mpls6: string,
-mpls7: string,
-mpls8: string,
-mpls9: string,
-mpls10: string,
-cl: float64,
-sl: float64,
-al: float64,
-ra: ip,
-eng: string,
-exid: bytes,
-tr: time
+type: string,
+export_sysid: int64,
+first: time,
+last: time,
+received: time,
+in_packets: int64,
+in_bytes: int64,
+proto: int64,
+tcp_flags: string,
+src_port: uint16,
+dst_port: uint16,
+fwd_status: int64,
+src_tos: int64,
+src4_addr: ip,
+dst4_addr: ip,
+src4_geo: string,
+dst4_geo: string,
+sampled: int64
}
yield shape(netflow)
2 changes: 1 addition & 1 deletion examples/zeek-wrapper.sh
@@ -1,2 +1,2 @@
#!/bin/bash
-exec /opt/zeek/bin/zeek -C -r - --exec "event zeek_init() { Log::disable_stream(PacketFilter::LOG); Log::disable_stream(LoadedScripts::LOG); }" local
+exec /opt/zeek/bin/zeek -C -r - --exec "event zeek_init() { Log::disable_stream(PacketFilter::LOG); Log::disable_stream(LoadedScripts::LOG); Log::disable_stream(Telemetry::LOG); }" local