Siembol Enrichment Service

1. Overview
- 1.1 Enrichment rule
  - 1.1.1 Matchers
  - 1.1.2 Table Mapping
2. Admin config

Overview

Siembol Enrichment is an enrichment engine used to add useful data to events to assist in detection and investigations.

The data that is used to enrich events is stored in JSON files in a file store in the following format:

{
    "key": {
        "column1": "value",
        "column2": "value2",
        ...
    }
}

When creating a rule you can specify the table to use, the column to join on, and the column to add to the event.
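
The join described above can be pictured as a dictionary lookup. The following is a minimal Python sketch, not Siembol's implementation: the table contents, the event, and the helper name enrich_event are all hypothetical.

```python
import json

# Hypothetical enrichment table in the JSON format shown above:
# each top-level key maps to a row of column/value pairs.
TABLE_JSON = """
{
    "10.0.0.1": {"hostname": "web-01", "owner": "team-a"},
    "10.0.0.2": {"hostname": "db-01", "owner": "team-b"}
}
"""


def enrich_event(event, table, join_field, table_column, event_field):
    """Return a copy of the event with one column added when the join succeeds."""
    enriched = dict(event)
    row = table.get(event.get(join_field))
    if row is not None and table_column in row:
        enriched[event_field] = row[table_column]
    return enriched


table = json.loads(TABLE_JSON)
event = {"source_type": "netflow", "src_ip": "10.0.0.1"}
result = enrich_event(event, table, "src_ip", "hostname", "src_hostname")
print(result)
```

When the key is not present in the table, the event is returned unchanged.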

Enrichment rule

rule_name - Rule name that uniquely identifies the rule
rule_author - The author of the rule, i.e., the user who last modified the rule
rule_version - The version of the rule
rule_description - A single text input for the rule's description. This should be a short, helpful comment that allows anyone to identify the purpose of the rule
source_type - This field allows you to determine the type of data you want to match on. It is essentially a matcher for the source_type field. This field does not support regex; however, using * as an input matches all source types. The source_type field is set during parsing and is equal to the name of the last parser which was used to parse the log
matchers - Matchers allow you to further filter the events that the enrichment will be applied to
table_mapping - Mappings for enriching events
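
As an illustration of the fields above, here is a hedged Python sketch of a rule and of the source_type check. The field names come from the list above; every value, and the helper name source_type_matches, is invented for the example and is not taken from Siembol's code.

```python
# Hypothetical rule; field names follow the list above, values are invented.
rule = {
    "rule_name": "add_src_hostname",
    "rule_author": "jdoe",
    "rule_version": 1,
    "rule_description": "Adds the hostname of the source IP address",
    "source_type": "netflow",
    "matchers": [],
    "table_mapping": {},
}


def source_type_matches(rule_source_type, event):
    """Literal comparison (no regex); "*" matches every source type."""
    return rule_source_type == "*" or event.get("source_type") == rule_source_type


print(source_type_matches(rule["source_type"], {"source_type": "netflow"}))  # True
print(source_type_matches("*", {"source_type": "syslog"}))                   # True
print(source_type_matches("net.*", {"source_type": "netflow"}))              # False
```

The last call shows that a regex-like value is compared literally and therefore does not match.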

Matchers

Matchers allow you to further filter the events that the enrichment will be applied to. You can add as many matchers as you want.

is_enabled - The matcher is enabled
description - The description of the matcher
matcher_type - Type of matcher, either REGEX_MATCH or IS_IN_SET
is_negated - The matcher is negated
field - The name of the field on which the matcher will be evaluated

There are two types of matchers:

REGEX_MATCH - A regex_match allows you to use a regex statement to match a specified field. In addition to field, it takes one string input:
- data - The regex statement in Java syntax (https://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html), except that underscores are allowed in the names of captured groups. Named capture groups in the regex are added as fields in the event. They are available from the next matcher onwards and are included in the output event
IS_IN_SET - Compares the value of a field to a set of strings defined in data. If the value is in the set, the matcher returns true.
- data - A list of strings to compare the value to, delimited by new lines. Regex is not supported: each line must be a literal match. However, field substitution is supported in this field
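
The two matcher types can be sketched in Python as follows. This is an illustration under stated assumptions rather than Siembol's implementation: it assumes the regex must match the whole field value, that an event qualifies only when every enabled matcher accepts it, and that disabled matchers are skipped. It also omits field substitution in IS_IN_SET data. Python writes named groups as (?P<name>...), whereas Java uses (?<name>...). The helper names and the sample event are hypothetical.

```python
import re


def evaluate_matcher(matcher, event):
    """Evaluate one matcher; named groups of a matching regex are added to the event."""
    value = event.get(matcher["field"])
    matched = False
    if value is not None and matcher["matcher_type"] == "REGEX_MATCH":
        found = re.fullmatch(matcher["data"], value)  # assumption: whole-value match
        if found:
            matched = True
            event.update(found.groupdict())
    elif value is not None and matcher["matcher_type"] == "IS_IN_SET":
        # Each line of data is a literal candidate value.
        matched = value in matcher["data"].split("\n")
    # A negated matcher inverts the result.
    return matched != matcher.get("is_negated", False)


def all_matchers_pass(matchers, event):
    """Assumed semantics: every enabled matcher must accept the event."""
    return all(
        evaluate_matcher(m, event) for m in matchers if m.get("is_enabled", True)
    )


event = {"source_type": "proxy", "url": "http://example.com/login", "method": "GET"}
matchers = [
    {"is_enabled": True, "matcher_type": "REGEX_MATCH", "is_negated": False,
     "field": "url", "data": r"http://(?P<url_host>[^/]+)/(?P<url_path>.*)"},
    {"is_enabled": True, "matcher_type": "IS_IN_SET", "is_negated": False,
     "field": "method", "data": "GET\nPOST"},
]
passed = all_matchers_pass(matchers, event)
print(passed, event["url_host"], event["url_path"])
```

Because the matchers are evaluated in order, the captured fields url_host and url_path are already in the event when the second matcher runs.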

Table Mapping

The table mapping tab is where you configure the enrichment you want to perform.

table_name - The name of the table which contains the data you want to enrich the event with
joining_key - The string used to join the event with the table (the key JSON field). This field supports substitution, e.g. ${field_name} or http://${host_field_name}/${path_field_name}. This is used to filter the key field of the table
tags - Tags are added into the event after successfully joining the table with the joining key. You can add as many tags as you want
- tag_name - The name of the tag
- tag_value - The value of the tag
enriching_fields - Fields from the enriching table that are added after successfully joining the table with the joining key. You can add as many enriching fields as you want
- table_field_name - The column in the enrichment table that you want to add
- event_field_name - The name you want the field to have in the event after enriching

Note: you can only enrich from one table per rule. If you want to enrich the same event from multiple tables, you need to create multiple rules.
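
To make the table mapping concrete, the sketch below shows joining_key substitution followed by the addition of tags and enriching fields. It is a simplified Python illustration, not Siembol's code: the helper names, the table, and the event are hypothetical, and treating a missing event field as a failed join is an assumption.

```python
import re

_PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def substitute(template, event):
    """Replace each ${field} with the event's value; None if any field is missing."""
    names = _PLACEHOLDER.findall(template)
    if any(name not in event for name in names):
        return None
    return _PLACEHOLDER.sub(lambda m: str(event[m.group(1)]), template)


def apply_table_mapping(mapping, tables, event):
    """Add tags and enriching fields to a copy of the event if the join succeeds."""
    key = substitute(mapping["joining_key"], event)
    row = tables.get(mapping["table_name"], {}).get(key)
    if row is None:
        return event  # join failed: nothing is added
    enriched = dict(event)
    for tag in mapping.get("tags", []):
        enriched[tag["tag_name"]] = tag["tag_value"]
    for field in mapping.get("enriching_fields", []):
        if field["table_field_name"] in row:
            enriched[field["event_field_name"]] = row[field["table_field_name"]]
    return enriched


# Hypothetical table, mapping, and event.
tables = {"url_reputation": {"http://example.com/login": {"category": "phishing"}}}
mapping = {
    "table_name": "url_reputation",
    "joining_key": "http://${host}/${path}",
    "tags": [{"tag_name": "is_url_enriched", "tag_value": "true"}],
    "enriching_fields": [
        {"table_field_name": "category", "event_field_name": "url_category"}
    ],
}
event = {"host": "example.com", "path": "login"}
enriched = apply_table_mapping(mapping, tables, event)
print(enriched)
```

Tags and enriching fields are only added when the substituted key is found in the table; otherwise the event passes through unchanged.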

Admin config

topology.name - The name of the storm topology
kafka.spout.num.executors - The number of executors for kafka spout
enriching.engine.bolt.num.executors - The number of executors for enriching rule engine
memory.enriching.bolt.num.executors - The number of executors for memory enrichments from tables
merging.bolt.num.executors - The number of executors for merging enriched fields
kafka.writer.bolt.num.executors - The number of executors for producing output messages
enriching.rules.zookeeper.attributes - The zookeeper attributes for updating enrichment rules
- zk.url - Zookeeper servers url. Multiple servers are separated by commas
- zk.path - Path to a zookeeper node
enriching.tables.zookeeper.attributes - The zookeeper attributes for notifying the update of enrichment tables
- zk.url - Zookeeper servers url. Multiple servers are separated by commas
- zk.path - Path to a zookeeper node
kafka.batch.writer.attributes - Kafka batch writer attributes for producing output messages
- producer.properties - Defines kafka producer properties, see https://kafka.apache.org/0102/documentation.html#producerconfigs
storm.attributes - Storm attributes for the enrichment topology
- bootstrap.servers - Kafka brokers servers url. Multiple servers are separated by commas
- first.pool.offset.strategy - Defines how the kafka spout seeks the offset to be used in the first poll to kafka
- kafka.spout.properties - Defines kafka consumer attributes for kafka spout such as group.id, protocol, see https://kafka.apache.org/0102/documentation.html#consumerconfigs
- poll.timeout.ms - Kafka consumer parameter poll.timeout.ms used in kafka spout
- offset.commit.period.ms - Specifies the period of time (in milliseconds) after which the spout commits to Kafka, see https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.5/storm-moving-data/content/tuning_kafkaspout_performance.html
- max.uncommitted.offsets - Defines the maximum number of polled offsets (records) that can be pending commit before another poll can take place
- storm.config - Defines storm attributes for a topology, see https://storm.apache.org/releases/current/Configuration.html
enriching.input.topics - The list of kafka input topics for reading messages
enriching.output.topic - Output kafka topic name for correctly processed messages
enriching.error.topic - Output kafka topic name for error messages
enriching.tables.hdfs.uri - The URL of the HDFS cluster where enriching tables are stored
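
For orientation, the attributes above might be assembled as in the following Python sketch, which builds a partial configuration and prints it as JSON. The attribute names are taken from the list above; every value (host names, paths, topic names, executor counts) is a hypothetical placeholder, and the nesting of bootstrap.servers, kafka.spout.properties, and storm.config under storm.attributes is an assumption. Not every attribute is shown.

```python
import json

# Partial, hypothetical admin configuration; all values are placeholders.
admin_config = {
    "topology.name": "siembol-enrichment",
    "kafka.spout.num.executors": 1,
    "enriching.engine.bolt.num.executors": 1,
    "memory.enriching.bolt.num.executors": 1,
    "merging.bolt.num.executors": 1,
    "kafka.writer.bolt.num.executors": 1,
    "enriching.rules.zookeeper.attributes": {
        "zk.url": "zk1:2181,zk2:2181",
        "zk.path": "/siembol/enrichment/rules",
    },
    "enriching.tables.zookeeper.attributes": {
        "zk.url": "zk1:2181,zk2:2181",
        "zk.path": "/siembol/enrichment/tables",
    },
    "kafka.batch.writer.attributes": {
        "producer.properties": {"bootstrap.servers": "kafka1:9092,kafka2:9092"},
    },
    "storm.attributes": {
        "bootstrap.servers": "kafka1:9092,kafka2:9092",
        "kafka.spout.properties": {"group.id": "siembol-enrichment"},
        "storm.config": {},
    },
    "enriching.input.topics": ["siembol.parsed"],
    "enriching.output.topic": "siembol.enriched",
    "enriching.error.topic": "siembol.errors",
    "enriching.tables.hdfs.uri": "hdfs://namenode:8020",
}

print(json.dumps(admin_config, indent=4))
```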
