Update Elastic module documentation #791

Merged
merged 6 commits into from
Nov 11, 2024
124 changes: 94 additions & 30 deletions doc/modules/elastic.md
@@ -5,44 +5,108 @@ title: Elasticsearch module

# Elasticsearch module

Elasticsearch module pushes a variety of message-related metadata to an instance of [Elasticsearch](https://elastic.co/). This module provides exporter, template creation logic and a simple Kibana dashboard.
The Elasticsearch module pushes a variety of message-related metadata to an instance of [Elasticsearch](https://elastic.co/) or [OpenSearch](https://opensearch.org/).

<img src="https://i.imgur.com/etYWT8R.png" class="img-fluid" />

This plugin is based on the [plugin](https://github.com/Menta2k/rspamd-elastic) created by [Veselin Iordanov](https://github.com/Menta2k) and adopted for the Elasticsearch 6.x
Additionally, the module manages the index template, the index policy and the ingest pipeline used for geoip functionality.

## Requirements
- [Elasticsearch 6.x](https://www.elastic.co/) - Indexing database
- [ingest-geoip](https://www.elastic.co/guide/en/elasticsearch/plugins/master/ingest-geoip.html) - Elasticsearch plugin used for geoip resolve
- [Kibana](https://www.elastic.co/products/kibana) (optional) - Used for data visualization
- Supported version of [Elasticsearch](https://www.elastic.co/) or [OpenSearch](https://opensearch.org/) - Indexing database
- [Kibana](https://www.elastic.co/products/kibana) or [OpenSearch Dashboards](https://opensearch.org/) (optional) - Used for data visualization

## Configuration

Configuration is fairly simple:
Starting with Rspamd 3.11.0, the module is disabled by default and must be explicitly enabled by setting `enabled` to `true` in `local.d/elastic.conf` or `override.d/elastic.conf`.

*Important:* by default, the module configures `index_policy` to delete logs older than 30 days.
If you are updating from version 3.10.x or older and want to use a different index policy, configure it before enabling this module.

By default, the module automatically detects the distribution and checks whether the server version is supported. This behaviour can be disabled by setting `autodetect_enabled` to `false`; the module then takes the distribution name and version from the configuration.
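
As an illustration, a minimal sketch of disabling autodetection and pinning the distribution by hand could look like the fragment below. The `opensearch` / `2.17` values are only examples, and the fragment assumes that keys left out keep their default values.

~~~hcl
# local.d/elastic.conf (sketch, example values)
version = {
  autodetect_enabled = false;
  # override is only used when autodetect is disabled
  override = {
    name = "opensearch";
    version = "2.17";
  }
};
~~~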

Automatic management of the index template, the index policy and the geoip pipeline can be turned off by setting `managed` to `false` in the corresponding config section.
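
For example, the following sketch turns off automatic management of the index template and the geoip pipeline while leaving both features enabled. It assumes that the template and pipeline already exist on the server under the configured names and that omitted keys keep their defaults.

~~~hcl
# local.d/elastic.conf (sketch)
index_template = {
  managed = false; # the module no longer creates or updates the template
};
geoip = {
  managed = false; # the module no longer creates or updates the pipeline
};
~~~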

If you want to use your own existing index policy but keep a managed index template, set the index policy `managed` option to `false` and change the policy `name` to your custom one.
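
A possible fragment for this case is sketched below. The policy name `my-custom-policy` is hypothetical and stands for a policy that already exists on the server; omitted keys are assumed to keep their defaults.

~~~hcl
# local.d/elastic.conf (sketch)
index_template = {
  managed = true; # the template stays managed by the module
};
index_policy = {
  enabled = true;
  managed = false;           # do not create or update the policy
  name = "my-custom-policy"; # hypothetical name of your existing policy
};
~~~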

If you don't want to use an index policy at all, disable it by setting `enabled` to `false` in the corresponding config section. The same applies to geoip.
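
A sketch of switching off both features, again assuming that omitted keys keep their defaults:

~~~hcl
# local.d/elastic.conf (sketch)
index_policy = {
  enabled = false; # no lifecycle policy is used
};
geoip = {
  enabled = false; # no geoip pipeline is used
};
~~~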

~~~hcl
# local.d/elastic.conf
# Push update when 10 records are collected (10 if unset)
limit = 10;
# IP:port of Elasticsearch server
enabled = true;
server = "localhost:9200";
# Timeout to wait for response (5 seconds if unset)
timeout = 5;
# Elasticsearch template file (json format)
#template_file = "${PLUGINSDIR}/elastic/rspamd_template.json";
# Kibana prebuild visualizations and dashboard template (json format)
#kibana_file = "${PLUGINSDIR}/elastic/kibana.json";
# Elasticsearch index name pattern
index_pattern = "rspamd-%Y.%m.%d";
# Import Kibana template
import_kibana = false;
# Use https if needed
use_https = false;
# Ignore certificate warnings (rspamd will lookup the IP-address of a given hostname and connect with the IP-address)
user = "elastic";
password = "elastic";
use_https = true;
periodic_interval = 5.0; # how often to run background periodic tasks
timeout = 5.0; # how long to wait for a reply from Elasticsearch
no_ssl_verify = false;
# credential to connect to ElasticSearch (optional)
user = "rspamd"
password = "supersecret"
# ingest-geoip is a module (true if ElasticSearch >= 6.7.0)
ingest_module = false;
version = {
autodetect_enabled = true;
autodetect_max_fail = 30;
# override works only if autodetect is disabled
override = {
name = "opensearch";
version = "2.17";
}
};
limits = {
max_rows = 500; # max logs in one bulk request; reaching it is the first reason to flush the buffer
max_interval = 60; # seconds; if the first log in the buffer is older than this interval, flush the buffer
max_fail = 10;
};
index_template = {
managed = true;
name = "rspamd";
priority = 0;
pattern = "%Y.%m.%d";
shards_count = 3;
replicas_count = 1;
refresh_interval = 5; # seconds
dynamic_keyword_ignore_above = 256;
headers_count_ignore_above = 5; # record only the first N headers with the same name and add "ignored above..." if the limit is reached; set to 0 to disable the limit
headers_text_ignore_above = 2048; # truncate a header value above this length and add "..." to the end; set to 0 to disable the limit
symbols_nested = false;
empty_value = "unknown"; # empty numbers, IPs and IP networks are not customizable; they are always 0, :: and ::/128 respectively
};
index_policy = {
enabled = true;
managed = true;
name = "rspamd"; # to use a custom lifecycle policy, change the name and set managed = false
hot = {
index_priority = 100;
};
warm = {
enabled = true;
after = "2d";
index_priority = 50;
migrate = true; # only supported by the Elasticsearch distribution; has no effect elsewhere
read_only = true;
change_replicas = false;
replicas_count = 1;
shrink = false;
shards_count = 1;
max_gb_per_shard = 0; # 0 disables it (default); if enabled, shards_count is ignored
force_merge = false;
segments_count = 1;
};
cold = {
enabled = true;
after = "14d";
index_priority = 0;
migrate = true; # only supported by the Elasticsearch distribution; has no effect elsewhere
read_only = true;
change_replicas = false;
replicas_count = 1;
};
delete = {
enabled = true;
after = "30d";
};
};
# extra headers to collect, e.g.:
# "Precedence";
# "List-Id";
extra_collect_headers = [];
geoip = {
enabled = true;
managed = true;
pipeline_name = "rspamd-geoip";
};
~~~