[7.17](backport #37022) Clarify the role of flush.min_events and bulk_max_size in docs (#37044)

* Clarify the role of flush.min_events and bulk_max_size in docs (#37022)

* Clarify the role of flush.min_events and bulk_max_size

* Clarify that min_events is only max batch when > 1.

* Fix another typo.

* Improve flush.min_events parameter documentation.

(cherry picked from commit 0f8fc26)

# Conflicts:
#	libbeat/outputs/elasticsearch/docs/elasticsearch.asciidoc

* Resolve conflict

---------

Co-authored-by: Craig MacKenzie <[email protected]>
mergify[bot] and cmacknz authored Nov 6, 2023
1 parent ccd93f2 commit 07621a9
Showing 4 changed files with 36 additions and 25 deletions.
41 changes: 24 additions & 17 deletions libbeat/docs/queueconfig.asciidoc
@@ -28,20 +28,24 @@ queue.mem:

The memory queue keeps all events in memory.

If no flush interval and no number of events to flush is configured,
all events published to this queue will be directly consumed by the outputs.
To enforce spooling in the queue, set the `flush.min_events` and `flush.timeout` options.

By default `flush.min_events` is set to 2048 and `flush.timeout` is set to 1s.

The output's `bulk_max_size` setting limits the number of events being processed at once.

The memory queue waits for the output to acknowledge or drop events. If
the queue is full, no new events can be inserted into the memory queue. Only
after the signal from the output will the queue free up space for more events to be accepted.

This sample configuration forwards events to the output if 512 events are
available or the oldest available event has been waiting for 5s in the queue:
The memory queue is controlled by the parameters `flush.min_events` and `flush.timeout`. If
`flush.timeout` is `0s`, or `flush.min_events` is `0` or `1`, then events can be sent by the output
as soon as they are available. If the output supports a `bulk_max_size` parameter, it controls the
maximum batch size that can be sent.
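
For example, a minimal sketch (illustrative values, not part of this commit) that disables spooling
so events flow straight to the output:

[source,yaml]
------------------------------------------------------------------------------
queue.mem:
  events: 4096          # queue capacity
  flush.min_events: 0   # 0 or 1: events are available to the output immediately
  flush.timeout: 0s     # no wait before a flush
------------------------------------------------------------------------------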

If `flush.min_events` is greater than `1` and `flush.timeout` is greater than `0s`, events will only
be sent to the output when the queue contains at least `flush.min_events` events or the
`flush.timeout` period has expired. In this mode, the maximum batch size that can be sent by the
output is `flush.min_events`. If the output supports a `bulk_max_size` parameter, values of
`bulk_max_size` greater than `flush.min_events` have no effect. The value of `flush.min_events`
should be evenly divisible by `bulk_max_size` to avoid sending partial batches to the output.

This sample configuration forwards events to the output if 512 events are available or the oldest
available event has been waiting for 5s in the queue:

[source,yaml]
------------------------------------------------------------------------------
queue.mem:
  events: 4096
  flush.min_events: 512
  flush.timeout: 5s
------------------------------------------------------------------------------
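
To follow the divisibility guidance above, you might pair the queue with an output whose
`bulk_max_size` divides `flush.min_events` evenly. A sketch with illustrative values (the host is a
placeholder):

[source,yaml]
------------------------------------------------------------------------------
queue.mem:
  flush.min_events: 2048
  flush.timeout: 1s

output.elasticsearch:
  hosts: ["localhost:9200"]  # placeholder host
  bulk_max_size: 512         # 2048 / 512 = 4 full batches per flush
------------------------------------------------------------------------------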

[float]
==== Configuration options
=== Configuration options

You can specify the following options in the `queue.mem` section of the +{beatname_lc}.yml+ config file:

[float]
===== `events`

Number of events the queue can store.
Number of events the queue can store. This value should be evenly divisible by `flush.min_events` to
avoid sending partial batches to the output.

The default value is 4096 events.

[float]
===== `flush.min_events`

Minimum number of events required for publishing. If this value is set to 0, the
output can start publishing events without additional waiting times. Otherwise
the output has to wait for more events to become available.
Minimum number of events required for publishing. If this value is set to 0 or 1, events are
available to the output immediately. If this value is greater than 1, the output must wait for the
queue to accumulate this minimum number of events or for `flush.timeout` to expire before
publishing. When greater than `1`, this value also defines the maximum possible batch size that can
be sent by the output.

The default value is 2048.

[float]
===== `flush.timeout`

Maximum wait time for `flush.min_events` to be fulfilled. If set to 0s, events
will be immediately available for consumption.
Maximum wait time for `flush.min_events` to be fulfilled. If set to 0s, events are available to the
output immediately.

The default value is 1s.

6 changes: 4 additions & 2 deletions libbeat/outputs/elasticsearch/docs/elasticsearch.asciidoc
@@ -637,8 +637,10 @@ endif::[]

The maximum number of events to bulk in a single Elasticsearch bulk API index request. The default is 50.

Events can be collected into batches. {beatname_uc} will split batches larger than `bulk_max_size`
into multiple batches.
Events can be collected into batches. When using the memory queue with `queue.mem.flush.min_events`
set to a value greater than `1`, the maximum batch size is the value of `queue.mem.flush.min_events`.
{beatname_uc} will split batches read from the queue that are larger than `bulk_max_size` into
multiple batches.
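
For example, under an illustrative configuration like the sketch below (values and host are
placeholders, not from this commit), each 1600-event flush from the queue is split into 16 bulk
requests:

[source,yaml]
------------------------------------------------------------------------------
queue.mem:
  flush.min_events: 1600
  flush.timeout: 5s

output.elasticsearch:
  hosts: ["localhost:9200"]  # placeholder host
  bulk_max_size: 100         # 1600 / 100 = 16 requests per flush
------------------------------------------------------------------------------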

Specifying a larger batch size can improve performance by lowering the overhead of sending events.
However, big batch sizes can also increase processing times, which might result in
7 changes: 4 additions & 3 deletions libbeat/outputs/logstash/docs/logstash.asciidoc
@@ -354,9 +354,10 @@ endif::[]

The maximum number of events to bulk in a single {ls} request. The default is 2048.

If the Beat sends single events, the events are collected into batches. If the Beat publishes
a large batch of events (larger than the value specified by `bulk_max_size`), the batch is
split.
Events can be collected into batches. When using the memory queue with `queue.mem.flush.min_events`
set to a value greater than `1`, the maximum batch size is the value of `queue.mem.flush.min_events`.
{beatname_uc} will split batches read from the queue that are larger than `bulk_max_size` into
multiple batches.
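
As a sketch with illustrative values (the host is a placeholder), aligning `bulk_max_size` with the
queue flush size means each flush maps to a single {ls} request:

[source,yaml]
------------------------------------------------------------------------------
queue.mem:
  flush.min_events: 2048

output.logstash:
  hosts: ["localhost:5044"]  # placeholder host
  bulk_max_size: 2048        # one request per 2048-event flush
------------------------------------------------------------------------------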

Specifying a larger batch size can improve performance by lowering the overhead of sending events.
However, big batch sizes can also increase processing times, which might result in
7 changes: 4 additions & 3 deletions libbeat/outputs/redis/docs/redis.asciidoc
@@ -214,9 +214,10 @@ endif::[]

The maximum number of events to bulk in a single Redis request or pipeline. The default is 2048.

If the Beat sends single events, the events are collected into batches. If the
Beat publishes a large batch of events (larger than the value specified by
`bulk_max_size`), the batch is split.
Events can be collected into batches. When using the memory queue with `queue.mem.flush.min_events`
set to a value greater than `1`, the maximum batch size is the value of `queue.mem.flush.min_events`.
{beatname_uc} will split batches read from the queue that are larger than `bulk_max_size` into
multiple batches.
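
A sketch with illustrative values (host and key are placeholders) in which each 2048-event flush is
split into two pipelined Redis requests:

[source,yaml]
------------------------------------------------------------------------------
queue.mem:
  flush.min_events: 2048

output.redis:
  hosts: ["localhost:6379"]  # placeholder host
  key: "filebeat"            # placeholder key
  bulk_max_size: 1024        # 2048 / 1024 = 2 requests per flush
------------------------------------------------------------------------------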

Specifying a larger batch size can improve performance by lowering the overhead
of sending events. However, big batch sizes can also increase processing times,
