alert on failed to write data into buffer by buffer overflow action=:block #104
Comments
I'm not sure whether you mean overflowing the chunk limit size by sending a very large message, or exceeding the queue limits because of slow flushing.
We use the following in the buffer section: overflow_action drop_oldest_chunk (https://docs.fluentd.org/configuration/buffer-section).
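For context, a minimal sketch of what such a buffer section can look like; the elasticsearch output, host, paths, and size values are illustrative, not taken from the comment above:

```
<match app.**>
  @type elasticsearch
  host elasticsearch.logging.svc
  port 9200
  <buffer>
    @type file
    path /var/log/fluentd-buffers/app
    chunk_limit_size 8M
    total_limit_size 512M
    # When the buffer is full, discard the oldest queued chunk
    # instead of blocking upstream or raising an exception.
    overflow_action drop_oldest_chunk
  </buffer>
</match>
```

The trade-off is that drop_oldest_chunk keeps the pipeline moving during a burst at the cost of silently losing the oldest buffered logs.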
v1.0 still has the same problem.
I am facing the same issue. Any workaround?
From the Fluentd documentation, using …
ref: https://docs.fluentd.org/configuration/buffer-section#flushing-parameters
Using …
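For illustration, a sketch of the flushing parameters that page describes; the values here are arbitrary examples, not a recommendation:

```
<buffer>
  flush_mode interval
  flush_interval 5s
  # More flush threads drain the buffer queue faster during bursts.
  flush_thread_count 4
  retry_type exponential_backoff
  retry_max_interval 30s
</buffer>
```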
We run pretty much everything on Kubernetes/Prometheus/Fluentd/Elasticsearch, and we are currently using …, which seems to include this plugin.
We sometimes have bursts of logs in our environments (basically, something spamming the logs), which causes the Fluentd output plugin that sends logs to Elasticsearch to block, given that our overflow_action is block (we do not want drop_oldest_chunk or throw_exception, as we cannot accept log loss). However, once the buffers fill up, there are no more logs sent over, and since the situation has to be resolved by fixing the log spammer, we cannot hope to solve it by increasing the following values:
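A rough sketch of the kind of buffer limits in question (the parameter names are standard Fluentd buffer options; the values are purely illustrative):

```
<buffer>
  # Size of a single chunk.
  chunk_limit_size 16M
  # Overall on-disk buffer limit.
  total_limit_size 2G
  # Maximum number of chunks waiting in the queue.
  queued_chunks_limit_size 512
  # Apply back-pressure instead of dropping logs when full.
  overflow_action block
</buffer>
```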
I do not see any way to monitor for this particular scenario using this plugin. The block is not complete, some logs are still moving through, so I cannot rely on the counter of outgoing logs (fluentd_output_status_num_records_total). I also cannot rely on the buffer size, since that gauge fluctuates a lot, and the number of errors does not reflect the situation either (which, strangely enough, is only logged as a warning).
However, this situation has caused us to miss logs for days a couple of times and to manually remove big logs, and it is quite a headache.
Am I missing something about how to alert on this scenario using this plugin?
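For reference, a minimal sketch of how this plugin's monitoring sources are typically enabled so that buffer metrics become available to alert on; the bind address, port, and interval are illustrative:

```
# Expose an HTTP /metrics endpoint for Prometheus to scrape.
<source>
  @type prometheus
  bind 0.0.0.0
  port 24231
  metrics_path /metrics
</source>

# Periodically export per-output buffer and retry gauges such as
# fluentd_output_status_buffer_total_bytes and
# fluentd_output_status_retry_count.
<source>
  @type prometheus_output_monitor
  interval 10
</source>
```

One possible workaround, assuming these gauges are exposed by the installed plugin version, is to alert when fluentd_output_status_buffer_total_bytes stays close to the configured total_limit_size (or when the retry count keeps climbing) for a sustained period, rather than relying on the record counter alone.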