kubernetes_logs source permits arbitrarily large lines due to interaction of auto_partial_merge and max_line_bytes #22581

Open
ganelo opened this issue Mar 3, 2025 · 1 comment · May be fixed by #22582
Labels
type: feature A value-adding code addition that introduces new functionality.

Comments

ganelo commented Mar 3, 2025

A note for the community

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Use Cases

When Vector collects logs through a kubernetes_logs source downstream of a container runtime that splits long log lines (e.g. crio), setting auto_partial_merge to true results in max_line_bytes being effectively ignored for the merged lines.

Attempted Solutions

Consider the following scenarios:

  1. max_line_bytes = 1 MiB
    lines split by crio into 2.5 MiB chunks
    auto_partial_merge = N/A (Vector stops reading each chunk before it reaches the continuation character)
    result: lines greater than 1 MiB are always dropped, including every line that was split by crio
  2. max_line_bytes = 3 MiB
    lines split by crio into 2.5 MiB chunks
    auto_partial_merge = true
    result: no lines are ever dropped, because max_line_bytes is applied before merging and every partial line is at most 2.5 MiB, which is below the 3 MiB limit
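
The two scenarios can be reproduced with a small model. The following is only an illustrative Python sketch, not Vector's actual implementation: `split_line` and `emitted_sizes` are hypothetical helpers, sizes stand in for real bytes, and the model assumes that a line is dropped entirely as soon as any one of its chunks exceeds max_line_bytes.

```python
from typing import List

MIB = 1024 * 1024


def split_line(line_size: int, chunk_size: int) -> List[int]:
    """Split one logical line into runtime-sized chunks (sizes in bytes)."""
    chunks = []
    remaining = line_size
    while remaining > chunk_size:
        chunks.append(chunk_size)
        remaining -= chunk_size
    chunks.append(remaining)
    return chunks


def emitted_sizes(line_size: int, chunk_size: int,
                  max_line_bytes: int, auto_partial_merge: bool) -> List[int]:
    """Sizes of the events that reach the pipeline in this simplified model."""
    chunks = split_line(line_size, chunk_size)
    # The limit is checked per chunk, i.e. before any merging happens.
    # Assumption: one oversized chunk causes the whole line to be dropped.
    if any(size > max_line_bytes for size in chunks):
        return []
    if auto_partial_merge:
        # All chunks passed the per-chunk check, so the merged line is unbounded.
        return [sum(chunks)]
    return chunks


CHUNK = 5 * MIB // 2  # 2.5 MiB chunks, as in the scenarios above

# Scenario 1: every 2.5 MiB chunk exceeds the 1 MiB limit, so nothing is emitted.
scenario_1 = emitted_sizes(10 * MIB, CHUNK, 1 * MIB, auto_partial_merge=True)

# Scenario 2: every chunk is below 3 MiB, so the full 10 MiB line is emitted.
scenario_2 = emitted_sizes(10 * MIB, CHUNK, 3 * MIB, auto_partial_merge=True)
```

In this model the size of the merged line in scenario 2 depends only on the original line, not on max_line_bytes, which is the behavior described above.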

Proposal

It would be useful to have an additional configuration option that limits line size after merging, to protect the downstream pipeline and consumers from huge lines. This would make it possible to benefit from the auto_partial_merge feature without allowing arbitrarily large lines into the pipeline.
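
As a rough sketch of what a post-merge limit could look like, the Python model below tracks the running size of a merged line and drops the line once it exceeds the limit. The option name `max_merged_line_bytes` and the function `merge_with_limit` are hypothetical names chosen for illustration; they are not taken from Vector's configuration or code.

```python
from typing import Iterable, List, Optional, Tuple

MIB = 1024 * 1024

# (size_in_bytes, is_partial); is_partial=True means the line continues.
Chunk = Tuple[int, bool]


def merge_with_limit(chunks: Iterable[Chunk],
                     max_merged_line_bytes: Optional[int]) -> List[int]:
    """Merge partial chunks and return the sizes of the lines that are kept.

    A merged line is dropped if its total size exceeds max_merged_line_bytes
    (a hypothetical option). None disables the post-merge limit.
    """
    emitted: List[int] = []
    total = 0
    over_limit = False
    for size, is_partial in chunks:
        total += size
        if max_merged_line_bytes is not None and total > max_merged_line_bytes:
            # Remember that this line is too large; keep consuming its chunks.
            over_limit = True
        if not is_partial:
            # Final chunk of the line: emit it only if it stayed within the limit.
            if not over_limit:
                emitted.append(total)
            total = 0
            over_limit = False
    return emitted


CHUNK = 5 * MIB // 2  # 2.5 MiB
stream = [(CHUNK, True), (CHUNK, True), (CHUNK, True), (CHUNK, False),
          (1000, False)]

kept_with_limit = merge_with_limit(stream, max_merged_line_bytes=5 * MIB)
kept_without_limit = merge_with_limit(stream, max_merged_line_bytes=None)
```

With the limit set to 5 MiB, only the short 1000-byte line survives; without it, the 10 MiB merged line passes through as well. A real implementation would presumably also stop buffering a line once it crosses the limit, which this size-only model does not show.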

Another option would be to change the behavior of max_line_bytes when auto_partial_merge is set to true, but for backward-compatibility reasons it is probably better to avoid changing the behavior of an existing config field.

References

No response

Version

0.45.0

@ganelo ganelo added the type: feature A value-adding code addition that introduces new functionality. label Mar 3, 2025
ganelo commented Mar 3, 2025

Have a proposed fix in this PR: #22582
