Feature request: truncate / over-write files rather than append - configurable option #22308

Open
johnhtodd opened this issue Jan 27, 2025 · 2 comments
Labels
sink: file Anything `file` sink related type: feature A value-adding code addition that introduce new functionality.

Comments

@johnhtodd

A note for the community

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Use Cases

We frequently need to read files that Vector writes, where the aggregations behind them run once per interval (a minute, an hour, or several hours). Currently, we use a shell script to find the file by wildcard, rename or move it, and put it somewhere so that other tools don't have to parse wildcards or deal with multiple names. It seems like Vector could handle this itself if there were some way to truncate (over-write) a file. I dislike having to run shell scripts to do "clean-up" that I think the originating application could do instead.
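
For illustration, a minimal sketch of the kind of clean-up step described above, written in Python rather than shell. The function name `publish_latest`, the temporary-file suffix, and the choice to copy instead of move are assumptions made for this example, not part of any existing script or of Vector.

```python
import glob
import os
import shutil
from typing import Optional


def publish_latest(pattern: str, stable_path: str) -> Optional[str]:
    """Copy the most recently modified file matching `pattern` to `stable_path`.

    Returns the source path that was published, or None if nothing matched.
    """
    candidates = glob.glob(pattern)
    if not candidates:
        return None
    # Pick the newest output by modification time.
    newest = max(candidates, key=os.path.getmtime)
    # Write under a temporary name, then rename, so readers of
    # `stable_path` never observe a partially written file.
    tmp_path = stable_path + ".tmp"
    shutil.copyfile(newest, tmp_path)
    os.replace(tmp_path, stable_path)
    return newest
```

Run from cron, something like this gives downstream tools a single fixed name to read. It is this extra moving part that the request aims to eliminate.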

Attempted Solutions

I've used shell scripts and crontabs and logrotate. Ugh.

Proposal

The idea: add two features to the "file" sink.

  1. truncate_after_closetime: over-write the contents of the file if it has been closed for more than a given number of seconds, assuming the file name at the moment of decision is the same as when the file was created. This would work in conjunction with idle_timeout_secs, which would close the file after a certain number of seconds.

  2. truncate_after_modifiedtime: over-write the contents of the file if its last modification was more than a given number of seconds in the past, assuming the file name at the moment of decision is the same as when the file was created. If the file is open, close it before truncating. This is very similar to truncate_after_closetime, but slightly different. It may be that only one of the two options is needed to start.
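
The decision both options describe could be sketched roughly as follows. This is only an illustration in Python, not Vector code: `FileState` and `should_truncate` are hypothetical names, the option names are the ones proposed above, and the "same file name" check is left out for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileState:
    """Hypothetical bookkeeping a sink might keep per output path."""
    closed_at: Optional[float]  # when the sink closed the file; None if still open
    modified_at: float          # last modification time of the file


def should_truncate(
    state: FileState,
    now: float,
    truncate_after_closetime: Optional[float] = None,
    truncate_after_modifiedtime: Optional[float] = None,
) -> bool:
    """Return True if either configured rule says the next write should truncate."""
    # Rule 1: the file has been closed for more than N seconds.
    if truncate_after_closetime is not None and state.closed_at is not None:
        if now - state.closed_at > truncate_after_closetime:
            return True
    # Rule 2: the file was last modified more than N seconds ago.
    # (An open file would be closed by the caller before truncating.)
    if truncate_after_modifiedtime is not None:
        if now - state.modified_at > truncate_after_modifiedtime:
            return True
    return False
```

With neither option set, the function always returns False, which would preserve today's append-only behaviour.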

References

No response

Version

vector 0.44.0 (x86_64-unknown-linux-gnu)

@johnhtodd johnhtodd added the type: feature A value-adding code addition that introduce new functionality. label Jan 27, 2025
@pront pront added the sink: file Anything `file` sink related label Jan 28, 2025
@pront
Member

pront commented Jan 28, 2025

Hi @johnhtodd, this sounds like a reasonable extension to the file sink. From a UX perspective, it's preferable to have a set of truncation rules (assuming you might want to set both at the same time).

@johnhtodd
Author

It occurs to me I've missed a very obvious use case: truncating the file after a fixed number of seconds, regardless of the status of the file. I'd like to add:

  1. truncate_timer: over-write the contents of the file once a fixed number of seconds (an integer) has elapsed, assuming the file name at the moment of decision is the same as when the file was created. If the file is open when the timer expires, close it before truncating.
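
A rough sketch of how a fixed timer like this might behave, again in Python and purely illustrative. `TimedTruncatingWriter` is a hypothetical class, not an existing Vector component, and as a simplification it only checks the timer when the next line is written rather than firing exactly at expiry.

```python
import time
from typing import Callable


class TimedTruncatingWriter:
    """Hypothetical writer that truncates its file every `truncate_timer` seconds."""

    def __init__(self, path: str, truncate_timer: int,
                 clock: Callable[[], float] = time.monotonic):
        self.path = path
        self.truncate_timer = truncate_timer
        self.clock = clock  # injectable so the behaviour can be tested
        self._file = None
        self._deadline = clock() + truncate_timer

    def write(self, line: str) -> None:
        mode = "a"  # default: append, as the file sink does today
        if self.clock() >= self._deadline:
            # Timer expired: close the file if it is open, then reopen
            # in "w" mode, which truncates the existing contents.
            self.close()
            mode = "w"
            self._deadline = self.clock() + self.truncate_timer
        if self._file is None:
            self._file = open(self.path, mode)
        self._file.write(line + "\n")
        self._file.flush()

    def close(self) -> None:
        if self._file is not None:
            self._file.close()
            self._file = None
```

A reader polling the path would then see at most one timer interval's worth of data under a single, stable file name.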

As always, the configuration naming syntax I've used may not be obvious or consistent and I welcome sanity checking. :-)
