
storage: file trimming improvement #7852

Merged: 6 commits into tiger-1.8.15 from leonardo-tiger-file-trimming-improvement on Sep 12, 2023

Conversation

leonardo-albertovich
Collaborator

This PR adds a new option named storage.trim_files which can be used to control the file trimming behavior in chunkio.

This is necessary to address an issue where overzealous trimming of chunk files caused excessive file fragmentation in certain XFS deployments.

The default for storage.trim_files is off, which deviates from the previous default behavior and is something we might want to be mindful of.
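
A minimal configuration sketch showing where the new option goes (the surrounding values and the storage path are illustrative, not a recommendation):

```
[SERVICE]
    flush                  1
    # illustrative storage path
    storage.path           /var/log/flb-storage/
    # opt back in to the previous trimming behavior; the new default is off
    storage.trim_files     on
```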

" %s/%s", ch->st->name, ch->name);
/* File trimming has been made opt-in because it causes
* performance degradation and excessive fragmentation
* in XFS.


Specifically here, it's because the cio code was telling the file system very conflicting things. It was saying that a file was going to be X MB in size, and then saying it was going to be Y bytes (the log message length) in size, which leads the file system to make allocation decisions based on those bits of information. Decisions like "oh, this file isn't going to grow, I can put this other thing here", which is an assumption that's invalidated as soon as the log file grows. Given an appropriate loop over an appropriate amount of time, the fragmentation starts to hurt.

I'm not sure which file system would behave well with the previous behavior of "allocate to X MB, truncate to Y bytes, keep appending with an explicit truncate until you get to X MB", but I doubt any of them behaves well on purpose; i.e., I don't know why you would keep this behavior around even as an option.

I'm not super familiar with fluent-bit and cio, but if this code now ends up going "oh, the config says this file will be 2 MB, so just allocate 2 MB and don't truncate it down unless we are 100% sure it will not grow", then I think that will give the file system allocator the best opportunity to make good decisions.

If it is doing that, and you reserve enough space at the end of a log file to be able to write an "oh no, out of disk space" log message, then you probably have a nice "graceful" way to deal with ENOSPC conditions. (Copy-on-write and thin provisioning are a whole other thing, but from what I understand here, that will be pretty unlikely for any individual file.)
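
A minimal sketch of the two allocation patterns being contrasted here (illustrative only, not the chunkio code; posix_fallocate/ftruncate stand in for whatever calls cio actually makes):

```c
#include <fcntl.h>
#include <unistd.h>

#define EXPECTED_SIZE (2 * 1024 * 1024)          /* e.g. a 2 MB chunk limit */

/* (a) The pattern that invites fragmentation: the file system is first told
 * the file will be EXPECTED_SIZE, then immediately told it is only a few
 * bytes long, over and over as the chunk grows. */
static void append_then_trim(int fd, const void *buf, size_t len, off_t used)
{
    posix_fallocate(fd, 0, EXPECTED_SIZE);       /* "this file will be 2 MB" */
    pwrite(fd, buf, len, used);
    ftruncate(fd, used + (off_t) len);           /* "no, it is only Y bytes" */
}

/* (b) The pattern that keeps the size hint consistent: reserve the expected
 * size once and only trim when the chunk is finalized and will not grow. */
static void append_no_trim(int fd, const void *buf, size_t len, off_t used)
{
    posix_fallocate(fd, 0, EXPECTED_SIZE);       /* reserve once, keep it */
    pwrite(fd, buf, len, used);
}
```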

@leonardo-albertovich
Collaborator Author


Fluent-bit doesn't pre-allocate the expected file size yet (that's one opt-in improvement we're considering for the current version). Instead, when cio detects that the file is not large enough to fit the contents it wants to append, it grows the file in increments of 8 * page_size (which in most cases means 32 KiB) until it reaches the required size.
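
As an illustration of that growth strategy, here is a sketch (not the actual cio code; names and error handling are simplified):

```c
#include <sys/types.h>
#include <unistd.h>

/* Round the required size up to the next multiple of 8 * page_size
 * (8 * 4096 = 32 KiB on most systems) and extend the file to that size.
 * Sketch only: the real chunkio implementation differs in structure. */
static int grow_chunk_file(int fd, off_t current_size, off_t required_size)
{
    off_t step = 8 * (off_t) sysconf(_SC_PAGESIZE);
    off_t new_size = ((required_size + step - 1) / step) * step;

    if (new_size <= current_size) {
        return 0;                      /* already large enough */
    }

    return ftruncate(fd, new_size);    /* extend the file */
}
```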

With the current chunk file size limit implementation in fluent-bit, pre-allocating files wouldn't guarantee that we never have to grow them, because the limit is enforced after the contents are appended, but that could be easily fixed.

Thanks for taking a look at this. If you have any questions or remarks please let me know; I'd be glad to go over it with you to make sure there are no corner cases that weren't addressed.

@RicardoAAD
Collaborator

@leonardo-albertovich, the issue is no longer reproducible with this Test Branch.

@RicardoAAD
Collaborator

Hello @leonardo-albertovich

When testing this fix with the storage.checksum option enabled, we see several chunks with a "format check failed" error.

[2023/09/01 21:29:15] [error] [storage] format check failed: tail.0/10413-1693603751.284017178.flb
[2023/09/01 21:29:15] [error] [storage] format check failed: tail.0/10413-1693603751.364724849.flb
[2023/09/01 21:29:15] [error] [storage] format check failed: tail.0/10413-1693603751.481694472.flb

Fluent Bit ran for 5 minutes and produced 862 of these errors.

$ grep "format check failed"  test-log.log | wc -l
862
[SERVICE]
    grace                      0
    flush                      1
    log_level                  info
    log_file                   ./test-log.log
    http_server                off
    storage.path               /mnt/test-disk/
    storage.trim_files         true
    storage.checksum           on
    storage.max_chunks_up      1

[INPUT]
    refresh_interval           1
    name                       tail
    read_from_head             on
    path                       logs/*.log
    storage.type               filesystem
    buffer_chunk_size          2M
    buffer_max_size            2M
    tag                        <fn>
    tag_regex                  (?<fn>.*)

[FILTER]
    Name                       modify
    Match                      *
    Add                        Service1 SOMEVALUE
    Add                        Service3 SOMEVALUE3

[OUTPUT]
    name                       http
    match                      *
    format                     json_lines
    host                       127.0.0.1
    port                       8443
    retry_limit                False
    tls                        on
    tls.verify                 off
    workers                    1
    storage.total_limit_size   9100M

Could you please advise if this is expected with this option enabled?

Thanks.

@lecaros
Contributor

lecaros commented Sep 4, 2023

ping @edsiper @leonardo-albertovich

@leonardo-albertovich
Collaborator Author

@RicardoAAD this should not happen and I have not observed this previously. Please share a copy of those corrupted files with me so I can take a look at them. Meanwhile I'll try to reproduce the issue.

@leonardo-albertovich
Collaborator Author

@RicardoAAD I have been running fluent-bit for 10 minutes and didn't see that error once, so I think I'll need some input from your side.

@RicardoAAD
Collaborator

Thanks @leonardo-albertovich. Please let us know if you need any additional information from the repro that we showed you today.

Regards,

This bug had already been fixed upstream: the round operation causes
the mapping size (and thus the alloc size) to be larger than the file
size, which means there is a memory area that is not backed by the file,
which on some file systems such as XFS causes data loss.

Signed-off-by: Leonardo Alminana <[email protected]>
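
For context, a minimal sketch of the failure mode described in that commit message (illustrative only, not the chunkio code): if the mapping length is rounded up past the file size, writes into the tail of the mapping land in pages that are not backed by the file, so that data never reaches disk. Keeping the file at least as large as the mapping avoids the mismatch.

```c
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Sketch: map `map_size` bytes of `fd`, extending the file first so every
 * mapped page is backed by the file.  If map_size were simply rounded up
 * beyond the file size and the file left alone, data written into the part
 * of the mapping that lies past the end of the file could be lost. */
static void *map_chunk(int fd, size_t file_size, size_t map_size)
{
    if (map_size > file_size) {
        if (ftruncate(fd, (off_t) map_size) == -1) {   /* grow the file */
            perror("ftruncate");
            return NULL;
        }
    }

    void *p = mmap(NULL, map_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        return NULL;
    }

    return p;
}
```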
@leonardo-albertovich
Collaborator Author

@RicardoAAD could you please re-test?

@RicardoAAD
Collaborator

Hi @leonardo-albertovich, I tested the new changes, and there are no issues with the checksum now.

Thanks.

@edsiper merged commit 4c7f71a into tiger-1.8.15 on Sep 12, 2023
3 checks passed
@edsiper deleted the leonardo-tiger-file-trimming-improvement branch on September 12, 2023, 21:38