Excessive Freespace Fragmentation on XFS #7034

Closed
tzneal opened this issue Mar 17, 2023 · 6 comments

tzneal commented Mar 17, 2023

Bug Report

Describe the bug

fluent-bit can cause excessive freespace fragmentation on XFS in some circumstances.

To Reproduce

  1. Run fluent-bit with an output that fails (in the config below, elasticsearch-master is a non-existent host).
    [SERVICE]
        Daemon Off
        Flush 1
        Log_Level info
        Parsers_File parsers.conf
        Parsers_File custom_parsers.conf
        HTTP_Server On
        HTTP_Listen 0.0.0.0
        HTTP_Port 2020
        Health_Check On
        storage.type filesystem
        storage.path /var/log/flb/
        storage.max_chunks_up 1
        storage.sync normal

    [INPUT]
        Name tail
        Path /var/log/containers/*.log
        multiline.parser docker, cri
        Tag kube.*
        Mem_Buf_Limit 5MB
        Skip_Long_Lines On
        storage.type filesystem
        storage.total_limit_size 100G

    [INPUT]
        Name systemd
        Tag host.*
        Systemd_Filter _SYSTEMD_UNIT=kubelet.service
        Read_From_Tail On
        storage.type filesystem
        storage.total_limit_size 100G

    [FILTER]
        Name kubernetes
        Match kube.*
        Merge_Log On
        Keep_Log Off
        K8S-Logging.Parser On
        K8S-Logging.Exclude On

    [OUTPUT]
        Name es
        Match kube.*
        Host elasticsearch-master
        Logstash_Format On
        Retry_Limit False

    [OUTPUT]
        Name es
        Match host.*
        Host elasticsearch-master
        Logstash_Format On
        Logstash_Prefix node
        Retry_Limit False
  2. Create logs rapidly so that *.flb files are written to disk (see the sketch below).
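
For step 2, any writer that appends lines quickly to a file matching the tail Path works. A rough sketch, flagged as an assumption: the file name, line size, and rate below are arbitrary, and the CRI-style prefix is only there so the cri multiline parser accepts the lines.

# Hypothetical load generator: append CRI-formatted lines to a file the tail input watches.
# Requires GNU date for the %N (nanosecond) field.
LOG=/var/log/containers/loadgen_default_loadgen-0000000000000000.log
while true; do
  for i in $(seq 1 1000); do
    printf '%s stdout F synthetic line %d %s\n' \
      "$(date -u +%Y-%m-%dT%H:%M:%S.%NZ)" "$i" "$(head -c 256 /dev/zero | tr '\0' 'x')" >> "$LOG"
  done
done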

You would expect the disk to eventually fill up, but well before that happens the free space becomes so fragmented that XFS can't dynamically allocate new inodes, and you are unable to create new files on disk.

Expected behavior

The .flb files should be written to disk in an unfragmented way, and overall free-space fragmentation should not increase.

Screenshots

You will eventually reach a point where the filesystem reports both free space and free inodes, but you can no longer write files.

Note: inodes are actually the issue. df -i reports the maximum potential number of inodes, but XFS allocates inodes dynamically, and new ones can no longer be created because no contiguous free extent is large enough for a block of inodes.

[root@ip-192-168-41-179 tmp]# df -ah /
Filesystem      Size  Used Avail Use% Mounted on
/dev/nvme0n1p1   80G   41G   40G  51% /

[root@ip-192-168-41-179 tmp]# touch newfile
touch: cannot touch 'newfile': No space left on device


[root@ip-192-168-41-179 tmp]# xfs_db -c 'freesp -s' -r /dev/nvme0n1p1
   from      to extents  blocks    pct
      1       1   75224   75224   0.71
      2       3  207013  542331   5.12
      4       7 1682926 9974596  94.17
total free extents 1965163
total free blocks 10592151
average free extent size 5.38996

If the disk didn't have excessive free space fragmentation, the distribution of free extents would look more like this, with the majority of the space in large contiguous chunks:

$  sudo xfs_db -r -c 'freesp -s ' /dev/nvme0n1p1
   from      to extents  blocks    pct
      1       1     201     201   0.00
      2       3      24      51   0.00
      4       7       6      29   0.00
      8      15       7      92   0.00
     16      31      15     322   0.01
     32      63      14     621   0.01
     64     127      12    1112   0.02
    128     255       9    1828   0.04
    256     511      12    4285   0.10
    512    1023       8    5080   0.11
   1024    2047       5    7446   0.17
   2048    4095       9   28400   0.64
   4096    8191       5   26004   0.58
   8192   16383       6   80576   1.81
  16384   32767       3   65961   1.48
  32768   65535       2  115542   2.59
  65536  130943      35 4118632  92.43
total free extents 373
total free blocks 4456182
average free extent size 11946.9
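
To confirm that the buffered chunks themselves are being laid down as many small extents, the per-file extent maps can be inspected directly. The paths below assume the storage.path from the config above (chunkio normally keeps the .flb files in a subdirectory per input):

# Print the extent map of every buffered chunk under the configured storage.path.
find /var/log/flb -name '*.flb' -exec xfs_bmap -v {} \;

# Rough per-file extent counts (filefrag is from e2fsprogs but works on XFS via FIEMAP).
find /var/log/flb -name '*.flb' -exec filefrag {} \;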

Your Environment

  • Version used: Multiple, verified with 1.8.15 and 2.0.9
  • Configuration: Above
  • Environment name and version (e.g. Kubernetes? What version?): 1.24
  • Server type and version:
  • Operating System and version: AL2
  • Filters and plugins:

Additional context

@github-actions

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

github-actions bot added the Stale label Jun 16, 2023
@github-actions

This issue was closed because it has been stalled for 5 days with no activity.

github-actions bot closed this as not planned Jun 22, 2023

lecaros commented Aug 25, 2023

Hi @tzneal,
can you try this PR in your test case?

#7852

lecaros reopened this Aug 25, 2023
github-actions bot removed the Stale label Aug 26, 2023
@github-actions

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

github-actions bot added the Stale label Nov 24, 2023
@github-actions

This issue was closed because it has been stalled for 5 days with no activity.

github-actions bot closed this as not planned Nov 29, 2023

lecaros commented Nov 29, 2023

This was improved with #7852
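
For readers on versions that predate that change: independent of what #7852 actually changes, one general way to keep this write pattern from fragmenting free space is to preallocate chunk-file space up front (e.g. with fallocate), so the filesystem can reserve a contiguous extent instead of growing the file through many small, interleaved appends. A quick, hypothetical way to see the effect on extent layout:

# Hypothetical demonstration: a preallocated file typically occupies one contiguous extent.
fallocate -l 2M /tmp/prealloc-demo.bin
xfs_bmap -v /tmp/prealloc-demo.bin   # expect a single extent on a reasonably healthy filesystem
rm /tmp/prealloc-demo.bin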

lecaros closed this as completed Nov 29, 2023