Unable to install Fluent Bit version 3.2.2, throwing error - cannot open chunk: etc/machine-id #9801

Open
ankitpanwar174 opened this issue Jan 6, 2025 · 1 comment

Comments

ankitpanwar174 commented Jan 6, 2025

Bug Report

Describe the bug
After upgrading the Fluent Bit image from version 2.2.2 to 3.2.2, Fluent Bit fails with a "Permission denied" error (errno=13) while opening certain storage chunks. The engine then reports that it could not segregate the backlog chunks and pauses every input plugin, so log processing stops.

Below are the error messages observed during runtime:

[2025/01/06 11:29:14] [ info] [input:storage_backlog:storage_backlog.7] register tail.2/1-1733390141.793171810.flb
[/src/fluent-bit/lib/chunkio/src/cio_file_unix.c:410 errno=13] Permission denied
[2025/01/06 11:29:14] [ info] [input:storage_backlog:storage_backlog.7] register tail.2/1-1733390437.65627129.flb
[2025/01/06 11:29:14] [error] [storage] [cio file] cannot open chunk: etc/machine-id
[2025/01/06 11:29:14] [error] [engine] could not segregate backlog chunks
[2025/01/06 11:29:14] [ info] [input] pausing container_logs
[2025/01/06 11:29:14] [ info] [input] pausing audit_logs
[2025/01/06 11:29:14] [ info] [input] pausing core_kubernetes_logs
[2025/01/06 11:29:14] [ info] [input] pausing core_kubernetes_logs
[2025/01/06 11:29:14] [ info] [input] pausing systemd.4
[2025/01/06 11:29:14] [ info] [input] pausing systemd.5
[2025/01/06 11:29:14] [ info] [input] pausing fluentbit_metrics.6
[2025/01/06 11:29:14] [ info] [input] pausing storage_backlog.7
[2025/01/06 11:29:14] [ info] [input] pausing emitter_for_log_to_metrics.0
[2025/01/06 11:29:14] [ info] [input] pausing emitter_for_log_to_metrics.1
[2025/01/06 11:29:14] [ info] [output:cloudwatch_logs:cloudwatch_logs.0] thread worker #0 stopping...
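
For triage, the lines above can be summarized mechanically. This is a minimal Python sketch, assuming the log format matches the excerpt in this report. The regexes and the `summarize` helper are illustrative and not part of Fluent Bit.

```python
import re

# Patterns modeled on the log lines quoted in this report (an assumption,
# not an official description of the Fluent Bit log format).
ERRNO_RE = re.compile(r"^\[(?P<src>[^\]\s]+:\d+) errno=(?P<errno>\d+)\] (?P<msg>.*)$")
ERROR_RE = re.compile(r"\[error\] \[(?P<component>[^\]]+)\] (?P<msg>.*)$")
PAUSE_RE = re.compile(r"\[input\] pausing (?P<name>\S+)$")


def summarize(log_text):
    """Collect errno lines, [error] lines, and paused input names."""
    summary = {"errno": [], "errors": [], "paused": []}
    for line in log_text.splitlines():
        line = line.strip()
        if m := ERRNO_RE.match(line):
            summary["errno"].append((int(m["errno"]), m["msg"]))
        elif m := ERROR_RE.search(line):
            summary["errors"].append((m["component"], m["msg"]))
        elif m := PAUSE_RE.search(line):
            # Keep each paused input once, in order of first appearance.
            if m["name"] not in summary["paused"]:
                summary["paused"].append(m["name"])
    return summary
```

Passing the excerpt above to `summarize` groups the errno=13 line, the two `[error]` lines, and the distinct paused inputs into one dictionary.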

Key indicators of failure:

  1. Fluent Bit reports Permission denied while attempting to access chunk files.
  2. Logs indicate a failure in segregating storage backlog chunks.
  3. As a result, Fluent Bit pauses all input plugins (e.g., container_logs, audit_logs, systemd).
  4. The output plugins, like cloudwatch_logs, fail to send logs due to the stopped inputs.
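
Indicator 1 can be checked directly on the node. The sketch below is a hypothetical helper (`find_suspect_chunks` is not a Fluent Bit tool) and assumes you pass in the filesystem storage root yourself, since `storage.path` is not shown in this report. It flags files that the current user cannot read and write, and files that lack the `.flb` suffix seen in the registered chunk names in the log, such as `etc/machine-id`.

```python
import os
from pathlib import Path


def find_suspect_chunks(storage_root):
    """Return (relative_path, reason) pairs for files under storage_root
    that the current user cannot read and write, or that do not carry
    the .flb suffix seen in the chunk names in the log."""
    suspects = []
    root = Path(storage_root)
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        if not os.access(path, os.R_OK | os.W_OK):
            suspects.append((rel, "not readable/writable by current user"))
        elif path.suffix != ".flb":
            suspects.append((rel, "does not look like a .flb chunk"))
    return suspects
```

Run it as the same user as the Fluent Bit container. When run as root, the access check will generally report everything as accessible, so only the suffix check is informative in that case.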

To Reproduce

  1. Use Fluent Bit version 2.2.2 and ensure log ingestion is functioning correctly.
  2. Upgrade to Fluent Bit version 3.2.2 (or any 3.1.* version).
  3. Observe the logs during startup.
  4. Check for the error messages related to Permission denied and input plugin pausing.

Your Environment

* Version used: 3.2.2
* Configuration:
* Environment name and version (e.g. Kubernetes? What version?): 1.3.0
* Server type and version: 
* Operating System and version: suse- sle-micro-iso/5.5:2.0.4
* Filters and plugins:
 inputs: |
      [INPUT]
          Name                        tail
          Tag                         application.*
          Path                        /var/log/containers/*.log
          DB                          /var/fluent-bit/state/flb_container.db
          Exclude_Path                /var/log/containers/etcd*.log, /var/log/containers/kube-apiserver*.log, /var/log/containers/kube-controller-manager*.log, /var/log/containers/kube-proxy*.log, /var/log/containers/kube-scheduler*.log
          Parser                      docker
          Docker_Mode                 On
          Skip_Long_Lines             On
          Refresh_Interval            10
          Docker_Mode_Flush           5
          Docker_Mode_Parser          container_firstline
          Rotate_Wait                 30
          storage.type                filesystem
          Alias                       container_logs
          Read_from_Head              Off

      [INPUT]
          Name                        tail
          Alias                       audit_logs
          Tag                         kubeaudit.*
          Path                        /var/lib/rancher/rke2/server/logs/audit.log
          Parser                      docker
          DB                          /var/fluent-bit/state/audit_log.db
          Skip_Long_Lines             On
          Refresh_Interval            10
          Read_from_Head              Off
          Rotate_Wait                 30
          storage.type                filesystem

      [INPUT]
          Name                        tail
          Alias                       core_kubernetes_logs
          Tag                         kubernetes.components.core.*
          Path                        /var/log/containers/etcd*.log, /var/log/containers/kube-apiserver*.log, /var/log/containers/kube-controller-manager*.log, /var/log/containers/kube-proxy*.log, /var/log/containers/kube-scheduler*.log
          Parser                      docker
          DB                          /var/fluent-bit/state/core_kubernetes_logs.db
          Skip_Long_Lines             On
          Refresh_Interval            10
          Read_from_Head              Off
          storage.type                filesystem

      [INPUT]
          Name                        tail
          Alias                       core_kubernetes_logs
          Tag                         kubernetes.components.kubelet.*
          Path                        /var/lib/rancher/rke2/agent/logs/kubelet.log
          Parser                      docker
          DB                          /var/fluent-bit/state/kubelet_logs.db
          Skip_Long_Lines             On
          Refresh_Interval            10
          Read_from_Head              Off
          storage.type                filesystem

      [INPUT]
          Name                        systemd
          Tag                         sysd.auth
          Systemd_Filter              SYSLOG_FACILITY=4
          Systemd_Filter              SYSLOG_FACILITY=10
          Systemd_Filter_Type         Or
          DB                          /var/fluent-bit/state/authsysd.db
          Path                        /var/log/journal
          storage.type                filesystem
          Read_from_Tail              On

      [INPUT]
          Name                        systemd
          Tag                         sysd.generic
          DB                          /var/fluent-bit/state/genericsysd.db
          Path                        /var/log/journal
          storage.type                filesystem
          Read_from_Tail              On

      [INPUT]
          Name   fluentbit_metrics
          Tag    internal_metrics

    # -- https://docs.fluentbit.io/manual/pipeline/filters
    filters: |
      [FILTER]
          Name                log_to_metrics
          Match               sysd.auth
          Tag                 login_failure_metrics    
          Metric_mode         counter
          Metric_name         os_login_failures
          Metric_description  This metric counts all OS login failures
          Regex               MESSAGE .*authentication failure.*
          Label_field         MESSAGE
      
      [FILTER]
          Name                log_to_metrics
          Match               sysd.auth
          Tag                 login_success_metrics
          Metric_mode         counter
          Metric_name         os_login_successes
          Metric_description  This metric counts all successful OS logins
          Regex               MESSAGE .*New session.*
          Label_field         MESSAGE

      [FILTER]
          Name                parser
          Match               application.*
          Key_name            log
          Parser              crio

      [FILTER]
          Name                grep
          Match               sysd.generic
          Exclude             SYSLOG_FACILITY (4|10)$
          Regex               PRIORITY [0-4]$

      [FILTER]
          Name                kubernetes
          Match               application.*
          Kube_URL            https://kubernetes.default.svc:443
          Merge_Log           On
          Merge_Log_Key       log_processed
          Keep_Log            false
          K8S-Logging.Parser  On
          K8S-Logging.Exclude false
          Buffer_Size         0
          Kube_Tag_Prefix     application.var.log.containers.
          Labels              Off
          Annotations         Off
          Use_Kubelet         On
          Kubelet_Port        10250

      [FILTER]
          Name                kubernetes
          Match               kubernetes.components.core.*
          Kube_URL            https://kubernetes.default.svc:443
          Merge_Log           On
          Merge_Log_Key       log_processed
          Keep_Log            false
          K8S-Logging.Parser  On
          K8S-Logging.Exclude false
          Buffer_Size         0
          Kube_Tag_Prefix     kubernetes.components.core.var.log.containers.
          Labels              Off
          Annotations         Off
          Use_Kubelet         On
          Kubelet_Port        10250

      [FILTER]
          Name modify
          Match *
          Add cluster_id ${CLUSTER_ID}

      [FILTER]
          Name modify
          Match kubeaudit.*
          Add host_name ${HOST_NAME}

      [FILTER]
          Name modify
          Match kubernetes.components.kubelet.*
          Add host_name ${HOST_NAME}

    # -- https://docs.fluentbit.io/manual/pipeline/outputs
    outputs: |
      [OUTPUT]
          Name                     cloudwatch_logs
          Match                    application.*
          region                   {{ .Values.awsRegion }}
          log_group_name           /aws/containerinsights/${CLUSTER_ID}/application_logs
          log_stream_prefix        ${HOST_NAME}-
          log_retention_days       {{ .Values.logRetentionDays }}
          auto_create_group        true
          Retry_Limit              {{ .Values.retryLimit }}
          storage.total_limit_size {{ .Values.containerLogsFileBufferLimit }}

      [OUTPUT]
          Name                     cloudwatch_logs
          Match                    kubeaudit.*
          region                   {{ .Values.awsRegion }}
          log_group_name           /aws/containerinsights/${CLUSTER_ID}/kubernetes_audit_logs
          log_stream_prefix        ${HOST_NAME}-
          log_retention_days       {{ .Values.auditLogRetentionDays }}
          auto_create_group        true
          Retry_Limit              {{ .Values.retryLimit }}
          storage.total_limit_size {{ .Values.auditLogsFileBufferLimit }}

      [OUTPUT]
          Name                     cloudwatch_logs
          Match                    kubernetes.components.*
          region                   {{ .Values.awsRegion }}
          log_group_name           /aws/containerinsights/${CLUSTER_ID}/core_kubernetes_logs
          log_stream_prefix        ${HOST_NAME}-
          log_retention_days       {{ .Values.logRetentionDays }}
          auto_create_group        true
          Retry_Limit              {{ .Values.retryLimit }}
          storage.total_limit_size {{ .Values.coreKubernetesLogsFileBufferLimit }}

      [OUTPUT]
          Name                     cloudwatch_logs
          Match                    sysd.*
          region                   {{ .Values.awsRegion }}
          log_group_name           /aws/containerinsights/${CLUSTER_ID}/operating_system_logs
          log_stream_prefix        ${HOST_NAME}-
          log_retention_days       {{ .Values.logRetentionDays }}
          auto_create_group        true
          Retry_Limit              {{ .Values.retryLimit }}
          storage.total_limit_size {{ .Values.osLogsFileBufferLimit }}

      [OUTPUT]
          Name   prometheus_exporter
          Match  *_metrics

    # -- https://docs.fluentbit.io/manual/pipeline/parsers
    customParsers: |
      [PARSER]
         Name                docker
         Format              json
         Time_Key            time
         Time_Format         %Y-%m-%dT%H:%M:%S.%LZ

      [PARSER]
         Name                crio
         Format              Regex
         Regex               ^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>P|F) (?<log>.*)$
         Time_Key            time
         Time_Format         %Y-%m-%dT%H:%M:%S.%L%z

      [PARSER]
         Name                container_firstline
         Format              regex
         Regex               (?<log>(?<="log":")\S(?!\.).*?)(?<!\\)".*(?<stream>(?<="stream":").*?)".*(?<time>\d{4}-\d{1,2}-\d{1,2}T\d{2}:\d{2}:\d{2}\.\w*).*(?=})
         Time_Key            time
         Time_Format         %Y-%m-%dT%H:%M:%S.%LZ
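
A side note on the configuration above: two of the tail inputs share the alias `core_kubernetes_logs`, which matches the two identical `pausing core_kubernetes_logs` lines in the log. This is probably unrelated to the permission error. A small Python sketch for spotting repeated aliases in a classic-format config follows; `duplicate_aliases` is a hypothetical helper and not part of Fluent Bit.

```python
from collections import Counter


def duplicate_aliases(config_text):
    """Return Alias values that appear more than once in the config text."""
    aliases = []
    for line in config_text.splitlines():
        # Classic-format entries are "Key    Value" separated by whitespace.
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0].lower() == "alias":
            aliases.append(parts[1].strip())
    counts = Counter(aliases)
    return sorted(name for name, n in counts.items() if n > 1)
```

Feeding it the `inputs` block from this report should list `core_kubernetes_logs` as the only repeated alias.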

@ankitpanwar174 (Author)

Can someone please suggest what we can do to solve this issue? We are blocked.
