Logs not being flushed after x amount of time #2969

@Xyrion

Description

Describe the bug
I have an environment running Fluent Bit > Fluentd > Elasticsearch.
For a while my logs are flushed as they should be, but after a while the logs stop being flushed and the buffer grows until the port gets blocked because of (overflow_action block).

What I have seen:
fluentd_output_status_num_errors is high for certain matches, though most matches have at least some errors.
Q: How do I view these errors? I don't see them in the logs, even though I have tried debug logging.

fluentd_output_status_buffer_queue_length is very high for two specific matches. These two matches have retry_max_interval 30, yet after a few hours their queues are still growing. They are also the only matches where I do an "include" only.
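(For reference, the fluentd_output_status_* metrics above come from fluent-plugin-prometheus. A minimal sketch of the sources that expose them, assuming that plugin is installed; 24231 is its default port:)

```
# Sketch only -- assumes fluent-plugin-prometheus is installed.
<source>
  @type prometheus                  # serves /metrics on :24231 by default
  bind 0.0.0.0
  port 24231
</source>
<source>
  @type prometheus_output_monitor   # emits fluentd_output_status_* metrics
  interval 10
</source>
```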

Q: How do I stop this from happening? Is there a way to see what's holding up the queue? I have checked the buffer path, and the log that came in first looks like a normal log that has been processed before.
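(A sketch of something I could try on the stuck output, not verified against this setup: raising the output's log level and lowering slow_flush_log_threshold so slow or stuck flushes get logged explicitly:)

```
<match authorization.keycloak.**>
  @type elasticsearch
  @log_level debug               # was error; debug shows retry/flush details
  slow_flush_log_threshold 10.0  # warn when a single flush takes longer than 10s
  # ... rest of the output and buffer config unchanged
</match>
```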

To Reproduce

Expected behavior
The logs should flow through constantly; if there are any errors, they should be printed in the logs.

Your Environment
Running in a container with:
repository: gcr.io/google-containers/fluentd-elasticsearch
tag: v2.4.0

Your Configuration

Here are my match blocks:
<match auth.keycloak.app>
  @type rewrite_tag_filter
  <rule>
    key log
    pattern /org.keycloak.events/
    tag keycloak.auth
  </rule>
</match>

<match keycloak.auth>
  @type rewrite_tag_filter
  <rule>
    key log
    pattern /type=(?<type>[^ ]+)(?<!,)/
    tag authorization.keycloak.$1
  </rule>
</match>
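(As a sanity check, the type pattern from the rule above can be exercised in plain Ruby against a made-up Keycloak event line; the sample line is hypothetical:)

```ruby
# Hypothetical sample line; real Keycloak events may differ.
line    = 'org.keycloak.events type=LOGIN, realmId=master, userId=42'
pattern = /type=(?<type>[^ ]+)(?<!,)/   # same pattern as in the rule

m = pattern.match(line)
# The trailing lookbehind (?<!,) forces backtracking past the comma,
# so the capture is "LOGIN", not "LOGIN," -- the rewritten tag would
# be authorization.keycloak.LOGIN.
puts m[:type]
```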

<match authorization.keycloak.**>
  @id elasticsearch-keycloak-authorization
  @type elasticsearch
  @log_level error
  log_es_400_reason true
  include_tag_key true
  host "#{ENV['OUTPUT_HOST']}"
  port "#{ENV['OUTPUT_PORT']}"
  scheme "#{ENV['OUTPUT_SCHEME']}"
  ssl_version "#{ENV['OUTPUT_SSL_VERSION']}"
  user client
  password "#{ENV['OUTPUT_PASSWORD']}"
  ssl_verify false
  logstash_format true
  logstash_prefix auth-keycloak
  <buffer>
    @type file
    flush_mode immediate
    path /var/log/fluentd-buffers/authorization-keycloak.buffer
    retry_type exponential_backoff
    flush_thread_count 2
    retry_limit 20
    retry_max_interval 30
    chunk_limit_size "#{ENV['OUTPUT_BUFFER_CHUNK_LIMIT']}"
    queue_limit_length "#{ENV['OUTPUT_BUFFER_QUEUE_LIMIT']}"
    overflow_action block
  </buffer>
</match>
Any help is much appreciated!

Your Error Log
There is no error log! I see the problem through Prometheus; the buffer just keeps growing and does not drop until the service has been redeployed or restarted.

Additional context
