Skip to content

Plugin does not handle idle disconnects in a timely fashion. Results in dropped logs #38

@UserNotFound

Description

@UserNotFound

The Papertrail documentation says idle connection timeouts should be handled by the client, ie this plugin (emphasis mine):

Papertrail automatically closes idle TCP connections after 15 minutes of inactivity. Log senders should already handle this as part of a full implementation, since connections are routinely affected by Internet reachability changes. Senders can elect to immediately reconnect, or leave the connection closed until new log messages are available.

This plugin does retry, but the behavior is still rather impactful for delivery:

  • plugin makes a socket connection to papertrail
  • some time passes without logging, and papertrail closes the connection due to the idle timeout
  • logs are generated, and the plugin attempts to write to the socket
  • after 10+ minutes, (possibly only triggered by continued writes) the socket write times out
  • a new socket is created, and the chunk is "retried successfully", but log messages are missing on Papertrail

When I started debugging this, I thought it was just delayed messages, but I'm also finding that some messages are dropped entirely during this process. Is it possible to handle this more gracefully, like with a tcp keepalive?

This syslog output plugin seems to do so (the Socket changes are here)


Steps to reproduce:

  1. configure fluentd to capture logs and send them via this plugin to papertrail
  2. Log infrequently, eg:
root@89f331735970:/app# date; sleep 3600; date
Tue Feb 14 16:39:04 UTC 2023
Tue Feb 14 17:39:04 UTC 2023
root@89f331735970:/app# while true; do date; sleep 10; done
Tue Feb 14 17:55:03 UTC 2023
Tue Feb 14 17:55:13 UTC 2023
Tue Feb 14 17:55:23 UTC 2023
Tue Feb 14 17:55:33 UTC 2023
Tue Feb 14 17:55:43 UTC 2023
  1. See that the log lines for "17:39:04" "17:55:03" and "17:55:13" are dropped and not received in papertrail
    Screen Shot 2023-02-14 at 10 58 38

  2. Confirm in the fluentd logs that there was a socket write timeout error

2023-02-14 16:14:50 +0000 [info]: #0 initializing tcp socket for logs3.papertrailapp.com:redacted_port
2023-02-14 17:55:24 +0000 [warn]: #0 failed to flush the buffer. retry_times=0 next_retry_time=2023-02-14 17:55:25 +0000 chunk="5f4acaafd238fdba7d32b4b3b001c59f" error_class=Fluent::Papertrail::SocketFailureError error="Error writing to logs3.papertrailapp.com:redacted_port: Operation timed out"
  2023-02-14 17:55:24 +0000 [warn]: #0 /usr/lib/ruby/2.7.0/openssl/buffering.rb:322:in `syswrite'
  2023-02-14 17:55:24 +0000 [warn]: #0 /usr/lib/ruby/2.7.0/openssl/buffering.rb:322:in `do_write'
  2023-02-14 17:55:24 +0000 [warn]: #0 /usr/lib/ruby/2.7.0/openssl/buffering.rb:409:in `puts'
  2023-02-14 17:55:24 +0000 [warn]: #0 /usr/lib/ruby/gems/2.7.0/gems/fluent-plugin-papertrail-0.2.8/lib/fluent/plugin/out_papertrail.rb:133:in `send_to_papertrail'
  2023-02-14 17:55:24 +0000 [warn]: #0 /usr/lib/ruby/gems/2.7.0/gems/fluent-plugin-papertrail-0.2.8/lib/fluent/plugin/out_papertrail.rb:45:in `block in write'
  2023-02-14 17:55:24 +0000 [warn]: #0 /usr/lib/ruby/gems/2.7.0/gems/fluentd-1.14.6/lib/fluent/event.rb:315:in `each'
  2023-02-14 17:55:24 +0000 [warn]: #0 /usr/lib/ruby/gems/2.7.0/gems/fluentd-1.14.6/lib/fluent/event.rb:315:in `block in each'
  2023-02-14 17:55:24 +0000 [warn]: #0 /usr/lib/ruby/gems/2.7.0/gems/fluentd-1.14.6/lib/fluent/plugin/buffer/memory_chunk.rb:81:in `open'
  2023-02-14 17:55:24 +0000 [warn]: #0 /usr/lib/ruby/gems/2.7.0/gems/fluentd-1.14.6/lib/fluent/plugin/buffer/memory_chunk.rb:81:in `open'
  2023-02-14 17:55:24 +0000 [warn]: #0 /usr/lib/ruby/gems/2.7.0/gems/fluentd-1.14.6/lib/fluent/event.rb:314:in `each'
  2023-02-14 17:55:24 +0000 [warn]: #0 /usr/lib/ruby/gems/2.7.0/gems/fluent-plugin-papertrail-0.2.8/lib/fluent/plugin/out_papertrail.rb:39:in `write'
  2023-02-14 17:55:24 +0000 [warn]: #0 /usr/lib/ruby/gems/2.7.0/gems/fluentd-1.14.6/lib/fluent/compat/output.rb:131:in `write'
  2023-02-14 17:55:24 +0000 [warn]: #0 /usr/lib/ruby/gems/2.7.0/gems/fluentd-1.14.6/lib/fluent/plugin/output.rb:1179:in `try_flush'
  2023-02-14 17:55:24 +0000 [warn]: #0 /usr/lib/ruby/gems/2.7.0/gems/fluentd-1.14.6/lib/fluent/plugin/output.rb:1500:in `flush_thread_run'
  2023-02-14 17:55:24 +0000 [warn]: #0 /usr/lib/ruby/gems/2.7.0/gems/fluentd-1.14.6/lib/fluent/plugin/output.rb:499:in `block (2 levels) in start'
  2023-02-14 17:55:24 +0000 [warn]: #0 /usr/lib/ruby/gems/2.7.0/gems/fluentd-1.14.6/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
 2023-02-14 17:55:25 +0000 [info]: #0 initializing tcp socket for logs3.papertrailapp.com:redacted
2023-02-14 17:55:25 +0000 [warn]: #0 retry succeeded. chunk_id="5f4acaafd238fdba7d32b4b3b001c59f"

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions