Description
Edit: Found the issue: I had failed to set log_key after the key changed from the default (message) to a non-default value.
I'm going to leave this open because I think there are some improvements which could have clarified this issue. Unfortunately I have neither the time nor the Ruby experience to open a PR right now.
- output.rb (and perhaps the plugin as a whole) seems to lack trace logs. These would have helped confirm which parts of the write() function were executing, and therefore where the issue occurs.
- The log_key value did not exist and I had configured log_format=text, which means the expected result is that no logs are ever sent. I think this is an appropriate situation in which to issue a log, maybe even at level warn, informing the user that, for the current chunk, their configuration is useless.
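To illustrate the second suggestion, here is a minimal Ruby sketch of the kind of check I have in mind. It is not the plugin's actual code: extract_text_lines is a hypothetical helper, the records and the 'msg' key are made up, and it uses the standard library Logger rather than fluentd's own logger. It assumes that in text format only the value under log_key is sent, so a chunk in which no record has that key produces nothing.

```ruby
require 'logger'
require 'stringio'

# Hypothetical helper (not the plugin's real code): collect the value stored
# under `log_key` from each record, as a text-format output would need to.
def extract_text_lines(records, log_key, logger)
  lines = records.map { |record| record[log_key] }.compact.map(&:to_s)
  skipped = records.size - lines.size

  # Fine-grained log showing what happened to this chunk.
  logger.debug("log_key '#{log_key}': kept #{lines.size}, skipped #{skipped}")

  # Warn when the whole chunk would be dropped because the key never matched.
  if lines.empty? && !records.empty?
    logger.warn("log_format is text but no record in this chunk has key " \
                "'#{log_key}'; nothing will be sent")
  end

  lines
end

log_output = StringIO.new
logger = Logger.new(log_output)

# Made-up records that use 'msg' instead of the default 'message' key.
records = [{ 'msg' => 'hello' }, { 'msg' => 'world' }]

missing = extract_text_lines(records, 'message', logger) # key absent: warns
present = extract_text_lines(records, 'msg', logger)     # key present: two lines
```

With the default key, the helper returns an empty list and writes a warning; with the key the records actually use, it returns both lines. A warning like that in the fluentd log would have pointed me at log_key straight away.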
I appreciate my suggestions may not be valid for this code. Feel free to close this, as my issue was caused by misconfiguration and has been resolved.
I'm using the kube-logging operator: https://kube-logging.dev/docs/configuration/plugins/outputs/sumologic/
This issue arose as we updated from version 3.17 to 4.2.0 of the operator.
There are various Sumologic-related changes in that release, which can be found by searching the page of this very large diff:
kube-logging/logging-operator@release-3.17...4.2.0
However, I've come here because the behaviour I'm getting from fluentd after a single change to the Sumologic output configuration seems unexpected. I hope to get advice on how to move forward, as it looks like a silent failure.
Feel free to close this issue or advise accordingly. I have not confirmed a bug, although it does seem like there might be one.
The Sumologic output works as expected with log_format: json; however, if I set log_format: text, fluentd stops sending logs.
<source>
@type forward
@id main_forward
bind 0.0.0.0
port 24240
</source>
<match **>
@type label_router
@id main
metrics true
<route>
@label @c1157f02c8c13fd3ea66f8419567c357
metrics_labels {"id":"flow:mynamespace:my-sumo-flow"}
<match>
labels my-sumo-label:enabled
namespaces mynamespace
negate false
</match>
</route>
... ...
</match>
<label @c1157f02c8c13fd3ea66f8419567c357>
<match kubernetes.**>
@type tag_normaliser
@id flow:mynamespace:my-sumo-flow:0
format ${namespace_name}.${labels.environment}.${pod_name}.${container_name}
</match>
<match **>
@type sumologic
@id flow:mynamespace:my-sumo-flow:output:mynamespace:my-sumo-output
endpoint COLLECTOR_URL_REMOVED
log_format json
source_name my-sumo-source
<buffer tag,time>
@type file
chunk_limit_size 32m
flush_interval 60s
flush_mode interval
path /buffers/flow:mynamespace:my-sumo-flow:output:mynamespace:my-sumo-output.*.buffer
retry_forever true
timekey 10m
timekey_wait 1m
total_limit_size 1024m
</buffer>
</match>
</label>
... ...
With log_format: json, I see fluentd logging sends and I see logs in Sumologic.
$ tail -f -n 100000 /fluentd/log/out | grep "mynamespace:my-sumo"
2023-06-05 14:53:43 +0000 [debug]: #0 [flow:mynamespace:my-sumo-flow:output:mynamespace:my-sumo-output] Sending 2; logs records with source category '', source host '', source name 'my-sumo-source', chunk #5fd630dfe92e8520c65b0b2aedb3524c, try 0, batch 0
2023-06-05 14:58:21 +0000 [debug]: #0 [flow:mynamespace:my-sumo-flow:output:mynamespace:my-sumo-output] Created new chunk chunk_id="5fd6322428e95cb3b7fbd07ad24428b0" metadata=#<struct Fluent::Plugin::Buffer::Metadata timekey=1685976600, tag="mynamespace.unknown.test-logger.container-name", variables=nil, seq=0>
2023-06-05 14:59:22 +0000 [debug]: #0 [flow:mynamespace:my-sumo-flow:output:mynamespace:my-sumo-output] Sending 13; logs records with source category '', source host '', source name 'my-sumo-source', chunk #5fd6322428e95cb3b7fbd07ad24428b0, try 0, batch 0
2023-06-05 14:59:26 +0000 [debug]: #0 [flow:mynamespace:my-sumo-flow:output:mynamespace:my-sumo-output] Created new chunk chunk_id="5fd632622621c3c2585f88717d2c406e" metadata=#<struct Fluent::Plugin::Buffer::Metadata timekey=1685976600, tag="mynamespace.unknown.test-logger.container-name", variables=nil, seq=0>
With log_format: text, I no longer see fluentd logging sends and I no longer see logs in Sumologic.
$ tail -f -n 100000 /fluentd/log/out | grep "mynamespace:my-sumo"
2023-06-05 14:41:11 +0000 [debug]: #0 [flow:mynamespace:my-sumo-flow:output:mynamespace:my-sumo-output] Created new chunk chunk_id="5fd62e4de0a29e3c6c8d40ce47a99521" metadata=#<struct Fluent::Plugin::Buffer::Metadata timekey=1685976000, tag="mynamespace.unknown.test-logger.container-name", variables=nil, seq=0>
2023-06-05 14:42:16 +0000 [debug]: #0 [flow:mynamespace:my-sumo-flow:output:mynamespace:my-sumo-output] Created new chunk chunk_id="5fd62e8bdd2eb0db408da82be8213a70" metadata=#<struct Fluent::Plugin::Buffer::Metadata timekey=1685976000, tag="mynamespace.unknown.test-logger.container-name", variables=nil, seq=0>
2023-06-05 14:43:21 +0000 [debug]: #0 [flow:mynamespace:my-sumo-flow:output:mynamespace:my-sumo-output] Created new chunk chunk_id="5fd62ec9da55d6d9601586c71c835446" metadata=#<struct Fluent::Plugin::Buffer::Metadata timekey=1685976000, tag="mynamespace.unknown.test-logger.container-name", variables=nil, seq=0>
2023-06-05 14:50:00 +0000 [debug]: #0 [flow:mynamespace:my-sumo-flow:output:mynamespace:my-sumo-output] Created new chunk chunk_id="5fd63046447c860381363c4d438ee78d" metadata=#<struct Fluent::Plugin::Buffer::Metadata timekey=1685976600, tag="mynamespace.unknown.container-namer-6fd6b8b55c-gmdsf.container-namer", variables=nil, seq=0>
2023-06-05 14:50:51 +0000 [debug]: #0 [flow:mynamespace:my-sumo-flow:output:mynamespace:my-sumo-output] Created new chunk chunk_id="5fd6307703ab3dd92bbfb4c8b3a7dfc4" metadata=#<struct Fluent::Plugin::Buffer::Metadata timekey=1685976600, tag="mynamespace.unknown.test-logger.container-name", variables=nil, seq=0>
The only change in the configuration is the value of log_format.
I do see buffers under /buffers.
The logging operator deploys three fluentd containers; I am monitoring and inspecting all of them at once when troubleshooting.
I am using the fluentd debug container and everything seems to be functioning as expected except this one puzzling issue. Any advice is very much appreciated.