Bug: InfluxDB reporter writes count=0 for Counter metrics since 3.25.0 (while /metrics shows non-zero) #4248

@KoBilla

Description

Versions affected: 3.25.0 … 3.33.0 (last good: 3.24.0). See releases. https://github.com/prebid/prebid-server-java/releases

Environment

  • PBS-Java container
  • InfluxDB v1 HTTP API (/write?db=... on :8086)
  • Metrics enabled via metrics.influxdb (HTTP)
  • For cross-check: /metrics (collected-metrics) enabled (admin endpoints)

Configuration:

metrics:
  prefix: prebid
  influxdb:
    enabled: true
    host: influxdb1
    port: 8086
    protocol: http
    database: prebid_stats
    auth: prebid:password
    interval: 60
admin:
  port: 8060
  admin-endpoints:
    collected-metrics:
      enabled: true
      path: /metrics
      on-application-port: true
      protected: false

How to reproduce

  1. Start PBS-Java 3.33.0 with the config above.

  2. Generate traffic (e.g., a few OpenRTB2 web auctions without cookies; regular prod traffic also reproduces).

  3. Wait ≥ one reporting interval.

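  4. Compare GET /metrics with the Influx line protocol sent by PBS.
    /metrics reports non-zero values for the Counter metrics listed below, but at the same timestamps the line protocol sent to InfluxDB contains count=0.0 for those same metrics (while timers/histograms have sane values).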

    • prebid_pbs_no_cookie_requests
    • prebid_pbs_requests_ok_openrtb2_web
    • prebid_pbs_adapter__no_cookie_requests
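
The comparison in step 4 could be automated roughly as follows. This is a minimal sketch, not part of PBS: it assumes the /metrics values and the Influx `count` values have already been loaded into plain dictionaries, and that the caller supplies the mapping between the two naming schemes. The function name `find_zeroed_counters` and the sample numbers are hypothetical.

```python
def find_zeroed_counters(admin_values, influx_counts, name_map):
    """Return counters that are non-zero in /metrics but zero in Influx.

    admin_values:  {metric name as shown by /metrics: value}
    influx_counts: {Influx measurement name: value of its `count` field}
    name_map:      {/metrics name: Influx measurement name}, supplied by the
                   caller because the two naming schemes differ
    """
    suspects = []
    for admin_name, influx_name in name_map.items():
        admin_value = admin_values.get(admin_name)
        influx_value = influx_counts.get(influx_name)
        # Flag only the pattern described in this report:
        # non-zero on the admin endpoint, exactly zero on the wire.
        if admin_value and influx_value == 0:
            suspects.append((admin_name, admin_value, influx_value))
    return suspects


# Hypothetical sample values, for illustration only.
admin = {"prebid_pbs_no_cookie_requests": 42}
influx = {"pbs_no_cookie_requests": 0.0}
mapping = {"prebid_pbs_no_cookie_requests": "pbs_no_cookie_requests"}
print(find_zeroed_counters(admin, influx, mapping))
```

Metrics missing from either side are skipped rather than reported, so the output lists only counters that match the symptom.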

Evidence

  • Captured on the wire (tcpdump) at the exact time /metrics reported non-zero:
pbs_no_cookie_requests,metricName=no_cookie_requests count=0.0 1760624818
pbs_adapter.rubicon.no_cookie_requests,metricName=adapter.rubicon.no_cookie_requests count=0.0 1760624818
pbs_adapter.appnexus.no_cookie_requests,metricName=adapter.appnexus.no_cookie_requests count=0.0 1760624818
  • Timers/histograms (*_time with p50/p95/p99/count) are non-zero in Influx at the same time; the issue appears to affect Counter metrics only.
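
To scan a longer capture for the same pattern, the line-protocol payload can be parsed with a few lines of Python. This is a rough sketch that handles only simple points like the ones shown above (no escaped spaces or commas, no quoted string fields). It treats a point whose only field is `count` as a Counter, which is a heuristic drawn from the observation above, not something defined by the reporter. The helper names `parse_point` and `zero_count_counters` are hypothetical.

```python
def parse_point(line):
    """Parse one simple point: measurement[,tag=value...] field=value[,...] [timestamp]."""
    parts = line.strip().split(" ")
    if len(parts) < 2:
        raise ValueError(f"not a line-protocol point: {line!r}")
    key, field_str = parts[0], parts[1]
    timestamp = int(parts[2]) if len(parts) > 2 else None

    measurement, *tag_items = key.split(",")
    tags = dict(item.split("=", 1) for item in tag_items)

    fields = {}
    for item in field_str.split(","):
        name, raw = item.split("=", 1)
        try:
            # Integer fields carry a trailing "i" (e.g. 5i); floats do not.
            fields[name] = float(raw.rstrip("i"))
        except ValueError:
            fields[name] = raw  # keep non-numeric values as text
    return {"measurement": measurement, "tags": tags,
            "fields": fields, "timestamp": timestamp}


def zero_count_counters(lines):
    """Return measurements whose only field is `count` and whose value is 0."""
    zeroed = []
    for line in lines:
        if not line.strip():
            continue
        point = parse_point(line)
        if set(point["fields"]) == {"count"} and point["fields"]["count"] == 0:
            zeroed.append(point["measurement"])
    return zeroed


# The three points captured in this report.
CAPTURED = [
    "pbs_no_cookie_requests,metricName=no_cookie_requests count=0.0 1760624818",
    "pbs_adapter.rubicon.no_cookie_requests,metricName=adapter.rubicon.no_cookie_requests count=0.0 1760624818",
    "pbs_adapter.appnexus.no_cookie_requests,metricName=adapter.appnexus.no_cookie_requests count=0.0 1760624818",
]
print(zero_count_counters(CAPTURED))
```

Points with additional fields (for example the p50/p95/p99 values of timers) are ignored by `zero_count_counters`, so only Counter-style points are reported.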

Capture command

sudo tcpdump -i any -s0 -nnA 'host <influx-host> and tcp port 8086' \
  | grep -E 'no_cookie_requests|requests\.(ok|badinput|networkerr)'

or

tcpdump -i ens3 -s0 -w /tmp/influx.pcap 'tcp port 8086'
tshark -r /tmp/influx.pcap -d tcp.port==8086,http -Y 'http.request.method == "POST"'   -T fields -e http.file_data | sed "s|\\\\n|\n|g" | grep no_cook

Probable cause
The bump of com.izettle:dropwizard-metrics-influxdb from 1.2.2 to 1.3.4 (extra/pom.xml):
https://github.com/prebid/prebid-server-java/pull/3906/files

Influx graphs
In the first part of the Influx graph, all 28 PBS nodes run v3.24.0; then all of them run v3.33.0 (all zeros); then only one node was rolled back to v3.24.0.
Image

Image
