Sysdig Trigger #45
jlangy
announced in
Operations
Replies: 0 comments
Starting a thread for a curious Sysdig alert that came up. In Sysdig we have an alert for the number of ready Patroni pods, using the formula sum(avg(kubernetes.pod.status.ready)). We have 3 pods, and a Low severity alert fires if that value drops below 3. When one of the pods spiked in CPU, it caused the Sysdig trigger to go off (the value dropped to 2.98 for a bit). The CPU spike is below:
Strange thing is, no events got logged in OpenShift, even though Sysdig showed the drop:
I would think that if one pod had lost its ready status, the Kubernetes API would log an event.
Wondering if I am misinterpreting something here, or maybe a pod can lose its ready status temporarily without logging an event?
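One way to cross-check this independently of the event log is to look at the pod's own Ready condition, which carries a lastTransitionTime in its status. The sketch below is only an illustration: the JSON is a hypothetical, trimmed stand-in for what `oc get pod <name> -o json` returns, and the pod name and timestamp are made up. If the reported transition time predates the alert, that would suggest the recorded readiness never flipped during the spike.

```python
import json

def ready_condition(pod):
    """Return the pod's Ready condition dict, or None if it is absent."""
    for cond in pod.get("status", {}).get("conditions", []):
        if cond.get("type") == "Ready":
            return cond
    return None

# Hypothetical, trimmed pod object (name and timestamp are placeholders).
sample_pod = json.loads("""
{
  "metadata": {"name": "patroni-0"},
  "status": {
    "conditions": [
      {"type": "Initialized", "status": "True",
       "lastTransitionTime": "2024-01-01T00:00:00Z"},
      {"type": "Ready", "status": "True",
       "lastTransitionTime": "2024-01-01T00:01:00Z"}
    ]
  }
}
""")

cond = ready_condition(sample_pod)
# Prints the current Ready status and when it last changed.
print(cond["status"], cond["lastTransitionTime"])
```

This only shows the most recent transition, so a pod that flipped to not-ready and back would report the time it became ready again.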
NB: As a side note, I lowered the alert threshold so it only triggers when the value is < 2.5. It might make more sense to use sum(min(kubernetes.pod.status.ready)) instead, though.
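To make the difference between the two formulas concrete, here is a rough sketch of the arithmetic. It assumes the inner function is applied per pod over the evaluation window and the outer sum is applied across pods; the sample counts are hypothetical and chosen only so the numbers line up with the 2.98 seen in the alert.

```python
from statistics import mean

def sum_of(time_agg, samples_by_pod):
    """Apply a per-pod aggregation over the window, then sum across pods."""
    return sum(time_agg(samples) for samples in samples_by_pod.values())

# Hypothetical window of 50 readiness samples per pod (1 = ready, 0 = not ready).
samples_by_pod = {
    "patroni-0": [1] * 50,
    "patroni-1": [1] * 50,
    "patroni-2": [1] * 49 + [0],  # a single non-ready sample in the window
}

avg_based = sum_of(mean, samples_by_pod)  # like sum(avg(...))
min_based = sum_of(min, samples_by_pod)   # like sum(min(...))

print(round(avg_based, 2))  # 2.98
print(min_based)            # 2
```

Under these assumptions, the avg form yields a fractional value that depends on how long the pod was not ready, while the min form drops by a whole pod for any non-ready sample in the window. That makes the min form stricter, so a threshold of < 3 would stay meaningful with it.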