wiki/Alerts-and-Us.md
Lines changed: 28 additions & 26 deletions
@@ -150,11 +150,11 @@ Uptime will calculate the total downtime for the alert
 | January 2025 | 0s |
 | February 2025 | 0s |
 | March 2025 | 5m 8s |
-|**April 2025**|**10m 30s**|
-| May 2025 ||
-| June 2025 ||
-| July 2025 ||
-| August 2025 ||
+| April 2025 | 10m 30s |
+| May 2025 | 0s |
+| June 2025 | 0s |
+| July 2025 | 0s |
+|**August 2025**|**0s**|
 | September 2025 ||
 | October 2025 ||
 | November 2025 ||
@@ -199,11 +199,11 @@ Uptime will calculate the total downtime for the alert
 | January 2025 | 9m 28s |
 | February 2025 | 0s |
 | March 2025 | 5m 14s |
-|**April 2025**|**28m 50s**|
-| May 2025 ||
-| June 2025 ||
-| July 2025 ||
-| August 2025 ||
+| April 2025 | 28m 50s|
+| May 2025 |18m 28s |
+| June 2025 | 0s |
+| July 2025 | 0s |
+|**August 2025**|**0s**|
 | September 2025 ||
 | October 2025 ||
 | November 2025 ||
@@ -244,6 +244,8 @@ Note: January outage was due to a testing password renewal and did not effect cl
 
 ## 2025
 
+As of May 2025, the analytics are no longer provided by JSM.
+
 
 ##### P1 Stats
 | Month | Number of Alerts | Acknowledge Time | Resolve Time | Notes |
@@ -252,10 +254,10 @@ Note: January outage was due to a testing password renewal and did not effect cl
 | February | 0 | NA | NA | NA |
 | March | 4 || 5m 50s | All related to short outage caused by production upgrade. Also many related to BCeID checks |
 | April | 5 | 8m 24s | 19m 22s | Caused by network outage on April 13 which resulted in successful switch to GoldDR and an IDIR login failure on April 22 |
-| May |||||
-| June |||||
-| July |||||
-| August |||||
+| May |4|||IDIR outage on the 5th is largely responsible for the P1 alert. Two alerts on May 4th were from route changes and resolved in under 4 min each.|
+| June |0|NA|NA|NA|
+| July |0|NA|NA|NA|
+| August |0|NA|NA|NA|
 | September |||||
 | October |||||
 | November |||||
@@ -268,10 +270,10 @@ Note: January outage was due to a testing password renewal and did not effect cl
 | February | 2 | 31s | 31s | CPU spike on Feb 18, no user impact |
 | March | 0 | NA | NA | NA |
 | April | 0 | NA | NA | NA |
-| May |||||
-| June |||||
-| July |||||
-| August |||||
+| May |0|NA|NA|NA|
+| June |1|1m|1m|Not a real pod outage; new Sysdig monitoring takes time to add new kc pods to the count|
+| July |5|NA|NA | All false low pod warnings except one elevated CPU warning. No outage associated.|
+| August |0|NA|NA|NA|
 | September |||||
 | October |||||
 | November |||||
@@ -284,10 +286,10 @@ Note: January outage was due to a testing password renewal and did not effect cl
 | February | 7 | NA | NA | Some data lost on these alerts due to migration. All patroni pod warnings and Dev warnings due to upgrade. Unavoidable impact to developers done after working hours. |
 | March | 5 | 14m 5s | 14m 5s | dev and test idir monitoring were down for short intervals |
 | April | 11 | NA | NA | Some data lost on these alerts due to migration. Mostly due to dev and test IDIR checks failing. |
-| May |||||
-| June |||||
-| July |||||
-| August |||||
+| May |2|NA|NA|Two IDIR test env outages on May 5th|
+| June |3|NA|NA|All ready pod low warnings during rollouts. No service impact|
+| July |1|NA|NA|DB pod low warning|
+| August |0|NA|NA|NA|
 | September |||||
 | October |||||
 | November |||||
@@ -300,10 +302,10 @@ Note: January outage was due to a testing password renewal and did not effect cl
 | February | 17 | NA | NA | Some data lost on these alerts due to migration. No enduser impact all internal warnings. |
 | March | 7 | 4m 42s | 1h 49m 25s | All warnings about elevated CPU and filesystem. No enduser impact |
 | April | 0 | NA | NA | NA |
-| May |||||
-| June |||||
-| July |||||
-| August |||||
+| May |0|NA|NA|NA|
+| June |0|NA|NA|NA|
+| July |1|NA|NA|Sustained elevated CPU warning, no outage, due to pod rollover|