-
Notifications
You must be signed in to change notification settings - Fork 66
✨ API Call Alerts #2139
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
✨ API Call Alerts #2139
Conversation
✅ Deploy Preview for olmv1 ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
5051cf6
to
ef52591
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #2139 +/- ##
=======================================
Coverage 72.77% 72.77%
=======================================
Files 79 79
Lines 7340 7340
=======================================
Hits 5342 5342
Misses 1652 1652
Partials 346 346
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
441eca5
to
ac2ecd6
Compare
@@ -24,7 +25,7 @@ var ( | |||
) | |||
|
|||
const ( | |||
testSummaryOutputEnvVar = "GITHUB_STEP_SUMMARY" | |||
testSummaryOutputEnvVar = "E2E_SUMMARY_OUTPUT" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice 👍
I think is better keep generic since we can run the same with or not GITHUB
Very cool
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @dtfranz
Thank you a lot just remove the commented code and I think we are good to move forward
ac2ecd6
to
6c168e2
Compare
@@ -39,8 +40,18 @@ func TestMain(m *testing.M) { | |||
utilruntime.Must(err) | |||
|
|||
res := m.Run() | |||
err = utils.PrintSummary(testSummaryOutputEnvVar) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was looking and seem that should be skiped already see:
Then, why we just call it here and not in all e2e tests?
Could you help me understand how it works?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to keep the mechanism of selecting output destination distinct in summary.go
and e2e_suite_test.go
. For PrintSummary()
you supply a file path, and if that's empty then we warn and skip, since that's an invalid use of the func. In e2e_suite_test.go
you supply env, and if that's empty then we skip the summary with a note, because it is totally valid to run e2e without summary. This allowed me to keep the e2e test runner as simple as possible and follow the convention of transparently supplying e2e arguments via env. But I probably overthought it 🤷
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But each test has one suite right:
- https://github.com/operator-framework/operator-controller/blob/main/test/experimental-e2e/experimental_e2e_test.go#L45-L56
- https://github.com/operator-framework/operator-controller/blob/main/test/upgrade-e2e/upgrade_e2e_suite_test.go#L30-L60
- https://github.com/operator-framework/operator-controller/blob/main/test/extension-developer-e2e/extension_developer_test.go#L24-L214
So, why we use it only in this one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The summary is also generated for experimental e2e, you can see it in here. The upgrade and extension developer tests are pretty pointless to run prometheus on because they only have about a few seconds of actual runtime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the answers
@tmshort, I think we will need your help too.
It should allow us to sync downstream and solve the current error.
if err != nil { | ||
// Fail the run if alerts are found | ||
fmt.Printf("%s", err) | ||
os.Exit(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/hold
This actually causes an issue downstream. It seems that PrintSummary
fails there. We need to come up with a solution that works for both upstream and downstream.
Also, did we want to use %v
rather than %s
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #2144 for my workaround.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PrintSummary
cannot return an error anymore, I disabled that to prevent it from failing the test. I can remove this section though if you want, but the idea was to have PrintSummary
return an error for alerts once this is stable.
In addition, downstream does not set E2E_SUMMARY_OUTPUT
, so this code will never be reached. I don't think your workaround is necessary since this should work better as a permanent solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, #2144 merged, and you'll need to rebase onto that (it also had another fix).
If PrintSummary
can no longer return an error, then it should have no return values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing the error returns was part of fixing the downstream. If that's fine now then I will add the error returns for alerts back in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, we still want it to exit according the the result of the test. If PrintSummary
were to return an error, the test will fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, the E2E_SUMMARY_OUTPUT
variable won't be set downstream...
/unhold |
Do not fail e2e run when issues with summary generation are encountered. Fail the run if alerts are encountered. Add prometheus alerts for excessive API calls from operator-controller or catalogd, as well as summary graphs to match. Signed-off-by: Daniel Franz <dfranz@redhat.com>
6c168e2
to
d790ce5
Compare
I think we need a way to add exceptions to the |
Thank you for your feedback @tmshort , the branch is updated now per your comments; PTAL! |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: camilamacedo86, grokspawn, tmshort The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
The go-apidiff is acceptable, it's in test code. |
@tmshort: /override requires failed status contexts, check run or a prowjob name to operate on.
Only the following failed contexts/checkruns were expected:
If you are trying to override a checkrun that has a space in it, you must put a double quote on the context. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
/override go-apidiff |
@tmshort: Overrode contexts on behalf of tmshort: go-apidiff In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
/check-cla |
/test go-apidiff |
@grokspawn: No presubmit jobs available for operator-framework/operator-controller@main In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
/no-override go-apidiff |
ad199f1
into
operator-framework:main
Description
Adds prometheus alerts for excessive API calls from operator-controller or catalogd, as well as summary graphs to match.
Do not fail e2e run when issues with summary generation are encountered or when the output isn't specified.
Reviewer Checklist