-
Notifications
You must be signed in to change notification settings - Fork 2k
Fix QueueResizableOpenSearchThreadPoolExecutorTests #18006
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Fix QueueResizableOpenSearchThreadPoolExecutorTests #18006
Conversation
❌ Gradle check result for 7e471e1: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for cc0563c: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
@andrross I think the name of the method may be confusing; just looking into the test, what it does is that
Is it helpful? |
Got it. I think we can keep the test as it is. |
cc0563c
to
2a562b9
Compare
❌ Gradle check result for 2a562b9: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for 2a562b9: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
2a562b9
to
7279456
Compare
❌ Gradle check result for 7279456: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for 7279456: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for 7279456: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
7279456
to
a6990d9
Compare
❌ Gradle check result for a6990d9: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for a6990d9: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #18006 +/- ##
============================================
+ Coverage 72.57% 72.59% +0.02%
- Complexity 67160 67170 +10
============================================
Files 5478 5478
Lines 310130 310132 +2
Branches 45087 45087
============================================
+ Hits 225068 225147 +79
+ Misses 66661 66573 -88
- Partials 18401 18412 +11 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a race condition in testResizeQueueDown() where depending on random parameters we could submit up to 1002 tasks into an executor with a queue size of 900. That introduced a race condition where if the tasks didn't execute fast enough then a rejected execution exception could happen and fail the test.
I am unable to see on what fixed this flakiness in the test testResizeQueueDown?
ThreadContext context = new ThreadContext(Settings.EMPTY); | ||
this.queue = new ResizableBlockingQueue<>( | ||
ConcurrentCollections.newBlockingQueue(), | ||
Objects.requireNonNull(queueSize, "All tests must set a queue size") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Objects.requireNonNull() is designed for reference types only. For primitive data types, this check would always pass. Did you intend to do something else? may be value check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! This was leftover from a previous version of the code where it actually made sense.
} | ||
|
||
/** Use a runnable wrapper that simulates a task with unknown failures. */ | ||
public void testExceptionThrowingTask() throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for testExceptionThrowingTask, the runnable wrapper was earlier exceptionalWrapper()
which is changed to fastWrapper
. What's the reason for this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another good catch, this got mixed up in the refactoring. I'll fix it.
Instead of resizing down to 900, it resizes down to 1500 which guarantees that the executor has enough capacity to not reject anything if all tasks are submitted before any are able to be executed. |
a6990d9
to
210949d
Compare
❌ Gradle check result for 210949d: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
There was a race condition in testResizeQueueDown() where depending on random parameters we could submit up to 1002 tasks into an executor with a queue size of 900. That introduced a race condition where if the tasks didn't execute fast enough then a rejected execution exception could happen and fail the test. The fix is to resize down to a queue size of 1500 to ensure there is enough capacity even if all tasks are submitted before any can be executed. And finally I refactored the tests to reduce duplication of code and ensure the executor gets shutdown properly even in case of a test failure. This will avoid the spurious thread leak failure if a test case exits because of a failure. Signed-off-by: Andrew Ross <andrross@amazon.com>
210949d
to
7ba1155
Compare
❌ Gradle check result for 7ba1155: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for 7ba1155: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
There was a race condition in testResizeQueueDown() where depending on
random parameters we could submit up to 1002 tasks into an executor with
a queue size of 900. That introduced a race condition where if the tasks
didn't execute fast enough then a rejected execution exception could
happen and fail the test. The fix is to resize down to a queue size of
1500 to ensure there is enough capacity even if all tasks are submitted
before any can be executed.
And finally I refactored the tests to reduce duplication of code and
ensure the executor gets shutdown properly even in case of a test
failure. This will avoid the spurious thread leak failure if a test case
exits because of a failure.
Related Issues
Resolves #14297
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.