Skip to content

Flaky MetricAggregatorTests #18331

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
May 22, 2025
Merged

Conversation

ajleong623
Copy link
Contributor

@ajleong623 ajleong623 commented May 19, 2025

Description

This change will make sure that the MetricAggregatorTests does not fail on the range query. What happened was that in the test, the range query was applied on a terms, but the difference between the expected aggregation and the actual aggregation was not zero. On a field called "keyword_field", the boolean query had 2 should queries (range from 18 to 0) and (exact match of 37).

In the expected aggregation, only the documents which matched on 37 were added together, but in the actual aggregation, more documents which somehow satisfied the range from 18 to 0 were added together. I think what happened was that the "keyword_field" was a string value, and in the range query, the values are interpreted as mapped ordinals in the star tree filters (https://github.com/opensearch-project/OpenSearch/blob/main/server/src/main/java/org/opensearch/search/startree/filter/RangeMatchDimFilter.java), but it seems like in the expected query which does not use star tree filters, the strings are compared as its byte ref value (https://lucene.apache.org/core/6_6_0/core/org/apache/lucene/search/TermRangeQuery.html). At least for now, I think that we should avoid using keyword strings for a range query.

Related Issues

Resolves #18110

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Anthony Leong <aj.leong623@gmail.com>
@ajleong623 ajleong623 marked this pull request as ready for review May 19, 2025 05:55
@github-actions github-actions bot added the >test-failure Test failure from CI, local build, etc. label May 19, 2025
@ajleong623 ajleong623 requested a review from mch2 as a code owner May 19, 2025 05:55
@github-actions github-actions bot added autocut disabled-test Issues that are used by an AwaitsFix annotation to temporarily disable a broken test labels May 19, 2025
@ajleong623 ajleong623 requested a review from msfroh as a code owner May 19, 2025 05:55
@github-actions github-actions bot added the flaky-test Random test failure that succeeds on second run label May 19, 2025
@ajleong623 ajleong623 requested a review from owaiskazi19 as a code owner May 19, 2025 05:55
Copy link
Contributor

❌ Gradle check result for 33d077b: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Anthony Leong <aj.leong623@gmail.com>
Copy link
Contributor

❌ Gradle check result for 501fbcb: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@ajleong623 ajleong623 closed this May 19, 2025
@ajleong623 ajleong623 reopened this May 20, 2025
@ajleong623 ajleong623 marked this pull request as draft May 20, 2025 00:18
Copy link
Contributor

❕ Gradle check result for 501fbcb: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

Copy link

codecov bot commented May 20, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 72.47%. Comparing base (19aa0f8) to head (eb7f003).
Report is 17 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #18331      +/-   ##
============================================
- Coverage     72.53%   72.47%   -0.06%     
+ Complexity    67389    67352      -37     
============================================
  Files          5488     5488              
  Lines        311069   311069              
  Branches      45217    45217              
============================================
- Hits         225622   225443     -179     
- Misses        67083    67266     +183     
+ Partials      18364    18360       -4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@ajleong623 ajleong623 marked this pull request as ready for review May 20, 2025 01:38
Copy link
Contributor

@sandeshkr419 sandeshkr419 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @ajleong623

Appreciate your time to debug this case.
I have minor comments.

Also, along with this change, can you also ensure that instead of assigning raw queryBuilders using RangeQueryBuilder/TermQueryBuilder constructors at places, we use getTermQueryBuilder(), getRangeQueryBuilder() methods directly. I know this is not part of your changes, but a little code cleanup can certainly help.

Also, please run the test some n(100-500) times (can be configured with intellij) to assert we are solving the flakiness correctly.

Thanks!

…n range queries

Signed-off-by: Anthony Leong <aj.leong623@gmail.com>
@ajleong623
Copy link
Contributor Author

@sandeshkr419 Thank you so much for the suggestions and review. I believe I made the requested changes and also ran the test 100 times. Let me know if there are any other suggestions.

Copy link
Contributor

@sandeshkr419 sandeshkr419 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just added a few minor nit-picks. The rest looks good.

Signed-off-by: Anthony Leong <aj.leong623@gmail.com>
Signed-off-by: Anthony Leong <aj.leong623@gmail.com>
Signed-off-by: Anthony Leong <aj.leong623@gmail.com>
Copy link
Contributor

❕ Gradle check result for eb7f003: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

@ajleong623
Copy link
Contributor Author

The new changes are ready.

@andrross andrross merged commit 243ba6a into opensearch-project:main May 22, 2025
30 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
autocut disabled-test Issues that are used by an AwaitsFix annotation to temporarily disable a broken test flaky-test Random test failure that succeeds on second run Search:Aggregations skip-changelog >test-failure Test failure from CI, local build, etc.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[AUTOCUT] Gradle Check Flaky Test Report for MetricAggregatorTests
3 participants