Skip to content

Support Nested Aggregations as part of Star-Tree #18048

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

Shailesh-Kumar-Singh
Copy link
Contributor

Description

Support Nested Aggregations as part of Star-Tree

Related Issues

Resolves #17274

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Copy link
Contributor

❌ Gradle check result for 16f18d9: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

),
context
)
parent == null
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets add some comments please on why its done this way

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block of code can be extracted out and get reused ?

@Shailesh-Kumar-Singh Shailesh-Kumar-Singh force-pushed the feature/nested-aggregations branch from 16f18d9 to b3bee6c Compare April 24, 2025 10:41
Copy link
Contributor

❌ Gradle check result for b3bee6c: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

*
* @return List of dimension field names involved in the aggregation.
*/
default List<String> getDimensionFilters() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StarTreePreComputeCollector is only involved for aggregations not including Metric aggregators. You can very well define the method here, instead of having the same implementation everywhere. (overlooking the overrides - the implementations look same.

/* For nested aggregations, we require the RemapGlobalOrdsStarTree strategy to properly
handle global ordinal remapping. This check ensures we don't reinitialize the
collectionStrategy again if it's already correctly set. */
if (parent != null && !(collectionStrategy instanceof RemapGlobalOrdsStarTree)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had fixed redundant initialization of globalOrds as part of this change #18059 (pending merge)

Make sure that the changes don't break.

),
context
)
parent == null
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Return null early in the method itself when parent != null

Basically we don't want to run the follwoing steps in case parent is not null - since the parent will be executing all this:

StarTreeValues starTreeValues = StarTreeQueryHelper.getStarTreeValues(ctx, starTree);
SortedNumericStarTreeValuesIterator valuesIterator = (SortedNumericStarTreeValuesIterator) starTreeValues
            .getDimensionValuesIterator(fieldName);
SortedNumericStarTreeValuesIterator docCountsIterator = StarTreeQueryHelper.getDocCountsIterator(starTreeValues, starTree);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Return null early in the method itself when parent != null

This would not work I believe, if parent != null, we would still need to return an object of StarTreeBucketCollector as we have to call collectStarTreeEntry() for the subcollectors.

for (StarTreeBucketCollector subCollector : collector.getSubCollectors()) {
           subCollector.collectStarTreeEntry(entryBit, bucketOrd);
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohh I missed that. No worries.

}
}

if (!isValid) return false;
Copy link
Contributor

@sandeshkr419 sandeshkr419 Apr 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Just to maintain boolean check consistency across project: isValid == false

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure.

@sandeshkr419
Copy link
Contributor

Thanks @Shailesh-Kumar-Singh for working on this.

Code wise, I only have a few refactoring comments on the implementation.

I have some concerns regarding the supported cases:

What is the break-point of nested aggregators to use or not use star-tree, like upto how many nested levels?
So for too many nested levels, I'm concerned that the depth_first mode might be utilized (I'm not sure, you need to verify it) - so let's add depth first test cases if required.
In my PRs earlier, I had skipped using star-tree for depth-first cases since they were out of scope of the query shapes which were supported then.
For high cardinality keyword terms or timestamp fields (date-histograms) with nested setup, it might be beneficial to run some benchmarks if using star-tree is beneficial or not - else let's think of some gate-keeping based on the number of nested levels. Basically we don't want the star-tree to run for non-performant cases.

Copy link
Contributor

❌ Gradle check result for a58027f: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Enhancement or improvement to existing feature or request Search:Aggregations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Star Tree] [Search] Nested Aggregations
3 participants