-
Notifications
You must be signed in to change notification settings - Fork 25.4k
Fix index lookup when field-caps returns empty mapping #132138
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Hi @smalyshev, I've created a changelog YAML for you. |
Pinging @elastic/es-analytical-engine (Team:Analytics) |
() -> runQuery("FROM logs-*,c*:logs-* | LOOKUP JOIN values_lookup ON v | KEEP v", randomBoolean()) | ||
); | ||
// FIXME: strictly speaking this message is not correct, as the index is available, but the field is not | ||
assertThat(ex.getMessage(), containsString("lookup index [values_lookup] is not available")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this be tricky to fix?
It would be nice to avoid misleading the user
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Somewhat tricky, as when we get empty resolution, we don't really know if that's because the index is missing or the fields are missing. I could try and see if there's a way around it...
@@ -374,6 +374,34 @@ public void testLookupJoinMissingKey() throws IOException { | |||
); | |||
assertThat(ex.getMessage(), containsString("Unknown column [local_tag] in right side of join")); | |||
|
|||
// Add KEEP clause to try and trick the field-caps result parser |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please explain why this is needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because when KEEP
is present, field-caps call only fetches the restricted set of fields, and the code here:
boolean allEmpty = true;
for (FieldCapabilitiesIndexResponse ir : fieldCapsResponse.getIndexResponses()) {
allEmpty &= ir.get().isEmpty();
}
// If all the mappings are empty we return an empty set of resolved indices to line up with QL
var index = new EsIndex(indexPattern, rootFields, allEmpty ? Map.of() : concreteIndices, partiallyUnmappedFields);
in IndexResolver can produce empty mapping. Which hasn't been handled properly by the code because I assumed this could not happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something needs to be fixed, but I don't think it's this teams' responsibility.
The bug fix here is an workaround for this bug #107031.
This specific issue with empty lookup
index is reproduceable when the mapping of that index is {}
.
If the mapping of the lookup index is still "empty", but "empty" is expressed as
{
"mappings": {
"properties" : {
}
}
}
then the bug is not reproduceable. And the output of field_caps
shouldn't be empty no matter what (see the bug report above).
@javanna @piergm we hit again the bug report above from more than an year ago, this time with lookup join
functionality in ES|QL. Is there anything stopping that great PR from @piergm to be merged?
@astefan I am not sure I understand. The code that converts empty mappings into empty resolution is in
converts it to empty resolution. I'm not sure how #107031 is relevant here, could you explain? |
@smalyshev sure. Happy to provide more context. If I create an index with no fields like this: And this something is actually useful to ESQL. I know for sure we use
But if I create an index like
Why is not ES providing |
@astefan but getting nothing from field-caps is not unusual. If I use |
Now I am not sure of what the actual scenario is that this PR is trying to fix. I tried to read between the lines in comments, PR description and tests what is the failing scenario and the way you created the text index matters. For the record, the second test scenario above that I mentioned reproduce the issue this PR is trying to fix, but the first one doesn't with a query like |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I've updated my research and tests with the updated reproducing data from the description of the PR and I believe the fix in this PR is ok.
For completeness sake, please add three more tests to this PR:
- one that uses a
values_lookup
index that is empty created like thisPUT /values_lookup
with request body as{}
- one that uses a
values_lookup
index that is empty created like thisPUT /values_lookup
with request body as{"mappings":{"properties" :{}},"settings":{"index.mode":"lookup"}}
- one that uses a
values_lookup
index that is empty created like thisPUT /values_lookup
with request body as{"mappings":{"properties" :{}}}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, in IndexResolver
since this is likely to be a question for others looking at that code here, add a comment something like
Introduced with #46775
We need to be able to differentiate between an empty mapping index and an empty index due to fields not being found. An empty mapping index will generate no columns (important) for a query likeFROM empty-mapping-index
, whereas an empty result here but for fields that do not exist in the index (but the index has a mapping) will result in "VerificationException Unknown column" errors.
522234a
to
f18b5b7
Compare
@astefan I think I was able to fix the error message discrepancy and also added some empty index tests. Please review again. Thanks! |
@@ -518,6 +523,16 @@ private PreAnalysisResult receiveLookupIndexResolution( | |||
} | |||
return result.addLookupIndexResolution(index, lookupIndexResolution); | |||
} | |||
|
|||
if (lookupIndexResolution.get().indexNameWithModes().isEmpty() && lookupIndexResolution.resolvedIndices().isEmpty() == false) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In which scenario is this possible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the same scenario we're discussing, but instead of local cluster we have remote clusters. Then the branch above will not be executed, but the code will try to validate remote cluster's indices. But since we've got the empty mappings, it would be confused into thinking there's no indices there, and report a wrong error. So this part fixes it and let's the verifier report the correct error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same scenario we're discussing
Can you please, be specific? (scenarios confusion led to long discussions and prolonging this PR merge)
In my mind there are these three scenarios plus the ones where the mapping of the lookup index is not empty but the query uses keep v
. Plus those involving CCS.
Note: Every scenario above should be tested with local only indices (no CCS) as well, btw (can be done in a follow up PR or opened an issue for it to be addressed later).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the scenario where we are calling field-caps to resolve lookup join, with field list set to some restricted list (not "*") and the lookup index does not have any of those fields, so we're receiving the field-caps response with an empty map. Example:
FROM remote:logs-*| LOOKUP JOIN values_lookup ON v | KEEP v
this is the same query as in the issue description, just with the remote cluster, not the local one. The difference exists because local-only and remote scenarios are handled by different code branches, due to skip_unavailable complexities etc., so the initial fix only covered the local-only branch and led to the fact that the remote scenario produced wrong error message - "unknown index" instead of "field missing".
testLookupJoinEmptyIndex
tests the three empty index scenarios you've described (as much as I could reproduce them from transport level test - it doesn't really do JSON REST APIs there) and it tests them in both local and remote scenarios.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Field-caps resolution can return empty resolution if it receives empty mapping. This can happen when using KEEP, for example:
when
values_lookup
does not have the fieldv
. In this case, field-caps will return the empty mapping, since there are no fields that match the requested field set, and this code in IndexResolver will convert it to empty resolution:which breaks the response processing code for lookup indices.
Closes #132105