Skip to content

POC #131168

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 33 commits into
base: main
Choose a base branch
from
Draft

POC #131168

wants to merge 33 commits into from

Conversation

n1v0lg
Copy link
Contributor

@n1v0lg n1v0lg commented Jul 13, 2025

No description provided.

@n1v0lg n1v0lg self-assigned this Jul 13, 2025
@elasticsearchmachine elasticsearchmachine added v9.2.0 serverless-linked Added by automation, don't add manually labels Jul 13, 2025

boolean checkRemote(List<RemoteClusterService.RemoteTag> tags);

record RewrittenIndexExpression(String original, List<String> rewritten) {}
Copy link
Contributor

@quux00 quux00 Jul 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we also need to check (and track) whether each component of the rewritten list is flat or qualified? Is that left for a later impl or do you think we don't need to track that here?

@@ -97,6 +98,33 @@ public final class RemoteClusterService extends RemoteClusterAware
(ns, key) -> boolSetting(key, true, new RemoteConnectionEnabled<>(ns, key), Setting.Property.Dynamic, Setting.Property.NodeScope)
);

public record RemoteTag(String key, String value) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The local project also has tags, so my guess is that this doesn't belong in RemoteClusterService and shouldn't be called RemoteTag? Maybe MetadataTag or ProjectTag?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, ProjectTag makes sense though we might also go broader and just call it metadata in case we also want it to cover project alias and such.

if (routingTags != null) {
searchRequest.routingTags(
Arrays.stream(Strings.splitStringByCommaToArray(routingTags)).map(RemoteClusterService.RemoteTag::fromString).toList()
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you give an example of the type/format of tags you are allowing here? And is there an implicit "ANDing" of those tags?

As a first impl that might be useful.

Of course to state the obvious (not a criticism)

  1. it doesn't handle ES|QL if that is embedded in the query
  2. it doesn't support the complex AND, OR, NOT and grouping logic we'll eventually have to implement
    so later we'll need some additional constructs/interfaces around parsing that more complex scenario.

@idegtiarenko idegtiarenko marked this pull request as ready for review July 15, 2025 09:00
@idegtiarenko idegtiarenko requested a review from a team as a code owner July 15, 2025 09:00
@idegtiarenko idegtiarenko marked this pull request as draft July 15, 2025 09:00
@@ -124,6 +132,11 @@ ResolvedIndices resolve(
if (request instanceof IndicesRequest == false) {
throw new IllegalStateException("Request [" + request + "] is not an Indices request, but should be.");
}

if (request instanceof CrossProjectRequest rewritableIndicesRequest) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would also need to go in tryResolveWithoutWildcards to handle non-wildcard expressions

@n1v0lg n1v0lg changed the title poc POC Jul 17, 2025
* Used to track a mapping from original expression (potentially flat-world) to canonicalized CCS expressions.
* e.g. for an original index expression `logs-*`, this would be:
* original=logs-*
* qualified=[(logs-*, _local), (my-remote:logs-*, my-remote)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we're storing the remote name twice here, as I understand? What's the reason for that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh ya, we don't really need to -- I added it to clarify that that info will be available but in a production ready version of this, we wouldn't want to duplicate that info, you're right

/**
* Only called if cross-project rewriting (flat-world, linked project filtering) was applied
*/
void qualified(List<QualifiedExpression> qualifiedExpressions);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we rename this method? This is a mutator, and QualifiedExpression below has qualified() which is a getter, and it creates some congnitive load to try and figure out which one is meant each time. Maybe something like updateIndices or something if we expect it to be used to update indices?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah we for sure want a better name -- I'm also wondering if this should go in the IndicesRequest.Replaceable interface which has an indices(...) method on it

Copy link
Contributor Author

@n1v0lg n1v0lg Jul 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would require a new method like isCrossProjectAware() on the IndicesRequest.Replaceable interface -- not sure yet which approach I prefer

/**
* Can be used to determine if this should be processed in cross-project mode vs. stateful CCS.
*/
boolean crossProjectModeEnabled();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What will be using this method? The javadoc describes what it does but it's unclear to me which parts needs to know that.

@@ -58,6 +64,7 @@ public final class FieldCapabilitiesRequest extends LegacyActionRequest implemen
private QueryBuilder indexFilter;
private Map<String, Object> runtimeFields = Collections.emptyMap();
private Long nowInMillis;
private List<QualifiedExpression> qualifiedExpressions;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How this is meant to be used? We store the result of flat resolution in the indices anyway, so we essentially storing it twice. Not sure I understand why?

import java.util.List;

/**
* Resolves linked and authorized projects, if running in a cross-project environment.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does "resolve" mean in this context?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All naming and Javadoc definitely to be massively improved

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But basically it gets all the projects based on linking and authorization (and environment, i.e., somewhere the returns VOID).

}
}

record ResolvedProjects(List<String> projects) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this class and not just a list and VOID being the empty list?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Empty list would mean "no access to any project"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

List.of("_local") -> just local, CPS enabled
List.of() -> no access (I do wonder how this would happen but it since it happens early on during authz code)
VOID -> this is not even a CPS-enabled environment

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the two last cases different? I mean, if you have no access to any projects, you can't do any CPS things, so why would it matter if it's the CPS environment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because in the case of VOID the search should proceed but in the case of empty but not VOID, it should fail

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is to use VOID as a thing in core to toggle CPS behavior (and failing if a user has no access to any projects -- i.e. empty list -- is itself a CPS behavior)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an interesting question - if the user doesn't have an access to any projects, would we even get this far? Right now _query for example would reject it but we could change that. What exactly minimal permissions are needed for CPS-enabled endpoints? Is anybody allowed in case they have access to some other remotes? Are there any separate permissions for this scenario?
I assumed the user would always have access to local, and this is the case today - you can't run _query without local access, but this is considered a bug, so my assumption was wrong. But I am still wondering if having user that has no access to anything get this far and not being stopped by security layer earlier makes sense?

import java.util.Map;
import java.util.Objects;
import java.util.Set;

public final class FieldCapabilitiesRequest extends LegacyActionRequest implements IndicesRequest.Replaceable, ToXContentObject {
public final class FieldCapabilitiesRequest extends LegacyActionRequest
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need some way to tell field caps not to do the rewrite, because when processing e.g. lookups we probably don't need that. Alternatively, we'd need to change lookup code so it qualifies all the indices.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I think that's true but I want to think through how that actually looks, I don't know if we want just a boolean flag

return indices.toArray(String[]::new);
}

public static List<String> expressions(String indexPattern) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to explore what's the deal with "original" indices and what is actually expected there - real original or with flat syntax resolved.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes definitely

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>non-issue serverless-linked Added by automation, don't add manually v9.2.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants