Skip to content

[Streams] Propagate the ingest pipeline access pattern flag to the ingest document #130488

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

jbaiera
Copy link
Member

@jbaiera jbaiera commented Jul 2, 2025

Adds the ability for an ingest document to capture the pipeline's access pattern flag when executing the pipeline on the document. The document stores the access pattern flag in a stack to allow for holding and restoring the access pattern flag when running nested pipelines.

@jbaiera jbaiera requested review from masseyke and lukewhiting July 2, 2025 19:23
@jbaiera jbaiera added >non-issue :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP v9.2.0 labels Jul 2, 2025
@elasticsearchmachine elasticsearchmachine added the Team:Data Management Meta label for data/management team label Jul 2, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

/**
* @return The access pattern for any currently executing pipelines, or null if no pipelines are in progress for this doc
*/
public IngestPipelineFieldAccessPattern getCurrentAccessPattern() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it intentional that this method isn't used above (and seems to only be used in the test)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not used above yet. I'm currently using it in a couple places in the next set of changes I'm working on.

Copy link
Member

@masseyke masseyke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lukewhiting lukewhiting requested a review from Copilot July 3, 2025 07:56
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces tracking of each pipeline’s access pattern within nested pipeline executions by maintaining a stack on the ingest document. Key changes include:

  • Add accessPatternStack field and related push/pop logic in IngestDocument
  • Update Pipeline class to allow mocking by removing final
  • Introduce a comprehensive nested-access-pattern test in IngestDocumentTests

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
server/src/main/java/org/elasticsearch/ingest/IngestDocument.java Added accessPatternStack, push/pop logic, and getCurrentAccessPattern()
server/src/main/java/org/elasticsearch/ingest/Pipeline.java Removed final modifier from Pipeline for testability
server/src/test/java/org/elasticsearch/ingest/IngestDocumentTests.java Added testNestedAccessPatternPropagation with Mockito captures
Comments suppressed due to low confidence (2)

server/src/main/java/org/elasticsearch/ingest/Pipeline.java:28

  • [nitpick] Removing final from Pipeline makes it extendable, which may not be intended for core pipeline behavior; verify that this aligns with the project's API design.
public class Pipeline {

server/src/test/java/org/elasticsearch/ingest/IngestDocumentTests.java:1273

  • The test uses is(empty()) but is is not statically imported; add import static org.hamcrest.Matchers.is; to avoid a compilation error.
        assertThat(document.getPipelineStack(), is(empty()));

Comment on lines +874 to +879
assert previousAccessPattern == accessPatternStack.peek()
: "Cleared access pattern from nested pipeline and found inconsistent stack state. Expected ["
+ previousAccessPattern
+ "] but found ["
+ accessPatternStack.peek()
+ "]";
Copy link
Preview

Copilot AI Jul 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Relying on a Java assert means this check is skipped unless assertions are enabled; consider throwing an explicit exception or using a proper validation mechanism to ensure consistency in all environments.

Suggested change
assert previousAccessPattern == accessPatternStack.peek()
: "Cleared access pattern from nested pipeline and found inconsistent stack state. Expected ["
+ previousAccessPattern
+ "] but found ["
+ accessPatternStack.peek()
+ "]";
if (previousAccessPattern != accessPatternStack.peek()) {
throw new IllegalStateException(
"Cleared access pattern from nested pipeline and found inconsistent stack state. Expected ["
+ previousAccessPattern
+ "] but found ["
+ accessPatternStack.peek()
+ "]"
);
}

Copilot uses AI. Check for mistakes.

pipeline.execute(this, (result, e) -> {
executedPipelines.remove(pipeline.getId());
accessPatternStack.poll();
Copy link
Preview

Copilot AI Jul 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using poll() ignores the returned element and silently handles an empty stack; consider using pop() or removeFirst() for clearer intent and to fail-fast on unexpected underflow.

Suggested change
accessPatternStack.poll();
accessPatternStack.pop();

Copilot uses AI. Check for mistakes.

* involve the simulate pipeline logic. The simulate pipeline logic needs this information. Rather than making the code more
* complicated, we're just copying this over here since it does no harm.
* The executedPipelines and accessPatternStack fields are clearly execution-centric rather than data centric.
* Despite what the comment above says, we're copying it here anyway. THe reason is that this constructor is only called from
Copy link
Preview

Copilot AI Jul 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a typo in the Javadoc: THe should be The.

Suggested change
* Despite what the comment above says, we're copying it here anyway. THe reason is that this constructor is only called from
* Despite what the comment above says, we're copying it here anyway. The reason is that this constructor is only called from

Copilot uses AI. Check for mistakes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP >non-issue Team:Data Management Meta label for data/management team v9.2.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants