
Add provenance output support to execute() response #768


Open

hapix wants to merge 1 commit into master

Conversation

@hapix commented May 5, 2025

Summary

This PR adds support for returning workflow provenance data as part of the execute() result in the OpenEO Python client. It complements the provenance generation added in openeo-pg-parser-networkx via the yProv4WFS library.

Key Changes

  • Added optional provenance output to execute() result structure

Dependencies

This PR depends on the provenance functionality in openeo-pg-parser-networkx.
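
Example usage (illustrative)

A minimal sketch of how the new flag would be used, assuming this PR and the provenance support in openeo-pg-parser-networkx are installed; the data folder, collection id and output path below are hypothetical, and the process-graph argument follows the existing LocalConnection.execute() API:

from openeo.local import LocalConnection

local_conn = LocalConnection("./sample_data")  # hypothetical local data folder
cube = local_conn.load_collection("example_local_collection")  # hypothetical collection id

# With return_provenance=True, execute() returns a (result, workflow) tuple
# instead of a single DataArray; workflow is the yProv4WFS workflow object.
result, workflow = local_conn.execute(cube.to_json(), return_provenance=True)
workflow.prov_to_json(directory_path="./provenance")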

@@ -2,6 +2,8 @@
import logging
from pathlib import Path
from typing import Callable, Dict, List, Optional, Union
import os
Member

any reason this is added?


# Return the result and get the workflow provenance (yprov4wfs)
result = pg.to_callable(PROCESS_REGISTRY)()
workflow = pg.workflow
Member

As far as I understand, this depends on a new feature of openeo-pg-parser-networkx, so the minimum version of that dependency has to be bumped here:

localprocessing_require = [
    "rioxarray>=0.13.0",
    "pyproj",
    "openeo_pg_parser_networkx>=2023.5.1",
    "openeo_processes_dask[implementations]>=2023.7.1",
]

workflow = pg.workflow

# To save the provenance file to a specific path, use:
# workflow.prov_to_json(directory_path=save_path)
Member

I don't think it's useful to have this as a comment here. If this is for users, it should be in the docblock.

# workflow.prov_to_json(directory_path=save_path)

if return_provenance:
    return result, workflow
Member

This custom return value should be documented in the docblock and in the return type annotation.
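
A sketch of what that documentation could look like (the parameter names mirror this PR's diff; the docstring wording, the return types and the "Workflow" type name are illustrative, not part of the actual change):

from typing import Optional, Tuple, Union

import xarray as xr

def execute(
    self,
    process_graph,
    *,
    validate: Optional[bool] = None,
    auto_decode: bool = True,
    return_provenance: bool = False,
) -> Union[xr.DataArray, Tuple[xr.DataArray, "Workflow"]]:
    """
    Execute a process graph locally.

    :param return_provenance: if True, also return the yProv4WFS workflow
        object, so the result is a ``(result, workflow)`` tuple instead of a
        single DataArray. Use ``workflow.prov_to_json(directory_path=...)``
        to write the provenance file to a specific path.
    :return: the computed result, or a ``(result, workflow)`` tuple when
        ``return_provenance`` is True.
    """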

Member

That being said, I'm not a big fan of the pattern of returning different data structures (a tuple of two things instead of a single DataArray) depending on input arguments.
Especially because in normal usage of the openEO Python client, the execute() method of connection objects (LocalConnection here) is usually not called directly by users, but indirectly through DataCube.execute() or something equivalent. So changing the input and output API of Connection.execute() is going to create problems.

@@ -270,6 +272,7 @@ def execute(
    *,
    validate: Optional[bool] = None,
    auto_decode: bool = True,
    return_provenance: bool = False,
Member

You are adding a custom argument here to a fairly public API, Connection.execute(), which is fine as long as you call LocalConnection.execute() yourself. But in general this method will be called automatically without this argument (because it is not in the official API), e.g.:

cube = local_connection.load_collection(...)
res = cube.execute()

The latter execute() is a method defined on DataCube and does not support return_provenance, let alone pass it on properly to LocalConnection.execute().
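
A simplified illustration of the problem the reviewer describes (not the actual client code): DataCube.execute() forwards only the keyword arguments it knows about, so a flag added only to LocalConnection.execute() is never set when going through the usual cube.execute() path:

class DataCube:
    def __init__(self, connection, flat_graph):
        self._connection = connection
        self._flat_graph = flat_graph

    def execute(self, *, validate=None, auto_decode=True):
        # No return_provenance here: LocalConnection.execute() always runs
        # with its default (False) when called through this method.
        return self._connection.execute(
            self._flat_graph, validate=validate, auto_decode=auto_decode
        )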
