UDP support in a federation context? #176

Open
soxofaan opened this issue Mar 6, 2025 · 4 comments
Labels: API, architecture, federation, design, help wanted, question, technical debt

soxofaan commented Mar 6, 2025

(This ticket is a manual copy of https://github.com/openEOPlatform/architecture-docs/issues/368 from Sept 2023, which is too hidden in a private repo)

This is a follow-up ticket for #90, where we added initial UDP support in the aggregator: all UDP requests are simply forwarded to VITO. This means that UDPs can currently only be used in process graphs that are executed on the VITO back-end, because the other back-ends do not know how to resolve the UDP references.

The question now is how to take the next step and support UDPs in a truly federated manner. This is not trivial, as already hinted in #90 (comment).

This probably needs some discussion and maybe some additional specifications/conventions to get it working.

soxofaan commented Mar 6, 2025

To set the stage and initiate discussion, some possible next steps and approaches:

  • Full "multicasting" of all UDP CRUD operations to all upstream back-ends. This is probably the most logical option, but it is not trivial: the aggregator would need a lot of read/write synchronization logic to manage the distributed UDP storage. For example:
    • How to handle failures and inconsistencies (e.g. after downtime of a certain upstream back-end)?
    • How to verify consistency across multiple UDP storage sources? And, relatedly, how to determine the source of truth?
    • How to handle conflicts with existing UDPs on a back-end?
  • Store UDPs in a central place (a microservice that allows URL-based references to UDPs) and replace name-based UDP references with URL-based ones when passing a process graph upstream. This eliminates the synchronization challenges but requires all upstream back-ends to support URL-based UDP references, which isn't even standardized. (FYI: in the VITO back-end we have an implementation for this.)
  • Store UDPs in a central place (a microservice) and have the aggregator replace UDP "calls" with an "inlined" version of the UDP before passing the process graph to the upstream back-end. This approach is easy for the upstream back-ends, as they never see the UDPs. However, it is not clear whether "inlining" UDPs in a process graph is feasible and what pitfalls it brings.
  • Extend the process graph spec to allow appending UDPs to the "main" process graph, so that UDP references are fully internal to the process graph body. This would be an interesting approach even outside the federation, but it requires a non-trivial extension of the openEO API spec.
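
To make the second option a bit more concrete, here is a rough sketch of how an aggregator could rewrite name-based UDP references into URL-based ones before sending a process graph upstream. It is only an illustration under assumptions: the function name `rewrite_udp_references`, the central-store URL layout (`base_url` plus the UDP name) and the choice to put the URL in the node's `namespace` field are all hypothetical here, not an agreed convention or the actual aggregator code.

```python
from typing import Any, Dict, Set


def rewrite_udp_references(
    process_graph: Dict[str, dict], udp_names: Set[str], base_url: str
) -> Dict[str, dict]:
    """Return a copy of a flat process graph in which nodes that call a known
    UDP by name point to a URL in a (hypothetical) central UDP store instead."""

    def rewrite_value(value: Any) -> Any:
        # Walk the arguments recursively so that UDP calls inside
        # child process graphs (callbacks) are rewritten as well.
        if isinstance(value, dict):
            if isinstance(value.get("process_graph"), dict):
                rewritten = {k: rewrite_value(v) for k, v in value.items() if k != "process_graph"}
                rewritten["process_graph"] = rewrite_graph(value["process_graph"])
                return rewritten
            return {k: rewrite_value(v) for k, v in value.items()}
        if isinstance(value, list):
            return [rewrite_value(v) for v in value]
        return value

    def rewrite_graph(graph: Dict[str, dict]) -> Dict[str, dict]:
        result = {}
        for node_id, node in graph.items():
            new_node = dict(node)
            new_node["arguments"] = rewrite_value(node.get("arguments", {}))
            # Only touch nodes that could resolve to a user-defined process:
            # no namespace given, or the explicit "user" namespace.
            if node.get("process_id") in udp_names and node.get("namespace") in (None, "user"):
                # Assumption: the URL-based reference lives in "namespace".
                new_node["namespace"] = f"{base_url}/{node['process_id']}"
            result[node_id] = new_node
        return result

    return rewrite_graph(process_graph)
```

The input graph is left untouched, so the aggregator could still keep the original (name-based) version around, for example for back-ends that are handled differently.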

jdries commented Mar 10, 2025

Note that the solution where UDPs are referenced by URL is probably the way forward, given that this is what we also use in ESA APEx. If it's not yet standardized, we'll want to look into that.
What about using the 'validate' call to check whether a process graph with UDPs can be executed on a given back-end?
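
A minimal sketch of that validation idea, assuming each back-end reports unresolved process references as validation errors. The helper name `backends_accepting` and the injected `validators` mapping are hypothetical; in a real setup each validator would send the process graph to that back-end's validation endpoint and return the reported errors.

```python
from typing import Callable, Dict, List

# A validator takes a process graph and returns the list of validation
# errors a back-end reports for it (an empty list means "looks executable").
Validator = Callable[[dict], List[dict]]


def backends_accepting(process_graph: dict, validators: Dict[str, Validator]) -> List[str]:
    """Return the ids of the back-ends whose validation reports no errors."""
    accepted = []
    for backend_id, validate in validators.items():
        try:
            errors = validate(process_graph)
        except Exception:
            # Treat an unreachable or failing back-end as "cannot execute".
            continue
        if not errors:
            accepted.append(backend_id)
    return accepted
```

With this, the aggregator could restrict the candidate back-ends for a job to the ones that accept the process graph, at the cost of one extra validation request per back-end.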

soxofaan commented

> Note that the solution where UDPs are referenced by URL is probably the way forward

While I also prefer this "remote process definition" approach, I'm not talking about that kind of UDP here, but about openEO's CRUD sub-API to store/get/... UDPs (under /process_graphs).

Or are you suggesting to go for the solution:
