Skip to content

Backward Compatibility: Parallel Branch does not support direct use of states definitions #413

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
yzhao244 opened this issue Jun 21, 2021 · 14 comments
Assignees
Milestone

Comments

@yzhao244
Copy link
Contributor

What is the question:

Hi, I have this issue opened and would like to discuss the spec regarding parallel branch does not support direct use of states definitions when comparing spec 0.1.

If my understanding is correct, currently, the parallel state branch supports actions definition which can use subFlowRef to support a more complex flow such as switch and so on.

However, using subFlowRef in practice requires that end-user has to successfully define a subflow first for getting a workflowId. After obtaining the workflow ID, end-user can start to use the ID to define the parallel state. I think it brings more complexity if end-user only builds a simple parallel workflow which branch only contains a switch, or a operation state and end-user may not need a subFlowRef ID for reusable purpose.

I am sorry to bring back the example of AWS Step function as my reference lol, which allows directly defining states in branch. I feel our specs and theirs have things in common lol..

What would you like to be added:
Is it okay to add back the states array to branch of parallel state definition.

Why is this needed:

  1. It ensures backward compatibility.
  2. It is straightforward to build workflow which allows end-user to directly define whatever the states in branch
@tsurdilo
Copy link
Contributor

I can definitely understand your point of view on this.
The changes from v0.1 for this were:

  1. We first removed the ability to use states in branches and allowed for either actions or subflow state only.
    The reason for this was that you can build very complex nested structures like that, for example you can nest parallel states inside parallel states and define layers of error handling...using SubFlow states at that point seems as a reasonable alternative as runtime can look up the control flow logic via id and also you can make it reusable across multiple branches
  2. We then removed SubFlow State and made it into an action...that to me is something we should look at again honestly.
    I think that making it into an action did bring some benefits but it introduced some others (like forced looping via control flow logic and not the user friendly "repeat" we had on subflow states).

Can you show me an example of ASL where you find the use of states inside branches useful ?

@yzhao244
Copy link
Contributor Author

yzhao244 commented Jun 28, 2021

The following example is from Amazon step function where end-user can directly define states in branch. It is a very simple parallel state use case which has no nested structures and end-user does not require to get a subflow id when building a simple workflow... However, building the same workflow in current spec, it requires end-user to build subflow first and get an ID. After obtaining a subflow Id, end-user can start to define parallel state.

image
The parallel state should look like this

"MyParallelState": {
  "Type": "Parallel",
  "InputPath": "$",
  "OutputPath": "$",
  "ResultPath": "$.ParallelResultPath",
  "Next": "SetCartCompleteStatusState",
  "Branches": [
    {
      "StartAt": "UpdateMonthlyUsageState",
      "States": {
        "UpdateMonthlyUsageState": {
          "Type": "Task",
          "InputPath": "$",
          "OutputPath": "$",
          "ResultPath": "$.UpdateMonthlyUsageResultPath",
          "Resource": "LambdaARN",
          "Retry": [ {
            "ErrorEquals": ["HandledError"],
            "IntervalSeconds": 1,
            "MaxAttempts": 2,
            "BackoffRate": 2.0
         } 
          "End": true
        }
      }
    },
    {
      "StartAt": "QueueTaxInvoiceState",
      "States": {
        "QueueTaxInvoiceState": {
          "Type": "Task",
          "InputPath": "$",
          "OutputPath": "$",
          "ResultPath": "$.QueueTaxInvoiceResultPath",
          "Resource": "LambdaARN",
          "Retry": [ {
            "ErrorEquals": ["HandledError"],
            "IntervalSeconds": 1,
            "MaxAttempts": 2,
            "BackoffRate": 2.0
         } 
          "End": true
        }
      }
    }

@tsurdilo
Copy link
Contributor

tsurdilo commented Jun 28, 2021

@yzhao244 thanks for the example!
May I offer a counter-point :)
If you look UpdateMonthlyUsageState and QueueTaxInvoiceState they are just actions, meaning they just invoke a service.
For this example using our DSL actions makes more sense.
This is the exact same as if you had in each branch an operation state with a single action :) so using actions in this case (and not dummy operation states) not only seems more intuitive, but also reduces the number of lines of json/yaml you have to write.

I was looking for an example where there is actual control-flow logic inside each branches. Off of your example I would actually argue that what we are doing atm is indeed better :)

@cdavernas
Copy link
Member

cdavernas commented Jul 3, 2021

@tsurdilo though I mostly agree with your last comment, I think @yzhao244 has a point, too. What if I want to execute two switch cases in parallel? Let's imagine you have a package, and you want to publish it in parallel in both staging and prod. The publishing processes would switch on the package type to determine on which app you actually want to publish it. AFAIK, this cannot be done now without subflows or complex jq expressions.

Now, I've had a similar concern with my a client, and we first ended using very complex jq expressions to do so, until we realized it was extremely hard to read and understand with no advanced knowledge of jq and ended up using a sublow, for a single condition check, basically.

In my opinion, we should think of a way to do so. Probably not using sub-states, as it looks confusing and goes against the actual state concept, but maybe by adding, for example, a condition property on actions.

@yzhao244 @tsurdilo WDYT?

@tsurdilo
Copy link
Contributor

tsurdilo commented Jul 3, 2021 via email

@yzhao244
Copy link
Contributor Author

@tsurdilo @cdavernas Thanks for the replies, guys. :) . Actually, the above AWS DSL example allows UI Console to easily render a graphical model in real-time since it allows end-user directly defines control logic which I believe Google Workflow, Ali Serverless Workflow both allow explicit control flow as well for the purpose of easily rendering graphical model.

In our spec, since using subFlowRef in practice requires a workflowId, it requires that end-user has to create subFlow first and then gets a workflowId. Finally, end-user has to put that workflowId in the branches. My personal feeling is a bit complex process and makes UI render graphical model a bit more complicated comparing to allowing explicit control flow.

@tsurdilo I looked at your counter-point... it is a very good point of using actions lol.. However, I updated my example with adding retries which our actions do not support retry. :) .. Actually, I totally agree that for better maintenance for large flows, definitely, using subworkflow ID is way better than explicit control flow logic. However, allowing explicit control flow logic in branches shouldn't be a problem because it sounds like a common practice that end-user would want to do. :)

Is it possible to support both explicit control flow in branches and subworkflowId which end-user can choose for better maintenance for large flows.

@yzhao244
Copy link
Contributor Author

yzhao244 commented Aug 4, 2021

@tsurdilo @cdavernas For better supporting explicit control flow, I am thinking to set a limit on number of nested layers if end-user chose to define explicit control flow in branches so that our spec can ensure backward compatibility of supporting explicit flow. The following is a suggested example spec of Branch definition for supporting both actions and states definitions. :)

Parallel State: Branch

Parameter Description Type Required
name Branch name string yes
actions Actions to be executed in this branch array no
maxLayers Limits to max number of nested layers integer no
states States to be executed in this branch array no
timeouts Branch specific timeout settings object no

@tsurdilo
Copy link
Contributor

tsurdilo commented Aug 4, 2021

@yzhao244

I updated my example with adding retries which our actions do not support retry. :)

they do now (will be hopefully added to 0.7 release :) ) - #435

For better supporting explicit control flow, I am thinking to set a limit on number of nested layers

Overall I do think that supporting both is in the end what we want to do. The way I'd like to do it however is a little different for 3 reasons:

  1. states inside a branch have to conform to the "workflow control flow logic" meaning they have to have a starting state and define one or more end definitions.
  2. states inside the branches should not be able to be transitioned to from a) other branches in parallel state b) other states in the main workflow control flow logic.
  3. states inside a branch should not be able to transition to 1) other branch states b) other core workflow states

I would like to create a "grouping" of states, with preferably a lable/id attached to them. From the parallel state branch then you can then simply reference this group of states instead of having them hard-coded.
States in the core workflow definition would all be in the "main" group. And for tooling then it would also make it easier as then validation can check if a state in the "main" group tries to transition to a state which is not in that same group...and raise validation error(s) and vice versa.

WDYT?

@tsurdilo tsurdilo added this to the v0.8 milestone Aug 18, 2021
@yzhao244
Copy link
Contributor Author

@qjl1988

@yzhao244
Copy link
Contributor Author

@tsurdilo I read the ActionRetries in 0.7 release. Good Work :) ..
"grouping" of state sounds an interesting idea. However, since parallel branch state still needs to reference the group, then how is it different from referencing to a subworkflow Id in current spec. Maybe it is just me lol having difficulty of visualizing grouping of states or maybe I misunderstood something here. lol

@tsurdilo
Copy link
Contributor

@yzhao244 yeah, we can/should definitely talk about it to figure it out. Let's plan on that.

@yzhao244
Copy link
Contributor Author

yzhao244 commented Sep 2, 2021

@tsurdilo created this one on discussion board :) . #466

@github-actions
Copy link

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@cdavernas
Copy link
Member

cdavernas commented May 17, 2024

Closed as resolved by 1.0.0-alpha1, and therefore as part of #843

In 1.0.0-alpha1, the concepts of states and actions are replaced by tasks, which can define subtasks.

You can therefore do the something like:

document:
  dsl: 0.10
  namespace: default
  name: amazon-parallel-workflow-equivalency
do:
  myParallelState:
    execute:
      concurrently:
        updateMonthlyUsage: { ... }
        queueTask:
          execute:
            sequentially:
              invoice: { ... }
              email: { ... }

@github-project-automation github-project-automation bot moved this from Todo to Done in Progress Tracker May 17, 2024
This was referenced May 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

No branches or pull requests

3 participants