A clean issue for a problem with a long and convoluted history (e.g. see #4903, #4912, #5036, and most "recently" #5090).
background
Cylc famously has no barrier between cycles. This is usually a very good thing, but there are certain situations where a hard barrier would be convenient because it stands in for ALL dependence on tasks prior to the barrier.
I'll illustrate for a simple workflow with:
- startup tasks that everything else depends on (e.g. to deploy stuff), and
- shutdown tasks that should not run until everything else has finished
how we currently achieve it (1/2)
⬆️ The blue bits represent perpetual dependence on the initial tasks, to ensure that nothing runs before they finish. This is reasonably intuitive and reflects real dependence - not a workaround - but it causes problems:
- it makes an unbelievable mess of graph visualizations for real workflows
- it makes retriggering the startup graph difficult or confusing (will it "flow on" again?)
- it can cause performance issues in the cycling computations, far from the ICP (is this still true?)
The red bits represent dependence on special dummy tasks that exist purely to ensure that final-cycle tasks wait on everything else - i.e. it's a workaround.
- it is not possible to do the blue thing for the final cycle - that would require `foo => bar[$]`
- (however, the shutdown task requirement is much less common than startup)
how we currently achieve it (2/2)
⬆️ Pragmatically, using the workaround (red) at both ends is probably better than the proper solution (blue), and the skip task run mode makes it more attractive than it used to be, but:
- it can be unpleasant for some workflows (e.g. with lots of parentless tasks at the top of a cycle)
- evidence at ESNZ shows users don't naturally think of it (all of our workflows have the blue bits)
- (and it is still a workaround that should not be necessary)
a better solution
⬆️ For our simple example, it would be better to have 3 separate graphs that each run to completion before the next one starts, thus absolving us of the need to handle nasty perpetual dependencies.
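To make the barrier idea concrete, here is a toy sketch in Python (not Cylc code). It assumes three hypothetical graphs, each a mapping from task to prerequisites, and drains each one completely before the next starts. The task names and the `run_in_sequence` helper are invented for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: three named graphs, each mapping task -> prerequisites.
# Note there are no cross-graph dependencies anywhere.
GRAPHS = [
    ("startup", {"deploy": set(), "install": {"deploy"}}),
    ("main", {"foo": set(), "bar": {"foo"}, "baz": {"foo"}}),
    ("shutdown", {"archive": set(), "cleanup": {"archive"}}),
]


def run_in_sequence(graphs):
    """Return (graph_name, task) pairs in a valid execution order."""
    order = []
    for name, deps in graphs:
        # Each graph is fully drained before the next one starts, so the
        # barrier stands in for all dependence on earlier graphs.
        for task in TopologicalSorter(deps).static_order():
            order.append((name, task))
    return order


order = run_in_sequence(GRAPHS)
```

In this sketch `foo` never names `deploy` as a prerequisite, yet it cannot run before it, which is exactly what the perpetual (blue) dependencies are used for today.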
High-level considerations
The separate graphs should naturally run in sequence (the whole point of this is to put a barrier between bits of graph in certain situations where that is actually helpful) but we need to be able to re-trigger tasks from earlier graphs, e.g. to redeploy code or data mid-run.
Also, generally we should (see #5090 (comment)):
- support more than 3 separate graphs, and
- support cycling within each graph (and probably different cycling types) - e.g. for model spinup
We need to consider future directions and be compatible with that vision, so far as is possible (see #5090 (comment) and #5090 (comment)).
Do we need to support parallel running of the different graphs, after manual retriggering? - probably NOT:
- if I retrigger tasks from earlier graphs that probably requires pausing or suspending the current graph
- e.g. if I want to redeploy code that many later tasks depend on, I probably should not continue to run those later tasks during the redeployment process
- if not, then it seems a mixed task pool is not necessary
- this removes the gnarly problem of comparing different kinds of cycle point within the task pool
- instead, temporarily swap out the task pool and restore it after the retriggered other-graph tasks have completed?
- have to wait for live tasks to complete first though?
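The swap-and-restore idea could look roughly like the toy model below. `ToyScheduler` and its attributes are hypothetical and bear no relation to Cylc's real task pool. "Running" a task just records it in a log, and the open question about waiting for live tasks is not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class ToyScheduler:
    """Hypothetical stand-in for a scheduler; not Cylc's real task pool."""

    pool: list = field(default_factory=list)     # tasks of the current graph
    stashed: list = field(default_factory=list)  # pools set aside by a retrigger
    log: list = field(default_factory=list)      # tasks "run" so far

    def retrigger_from_other_graph(self, tasks):
        # Set the current pool aside rather than mixing graphs in one pool.
        self.stashed.append(self.pool)
        self.pool = list(tasks)
        # Run the retriggered tasks to completion (here: just record them).
        while self.pool:
            self.log.append(self.pool.pop(0))
        # Restore the original pool once the other-graph tasks have finished.
        self.pool = self.stashed.pop()


sched = ToyScheduler(pool=["5/foo", "5/bar"])
sched.retrigger_from_other_graph(["1/startup_deploy"])
```

Because only one graph's tasks are ever in the pool, cycle points of different kinds never need to be compared with each other.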
implementation ideas and considerations
Isolate the first and last cycles by dynamically manipulating the runahead limit?
- ❌ it works but it restricts multi-cycling early in the main graph, because the isolated cycle is also the first main cycle
- (REJECTED)
Special startup and shutdown cycle points? - #5090
- ✅ works with existing UI, e.g. `cylc trigger //startup/foo`
- ❌ does not support cycling in the startup and shutdown graphs
- 🥹 mixed task pool works fine but requires hacking cycle computations (runahead limit and more) to handle the special points
- (works but probably too restrictive)
UI - how to identify tasks from different graphs (given that special cycle point values are insufficient if we want cycling)?
- the main graph is special, others automatically get a unique task-name prefix?
- e.g. `1/startup_foo`, `2/spinup_bar`
- works with existing UIs
- clearly visible in GUI
- anything else would require new special options like `--graph=spinup` and have a visibility problem (how do I know which graph this task in the GUI belongs to?)
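A minimal sketch of the prefix scheme, assuming an underscore separator and a main graph whose tasks stay un-prefixed. `task_id`, `graph_of` and `MAIN_GRAPH` are hypothetical names for illustration, not part of Cylc.

```python
MAIN_GRAPH = "main"  # hypothetical name for the un-prefixed graph


def task_id(graph, point, name):
    """Build a 'point/name' ID, prefixing the task name for non-main graphs."""
    prefix = "" if graph == MAIN_GRAPH else f"{graph}_"
    return f"{point}/{prefix}{name}"


def graph_of(task_id_str, other_graphs):
    """Recover the owning graph from the task-name prefix."""
    name = task_id_str.split("/", 1)[1]
    for graph in other_graphs:
        if name.startswith(f"{graph}_"):
            return graph
    return MAIN_GRAPH


ids = [task_id("startup", 1, "foo"), task_id("spinup", 2, "bar")]
```

One caveat with this sketch: a main-graph task that happens to be called `startup_x` would be attributed to the wrong graph, so the prefixes would presumably need to be reserved or validated.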