-
Notifications
You must be signed in to change notification settings - Fork 36
Description
Enzo-E's foundations and Cello's overall architecture are fantastic! It does a lot of things extremely well, has a very modular design and provides a lot of conveniences! (everything @jobordner has done is truly fantastic!) I've only come to appreciate the overall design more with time (especially now that I've spent some time working with other codes).
Enzo-E's Method
objects are arguably one of its most defining features. They have a very logical structure (initialization happens in the constructor, Block-data is updated by compute
, timestep any requirements are computed/reported by timestep
) that is extremely approachable (and you generally know where to look). It provides extremely useful conveniences (like the refresh-machinery). Plus it's easy to switch the machinery on and off.
In my opinion, the one area that needs improvement is Method-initialization.
About Method Initialization
Currently, there is a one-to-one mapping between the Method
objects initialized by Enzo-E and the contents of the parameter files. In more detail, the precise Method
classes as well as their explicit execution order are specified in the parameter-file. The logic that actually initializes the methods is performed by the Problem
class.
Before I go further, let me emphasize that this is a very sensible choice (I probably would have done the same thing)! Plus, it has definitely served us well! Some of the benefits from this approach include:
-
What you see (in the parameter file) is what you get in the simulation. There are basically no surprises about the order of operations.
For example, imagine
Method:list
is assigned["order_morton", "balance", "mhd_vlct", "grackle"]
. The control flow executed each cycle looks like
-------------- --------- ---------- ---------
| | | | | | | |
| order_morton | -> | balance | -> | mhd_vlct | -> | grackle |
| | | | | | | |
-------------- --------- ---------- ---------
- You are able to freely adjust the order of operations at runtime. This is extremely useful for debugging. It's very easy to insert a method to show/save state at any point.
- The logic in the codebase for initializing methods is extremely straight-forward.
- This initialization-logic is consistent with the initialization logic for other entities in the codebase. (And we happen)
Importantly, the challenges with this approach are not easy to foresee, and only really become problematic "at scale" (i.e. as more Method
classes are introduced and more people designing the Method
classes).
Challenges with the current approach
In recent years, the following factors/challenges have become apparent:
- A number of different actions/physics need to be implemented by separate a sequence of
Method
instances. When you have a long list of methods (8? 10?) it is easy to forget about them. - Relatedly, there are all sorts of ordering requirements and conventions for different methods
"order_morton"
(or"order_hilbert"
) should come before"balance"
"balance"
can't occur at the end of a cycle. It should generally come near the start"order_morton"
(or"order_hilbert"
) should come before"check"
- soon-to-be-added
"frame_transform"
should be the last thing "gravity"
related:- I think it should come after
"pm_deposit"
but before"pm_update"
- should come before your hydro method. But I think
"pm_update"
comes after - I think it should come before
"background_acceleration"
- are there any ordering requirements related to
"comoving_expansion"
? Does it come after"pm_update"
- when should particle-creation methods happen? (after
"pm_update"
?)
- I think it should come after
"m1_closure"
should come after hydro, but before"grackle"
- When should
"feedback"
occur in relation to"star_maker"
, hydro,"m1_closure"
- When should
"accretion"
occur in relation to"sink_maker"
, hydro? "flux_correct"
must follow hydro
- The system was designed with the expectation that we would know all of the fields before initializing the first
Method
instance. While the system was modified a few years back to let you specify the required fields in constructors, the modified system doesn't work particularly well.- It presents a challenge for refresh lists. Since methods like
"grackle"
usually get initialized after the hydro-solver, the hydro-solver can't know all of the required passive-scalar fields that must be refreshed - A similar issue arises for the flux-correction (see issue #?)
- A method might like to preallocate scratch-space in the constructor, but may need to wait to lazily initialize this space until the first
compute
ortimestep
invocation (from a code-design perspective, that's kinda ugly. In the abs, we want to encourage people not to mutate state after initialization) - The
"output"
method would also like to know all of the fields when it is first initialized to report errors early about desired output fields (rather than the first time it writes an output) - We could also facilitate some useful optimizations if we knew the fields when we initialize the
"physics"
objects
- It presents a challenge for refresh lists. Since methods like
- Ensuring proper scheduling of related methods
- ordinary scheduling (like
"order_morton"
and"balance"
) - subcycling sets of methods together (like
"m1_closure"
and"grackle"
)
- ordinary scheduling (like
- We want to inject certain functionality that only needs to be injected in certain cases. If using dual-energy formalism, should synchronize energy after a
"flux_correct"
. To implement divergence cleaning, we would also need to inject a step after a"flux_correct"
. - The system doesn't work well with higher-order time integration. It essentially assumes that all methods as a whole use forward-euler time integration. Individual methods are free to implement higher-order time integrate (e.g. see
mhd_vlct
), but that doesn't map well across methods. We could accomplish this Involves lots of extra steps - Swapping out equivalent methods (e.g.
ppm
formhd_vlct
... ) - The need for coordinated/consistent parameters for different methods. We have worked around this with
Physics
objects, but may not always be best solution. - Methods are not very composable.
- There have been times where it would be great to call one method inside another (note: this is probably not necessary if other issues are addressed).
- It would also be nice to group various source-term methods together
Importantly, most of the burden for points 1 and 2 falls on the end-user. And it's really easy to screw this up without getting any indication from Enzo-E. As the codebase grows, it will be impossible to comprehensively check for this!
- you also may need to inspect some internal-state of how a neighboring method is configured
- (can't comprehensively check in constructors since subsequent methods not initialized yet).
- Plus, you the list of explicit checks will just continue to grow with time
How do we address this?
Honestly, I don't have all of the answers (I'm making this issue to solicit feedback). But, I have some ideas for the first few steps.
As a first step, I propose that we decouple Method initialization from the Problem class:
-
make
Problem
store a callback function, provided by the enzo-layer, maybe with the signature. (we may be able to avoid passingConfig&
)typedef void application_setup_callback(Problem&, Config&);
-
we define a function with this signature in the Enzo-Layer (maybe called
enzoe_setup_callback
).- The current contents of
Problem::initialize_method
would be duplicated here - we would also extract the current contents of
EnzoProblem::create_method_
(probably putting it in a helper function) - As the function initialized Method objects, it would register them with the
Problem
- The current contents of
-
we would register this callback inside
EnzoProblem
-
define a new method of
Problem
, maybe calledProblem::call_application_setup_callback(Config&)
and have Simulation::initialize call this method (instead of callingProblem::initialize_method
). -
Aside: technically, we could make
enzoe_setup_callback
a method ofEnzoProblem
, but I think the separation might be good (since the logic will eventually start deviating from the basic formula thatProblem
's otherinitialize_*
methods conform to)
After that, I have a few ideas (to be tackled in subsequent PRs):
- we would modify the logic in
enzoe_setup_callback
in order to ensure that methods are initialized in the proper order.- We would eventually disregard the order of methods specified by the
Method:list
parameter and build in logic to initialize the methods in a pre-defined order - we could also add logic to introduce missing methods (like a missing
pm_deposit
or a missingflux_correct
) - I have some thoughts about how to do this, but the detail's are beyond the scope of this issue
- We would eventually disregard the order of methods specified by the
- we would add logic to determine the list of required fields before calling the constructors of any Method instance (again, I have some thoughts on how to do this in a standardized manner)
- we could eventually eliminate
Problem::initialize_physics
and makeenzoe_setup_callback
responsible for this (this would facilitate some EOS-related optimizations and simplifications)
I very much welcome feedback. I have other ideas too.
The goal here is to pursue a solution somewhere between what Athena++ does for initializing tasks and what Enzo-E currently does. (I would prefer something a little more readable than Athena++)
What do we lose?
We lose some clarity about the order of methods?. We can work around this by passing a command line argument to Enzo-E that tells it to just print the methods.
We lose some "flexibility" of arbitrary reordering. In really, the only case where we currently have that kind of flexibility for any production sims is "output"
dumps. In science runs, you probably want to standardize the locations (put @ start of cycle or end). This is only really a loss for scheduling debugging methods at runtime.
- If this is a really big concern, we could probably work around this! If we implement functionality to print the list of methods, (with associated ordering), we could add command-line args to inject these methods based on that list.