-
Notifications
You must be signed in to change notification settings - Fork 72
Move some parts of btstep() into subroutines #845
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move some parts of btstep() into subroutines #845
Conversation
367e53a
to
847da19
Compare
Thank you for this contribution, @theresa-cordero. There are a number of issues here with the openMP directives and some doxygen comments that need to be addressed, and some of the new routines strike me as not being quite the right granularity. I have put together a branch of MOM6 at https://github.com/Hallberg-NOAA/MOM6/tree/modularize_btstep_rebase in which I am attempting to work through these issues, and which might be the basis for some modifications to this pull request. |
Thanks Bob for the feedback and the changes on you branch. I have decided to merge in your branch and made one more additional change by renaming |
The short description of each routine is: btstep_ubt_from_layer: Calculate the zonal and meridional velocity from the 3-D velocity. btstep_find_Cor: Find the Coriolis force terms _zon and _mer. btloop_setup: Set ubt, vbt, eta, find face areas, and set valid array size at the beginning of each barotropic loop. btloop_setup_eta: A routine to set eta_pred and the running time integral of uhbt and vhbt. btloop_setup_OBCs: Setup old or previous ubt, vbt, vbt, and vhbt for different OBC options before the next step in the btstep loop. btloop_update_v: Update meridional velocity. btloop_update_u: Update zonal velocity. btstep_layer_accel: Calculate the zonal and meridional acceleration of each layer due to the barotropic calculation.
This commit builds upon the recent refactoring of MOM_barotropic.F90 by correcting the openMP directives, dealing with use_old_coriolis_bracket_bug via an optional argument to to btloop_update_v, simplifying what is done in btstep_ubt_from_layer, restoring parentheses to btstep_layer_accel that are needed for rotational symmetry with FMAs enabled and adding doxygen comments describing several arguments. All answers are bitwise identical.
Replaced btloop_setup with truncate_velocities and moved the halo updates that had been in btloop_setup out of this routine, making it more functionally targeted. Also replaced the index range arguments to btloop_update_u and btloop_update_v to simply pass the ranges that are used rather than having to reconstruct them inside these routines. Doxygen comments were added to describe several arguments to the newly added routines, and several comments were also added describing what the various calls within btstep do. All answers are bitwise identical.
Replace btloop_setup_OBCs with store_u_for_OBCS and store_v_for_OBCs for more functional targeting and simpler routines. Also because ubt_old and ubt_prev were copies of the same thing, ubt_old is no longer used in btstep while ubt_prev is set over a slightly larger loop range in store_u_for_OBCs, and similarly for vbt_old and vbt_prev. All answers are bitwise identical.
Added doxygen comments to describe all individual arguments, rather than documenting them in groups, which should correct the previous problems with failing testing. Also avoided using a separate variables wt_accel_n and wt_accel2_n in btstep. All answers are bitwise identical.
Change name of new routine btloop_setup_eta to btloop_eta_predictor.
a535e35
to
0d19340
Compare
This is more of a question than a comment. I really like the idea of using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for these changes refactoring the barotropic solver, which should make the it easier to understand.
In addition to closely inspecting these changes, to verify the correctness of this refactoring, I have re-implemented all of these changes myself (e.g., copying the previous working parts of the code into the new routines) and semi-independently come up with the same results as Theresa did. I am therefore confident that all of these changes are correct.
I can confirm that this refactor has passed all existing regression tests for the CEFI regional configurations. |
I tested this PR in The double_gyre output is a little too simple to trust, but it does not appear to be much overhead; the function is maybe 1-2% extra overhead than leaving the loops in-place. |
In this case, I do not think that the added subroutines are likely to be a significant added computational burden. These routines are within the same file, so they should be good candidates for inlining. In addition, this PR is intended to be a preliminary step towards more extensive modifications that may add new capabilities (e.g., fusing the sea-ice with the ocean barotropic solver) or running on new platforms (e.g,. GPUs). (I would trust Marshall's comment more than mine regarding the inlining; I was writing and hit send before seeing that Marshall had also responded.) |
@marshallward @Hallberg-NOAA Thanks for the testing and explaining the context. This makes sense! |
This PR has passed pipeline testing at https://gitlab.gfdl.noaa.gov/ogrp/mom6ci/MOM6/-/pipelines/26703. |
I should have thought to provide this report:
Compiler won't inline into a function larger than 2000 lines(? or something line-like...). In other words, there is no problem with the new functions, but This 2k line limit is tunable, but we should probably aim to refactor towards a number far away from this limit at some point in the future. |
Restore the parentheses to the expressions for PFv in btloop_update_v so that the model will once again respect rotational symmetry when fused-multiply-adds are enabled. These parentheses were in the corresponding expressions until PR #845 to dev/gfdl when they were inadvertently omitted. This commit changes answers (and restores rotational symmetry) when fused-multiply-adds are enabled.
Restore the parentheses to the expressions for Cor_v in btloop_update_v so that the model will once again respect rotational symmetry when fused-multiply-adds are enabled. These parentheses were in the corresponding expressions until PR #845 to dev/gfdl when they were inadvertently omitted. This commit changes answers (and restores rotational symmetry) when fused-multiply-adds are enabled.
Modularize btstep
(edited to correct the names of the new routines)
This PR moves some parts of
btstep()
into subroutines:btstep_ubt_from_layer
btstep_find_Cor
truncate_velocities
btloop_eta_predictor
store_u_for_OBCs
andstore_v_for_OBCs
btloop_update_v
btloop_update_u
btstep_layer_accel
This eliminates the duplication of the u and v update code in the bt loop, but preserves the
BT_USE_OLD_CORIOLIS_BRACKET_BUG
in the update when v is done second.The creation of these subroutines should not change answers. I have tested it in the Baltic OM4 0.25 case with
BT_USE_OLD_CORIOLIS_BRACKET_BUG
bothTrue
andFalse
.Each routine has many arguments and I have tried to correctly label them
intent(in)
orintent(inout)
as appropriate. Some variables did not have clear comments describing them in this module (ex.use_BT_cont
orjsvf
), so I have not put doxygen comments describing them in this commit. These can be added when I address the review.The short description of each routine is:
btstep_ubt_from_layer
: Calculate the zonal and meridional velocity from the 3-D velocity.btstep_find_Cor
: Find the Coriolis force terms _zon and _mer.truncate_velocities
: Do a CFL-based truncation of any excessively large batotropic velocities.btloop_eta_predictor
: A routine to set eta_pred and the running time integral of uhbt and vhbt.store_[uv]_for_OBCs
: Store the existing velocities and transports for use with open boundary conditions.btloop_update_v
: Update meridional velocity.btloop_update_u
: Update zonal velocity.btstep_layer_accel
: Calculate the zonal and meridional acceleration of each layer due to the barotropic calculation.