-
Notifications
You must be signed in to change notification settings - Fork 262
GFDL to main (2025-07-21) #1668
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
* Fix bug in Kd_itides allocation check for VBF itides diagnostic * Set VBF tidal mixing outputs to work with Polzin option in addition to Simmons option --------- Co-authored-by: brandon.reichl <brandon.reichl@noaa.gov>
Added additional logic to avoid segmentation faults from applying Flather open boundary conditions near the opposite side of the computational domain (e.g., applying a Northern Flather boundary condition on the southern edge of the domain). This situation does not arise when open boundaries are only found on the edges of a domain, but in parallel cases with sloped OBCs (such as the ESMG-configs rotated seamount test case) this atypical situation can arise. The key point to the specific fix is to note that when such a case arises, the OBC velocity in the outer halo point does not subsequently impact the solution on that PE either because its effects are masked out by land or OBC points at the neighboring orthogonal velocities, and then they are overwritten by a halo update (perhaps several steps later). All answers are bitwise identical in all cases that have been tested and were working before, and the ESMG/rotated_seamount test case is now running correctly (and giving identical answers) for a variety of PE counts.
Refactored the shelfwave OBC code to correct a number of minor problems. It corrects some dimensionally inconsistent expressions. It adds a rescaling factor to ensure that the solutions are mathematically equivalent for various units used to describe the Cartesian horizontal grid, and adds a fatal error if the grid is not Cartesian. It eliminates several redundant hard-coded dimensional values used to initialize the control structure. It eliminates elements from the control structure for this type that are not reused after initialization. Finally, it adds the new runtime parameter SHELFWAVE_CORRECT_AMPLITUDE that causes the inflow velocity amplitude to actually match the value specified in SHELFWAVE_AMPLITUDE. When SHELFWAVE_CORRECT_AMPLITUDE is true, it alters the default value of SHELFWAVE_AMPLITUDE to give roughly similar amplitudes of the flow to those with the default value when it is false. By default solutions are unchanged, but there is a new runtime parameter.
Keep 5 recently added area integrated surface mass or heat flux diagnostics in forcing_diagnostics in rescaled units by replacing an unscale argument with a tmp_scale argument to global_area_integral. Also added the corresponding conversion arguments to the register_scalar_field calls for each of these diagnostics, so that the documented units and conversion factors can be used to confirm the correctness of the documented units for these variables. The appropriate internal temporary variables for the integrated heat or mass fluxes were also used for clarity about the units of these variables. Benign `conversion=1.0` arguments were added to another 8 register_diag_feild calls for nondimensional forcing variables. All answers and diagnostics are bitwise identical.
Corrected the descriptions of the units of two I_rho variables in find_ustar_fluxes() and find_ustar_mech_forcing() and of the lensponge internal variable in RGC_initialize_sponges(). Also corrected the comments describing the locations of grid%geolatT and grid%geolonT. Also refactored dumbbell_initialize_thickness() to use geoLonT or geoLatT directly rather than setting a temporary variable with ill-defined units that is only used once. Only changes in comments or very simple code logic changes are included in this commit and all answers are bitwise identical.
Replaced 14 scale arguments in recently added checksum calls with the preferred unscale argument, following the pattern used elsewhere throughout the MOM6 code. These two arguments are equivalent, but unscale makes it easier to identify implausible combinations of scaling factors. All answers are bitwise identical.
* Try again at fix_obc_maskingdepth patch - "git rebase" made a conflicted mess * Fix for even number of OBC turns. - not that turns other than 0, 1 is supported elsewhere for OBCs. This still has issue #5 from a comment in #752: Some experimentation with the rotate_OBC subroutines. Dr Neumann's test uses boundary data of the value=const type. Copying the buffer_dst from OBC_in to CS%OBC gets some of these across, also the tracer reservoir values. However, the tracer reservoir values get overwritten by an interior tracer value between the first call to step_MOM_dynamics and the second. * Trying to fix internal OBC rotations, not sure * Fixing next round of rotated OBC issues * Fix up some logic for turns = 2 or 3. - Note that this is still not supported, as specificed in MOM.F90. * Adding back in the deallocate on some OBC arrays
…p/salt - Change fatal error to warning with Roquet EOS and linear TFreeze. - Add flexibility for TFreeze and different definition of temp/salt from model - If the model is absolute salinity, this allows one to convert the salinity to practical salinity before calling the freezing point subroutine. - If the model is conservative temperature but the freezing point subroutine returns the potential temperature, this allows one to convert the freezing point to conservative temperature. - The new flags are TFREEZE_S_IS_PRACS = True if TFreeze expects practical salinity (default is false for TEOS10 or TEOS_POLY TFREEZE_FORM, otherwise it is true) and TFREEZE_T_IS_POTT = True if TFreeze returns a potential temperature (default is flase for TEOS10 or TEOS_POLY TFREEZE_FORM, otherwise it is true). - The EOS type now stores the use_conT_absS flag to make these checks easier. --------- Co-authored-by: brandon.reichl <brandon.reichl@noaa.gov>
This commit is part 1/2 of enabling user specified compressibility in linear equation of state. An attribute drho_dp is added in extended type linear_EOS and various calculations for density, specific volume and their first order derivatives. But the full compressibility is not yet added in density and specific volume integrals. * Two recently added runtime parameters (TREF and SREF) for linear EOS are renamed to T_REF_LINEAR_EOS and S_REF_LINEAR_EOS, respectively. The old names could be confused with T_REF and S_REF, which are used to initialize the model's temperature and salinity in certain methods. RHO_TREF_SREF is also renamed to RHO_REF_LINEAR_EOS. * Description of parameter RHO_T0_S0 is modified for clarification. * Two new runtime parameters are added (DRHO_DP and P_REF_LINEAR_EOS). * `Compressible` attribute of linear_EOS type is deterimined by whether DRHO_DP is zero or not. (At the moment, the attribute is only used to prevent user from using Montgomery PGF when EOS is compressible.)
This commit is part 2/2 of enabling user specified compressibility in linear equation of state. Terms with DRHO_DP are added in density and specific volume integrals. Answers with DRHO_DP=0 are not changed.
Add an optional logunit argument to the 3 simple chksum() routines, or write the checksums to error_unit if it is absent, following the pattern in all of the other MOM6 checksum routines. All solutions are bitwise identical, but there is a new optional argument to a publicly visible interface.
Added 4 new diagnostics (SSH_u_OBC, SSH_v_OBC, ubt_OBC and vbt_OBC) to diagnose the outer barotropic forcing forcing fields at Flather open boundary condition points. Also added 4 paired checksum calls for 8 OBC-related 2d arrays when OBCs are in use and DEBUG is true. All answers are bitwise identical but there are new diagnostics in the available_diags files.
Rather than explicitly list known unit tests, we now either wild card or use "find" to create a list.
- Using wild cards to catch all the drivers/timing/* executables - Run these tests on a push, but not to compare to target - Enabled artifacts for target opt executables - Had to re-clone target codebase in the compare-timings job - Also had to use the target_codebase/.testing/Makefile directly to avoid rebuilding executables
Fixed a bug that was causing some open boundary condition fields (e.g., those related to the tides) not to be read in from files before they are used unless there are some PEs that do NOT include any input date based OBC points. This bug causes certain configurations with not to reproduce across PE count. Most high-PE count jobs (those with at least 3 PEs in each direction) previously were correct, but single PE jobs or jobs with a prime number or 2 times a prime number of PEs were never correct for these cases. Specifically, this commit removes the some_need_no_IO_for_data element from the ocean_OBC_type, and it always calls update_OBC_segment_data() during initialization when any OBCs are being used. It also replaces three calls to allocate_rotated_array() with calls to the new internal subroutine allocate_rotated_seg_data() to avoid a bug where the size of rotated data segments is right but the index range is wrong. This commit will change low-PE count answers for some cases with certain types of open boundary conditions.
Added the new function rotate_OBC_segment_direction that returns an integer indicating the direction of an OBC segment on rotated grid, given its direction on the current grid and the number of turns by which that grid has been rotated. A negative number of turns undoes the rotation. For now this new routine is used in rotate_OBC_segment_config, but additional uses will follow with subsequent commits. All answers are bitwise identical, but there is a new publicly visible function.
Added the new publicly visible subroutines write_OBC_info() and chksum_OBC_segments() for writing verbose information for debugging open boundary conditions. Write_OBC_info() writes extensive information about the settings and the contents of arrays in the ocean_OBC_type, modifying the output so that it should be largely invariant to rotation. Chksum_OBC_segments() does a checksum on all allocated elements of the open boundary condition segments and optionally writes out a number of layers of segment data, with the order of output chosen to appear as it would if the data were for an eastern open boundary. Because write_OBC_info() is rather verbose, it should probably only be used for intensive debugging, while the nk argument to chksum_OBC_segments() helps to manage its verbosity and makes it a candidate for inclusion in more wide-spread debugging. This commit also adds scalar_pair arguments to 5 uvchksum calls for radiation and oblique open boundary conditions and tracer reservoirs. Only diagnostic debugging code is added here, and all solutions are bitwise identical, but there are two new publicly visible interfaces for writing information for debugging open boundary conditions.
Added the new runtime parameter DEBUG_OBCS that triggers verbose diagnostic output about the contents of the open boundary condition type via calls to the new routines write_OBC_info() and chksum_OBC_segments(), along with the new runtime parameter NK_OBC_DEBUG to regulate the volume of diagnostic output. It also renames the previous runtime parameter DEBUG_OBC to OBC_DEBUGGING_TESTS to trigger the code that resets the normal velocities at OBC segments to zero and the thicknesses and tangential velocities behind OBC segments to silly values for testing, with DEBUG_OBC retained for now as the old name. This latter change follows the pattern elsewhere in the code in which other DEBUG parameters do not actually change any values. This commit also eliminates a fatal error message if open boundary conditions are applied with rotations of 180 or 270 degrees. All solutions in cases that worked previously are bitwise identical, but there are new runtime parameters.
This simplifies and expands the timing/testing of reproducing sums that were in .../unit_drivers and would not work without some setup of the environment (input files). The refactored drivers are stand alone and (mostly) fit within the pattern of the other unit tests and timing tests. - Moves config_src/drivers/unit_drivers/MOM_sum_driver.F90 to config_src/drivers/timing/time_reproducing_sum.F90 - Cleaned up time_reproducing_sum to not use unnecessary infrastructure (like grids, parameter files, etc.) - Added config_src/drivers/unit_tests/test_reproducing_sum.F90 to perform property tests removed from timing test. - Added trivial unit test that reproducing_sum() gives correct analytic sum for array of precsribed values - Added new tests that reproducing_sum() results are invariant to order of values in arrays - Made MOM_define_layout() public - Initialized MOM_dom%nonblocking_updates and MOM_dom%thin_halo_updates when optional arguments are missing (bug fix) - Hacked .testing/tools/disp_timing.py to not fail on the FMS tail sheet which inevitably is generated when using FMS :(
Collected the conversion factors for the 4 salt tendency diagnostics diabatic_salt_tendency, diabatic_salt_tendency_2d, boundary_forcing_salt_tendency and boundary_forcing_heat_tendency_2d out of the calculation of the diagnostics before they are posted and into the conversion argument in their register_diag_field calls. This change simplifies the code where the diagnostics are calculated, but although all expressions are mathematically equivalent, there is a change in the order of arithmetic with which these 4 diagnostics are calculated, so answers will change at roundoff, although this is unlikely to be detected if the diagnostics are being written as 32 bit floats. All model solutions are bitwise identical.
Fixes inconsistency between layers and interfaces in MOM_internal_tides Clip the diffusivity in diagnostics used for the buoyancy fluxes
These changes enhance the existing vertical diffusivity used in EPBL with machine learned diffusivity that is constrained on a second moment closure. Using symbolic regression and least-squares fitting, a shape function ( g(\sigma) ) responds to changes in the surface forcing (Latitude, wind stress, surface buoyancy flux, boundary layer depth). g(\sigma) goes from zero to 1 and its skewness changes as per surface forcing conditions. There are two options for the turbulent velocity scale: v0 and v0h. The velocity scale, v0, is an approximation that depends on three inputs: Coriolis parameter f, ustar u_*, and surface buoyancy flux BFlux. It agrees with diagnosed velocity scale from GLS scheme given by v0 = < { \kappa } max / h >, where h is boundary layer depth and angled brackets denote averaged data for a given forcing: B, u_*, f. When v0 is multiplied with g(\sigma) and multiplied by the energetics based boundary layer depth h, i.e \nu = . g(\sigma) \cdot v_0 \cdot h, we get a diffusivity which is constrained on a second moment closure. The second velocity scale, v0h, depends on ustar u_*, surface buoyancy flux BFlux, and boundary layer depth. It agrees with diagnosed velocity scale from GLS scheme given by v0^h = { \kappa } max / h , where h is the boundary layer depth. v0^h agrees with convective velocity scaling under pure convection ( v0^h ~ (Bh)^(1/3) ). When v0 or v0^h is multiplied with g(\sigma) and multiplied by the energetics based boundary layer depth h, i.e \nu = . g(\sigma) \cdot v_0 \cdot h, we get a diffusivity which is constrained on a second moment closure. The Machine learned (ML) diffusivity is activated by using the logical flags: 1. EPBL_EQD_DIFFUSIVITY_SHAPE 2. EPBL_EQD_DIFFUSIVITY_VELOCITY 3. EPBL_EQD_DIFFUSIVITY_VELOCITY_H Logical flag 1 activates the new equation for shape function. Logical flag 2 activates velocity scale v0 Logical flag 3 activates velocity scale v0^h To use ML diffusivity, Logical flag 1 has to be set to ‘True’ and either logical flag 2 or 3 should be set to ‘True’. 2 and 3 both cannot be true or false. The subroutines replace the default EPBL mixing coefficients of Reichl and Hallberg (2018). The publications for OSBL machine learned diffusivity are: Sane et al. 2023 ( https://doi.org/10.1029/2023MS003890 ) and Sanę et al. 2025. The original version of this commit was modified by adding do_not_log arguments to some of the newly added get_param calls.
Revised the MOM_open_boundary module to better handle grid rotation for an arbitrary number of turns. Rotate_OBC_config now properly handles the setting and in some cases swapping of global information between the rotated and unrotated ocean_OBC_types, while rotate_OBC_segment_config and rotate_OBC_segment_data were modified to do the same for OBC_segment_type settings and data fields. These changes also include corrections to the code that rotates the input buffer and corrects the sign of vector components in update_OBC_segment_data when there is grid rotation. It also adds a missing scalar_pair argument to the restart registration call for the normal radiation arrays. Several comments were also corrected or expanded for greater clarity, and some instances of non-standard use of white space were corrected. All answers in non-rotated cases are bitwise identical, and some rotated cases that previously gave segmentation faults are new running, although rotation does not yet work properly for all cases with open boundary conditions.
Added the new runtime parameter OBC_HOR_INDEXING_BUG that can be set to true to retain previous answers or false to correct 5 horizontal indexing bugs that prevent solutions with open boundary conditions from satisfying rotational symmetry. Four of these bugs are related to the use of the brushcutter mode to set various terms in the open boundary conditions, while the fifth corrects the total ocean depth used to adjust the thicknessses used to remap tangential flow fields that are discretized at vorticity points. By default, all answers are bitwise identical, but there is a new runtime parameter in some MOM_parameter_doc files.
The jki loops in vertvisc() have been reordered to kji. The solver increases the number of concurrent tridiagonal solvers from Ni to Ni*Nj. Two other changes contributed to performance * Moving diagnostics (e.g. ADp%du_dt_str) outside of loops * Conditional computing of Ray() when visc%Ray_[uv] is set Not all optimizations of this sort were applied, and should be reviwed in relevant experiments. This showed a modest performance improvement on CPUs. Three instances are shown below. * mom_vert_friction_mp_vertvisc_: 0.583s, 0.629s (-7.3%) * mom_vert_friction_mp_vertvisc_: 0.576s, 0.634s (-9.2%) * mom_vert_friction_mp_vertvisc_: 0.583s, 0.636s (-8.3%) This patch uses nested do loops since we have not yet adoped do concurrent loop constructs. But a future do concurrent form shows even greater speedup, e.g. * mom_vert_friction_mp_vertvisc_: 0.258s, 0.539s (-52.2%) The work in this PR will prepare this module for porting to GPUs. Co-authored-by: Edward Yang <edward_yang_125@hotmail.com>
As with vertvisc(), this patch rewrites the vertvisc_remnant() tridiagonal solvers to run in kji order, with even greater benefits to runtime. Three instances are shown below. Speedup is about 1.3-1.4x. * mom_vert_friction_mp_vertvisc_remnant_: 0.939s, 1.241s (-24.3%) * mom_vert_friction_mp_vertvisc_remnant_: 0.935s, 1.265s (-26.0%) * mom_vert_friction_mp_vertvisc_remnant_: 0.910s, 1.258s (-27.7%) As before, only the diagnoal array (b1) was promoted to 3d. As with vertvisc() this change is expected to be highly favorable to GPU performance.
- Minor re-factor of numerical_testing_type_unit_tests() - Added a doxygen blurb for the module including usage for the helped class - Updated test_numerical_testing_type with rename of unit_test function
This commit makes a set of changes to refactor code related to open boundary conditions to correct issues with restarts. Renamed open_boundary_init to open_boundary_halo_update to reflect what it actually does and eliminated the 4 unused arguments to this routine. Moved halo updates for temperature and salinity and the call to setup_OBC_tracer_reservoirs out of fill_temp_salt_segments and made the tv argument to fill_temp_salt_segments intent in. Moved the call to setup_OBC_tracer_reservoirs and the debugging calls for OBCs into the same code block as the rest of the OBC code in MOM_initialize_state. Moved the call to open_boundary_halo_update after MOM_state_initialization for both rotated and unrotated branches. Moved the calls to fill_temp_salt_segments and setup_OBC_tracer_reservoirs and the debugging calls for OBCs into the same code block as the rest of the OBC code in MOM_initialize_state. All answers are bitwise identical, but there are changes to public interfaces and what does and does not occur in several routines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marshallward I think the links in your first post are not correct. They link to the issue/PR in this repo with the same number as the PR to NOAA-GFDL/MOM6.
Anyway, this PR looks good and does not change answers in ACCESS-OM3 configs.
Thanks @dougiesquire the links have been corrected. |
It should be noted that the reason why we are getting regression-testing failures is because we added a model time into the diagnostic checksums. In the future, this will allow us to move around the sequence of calls posting diagnostics without the automated testing thinking that there have been answer changes, but for now the change in the diagnostic checksum formatting is (erroneously) showing up as regression changes. |
As far as my testing for GMAO, this seems to be zero-diff for us at very-low-res (the resolution I can run). @sinakhani will be doing a more thorough test at higher res but I'll approve for now. |
DEALLOC_(CS%Idx2dyCu) ; DEALLOC_(CS%Idx2dyCv) | ||
DEALLOC_(CS%Idxdy2u) ; DEALLOC_(CS%Idxdy2v) | ||
DEALLOC_(CS%Ah_bg_xx) ; DEALLOC_(CS%Ah_bg_xy) | ||
if (CS%bound_Ah) then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this file (and elsewhere), some but not all instances of ALLOCABLE_
, ALLOC_
, and DEALLOC_
have been replaced with their intrinsic versions: allocatable
, allocated
, and deallocate
.
Could you clarify the criteria used to determine which instances were transformed? I’m asking because we’re encountering conflicts with some of these changes, and I’d like to apply the same selection criteria consistently to transform ALLOC_
etc. to allocatable
etc. when resolving them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The static allocation macros should be used for any array that is persistent for the entire run, although one can probably find exceptions, partly due to contributors who are unclear about the rules (including myself).
In practice, I don't think anyone is running with static memory anymore. It is mainly used as a verification test, and has found the occasional bug.
I found an example of arrays in stochastic_init
that were changed from ALLOC_()
to allocate()
, possibly because they are conditionally used. Maybe that is the reason?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At one point, perhaps 15 years ago, the predecessor to MOM6 was about 20% more efficient when compiled and run in static memory mode, but this makes the model much less flexible for changing layouts or masking out all-land processors. More recently, we have been finding that static memory is not dramatically more efficient than dynamic memory, but it is still useful to run test cases in both static and dynamic memory mode because differences between them can be used to indicate (and track down) certain types of memory use problems. We have debated eliminating the static memory macros altogether, but on balance we still think that they do provide additional useful self-consistency checks.
The issue with the use of the static memory macros is that any arrays declared with them will always be assigned memory in static memory mode, even if they are never used. This increases the memory footprint of the MOM6, and it can reduce the size of static configurations that can be used on a single processor for rotational consistency testing, for example. We think that the preferred practice should be to restrict the use of static memory only to arrays that are almost always used in most MOM6 configurations, but avoid using static memory for arrays that are only used with a limited set of options. The changes in memory allocation with this PR follows this preferred practice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, thanks!
NOAA-GFDL#949 (Fix frazil halo update bug when unallocated) is a 1-line bugfix that should be added here before merging. |
If FRAZIL=False tv%frazil is not allocated but its halo might still be updated in post_diabatic_halo_updates, which will fail. If FRAZIL=True, no answers are changed. If FRAZIL=False, no answers from before this bug was introduced are changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After the bug fix from Alan, I now have my simulation with sponge and OBC working. COAPS approves the pull request.
@awallcraft We moved your PR directly into this candidate PR branch. This will feed back into |
Thanks all, this has now been approved. |
since Alan's bug fixing is merged here, let me re-run UFS, shall be able to finish in half day |
no issue found in my fresh run (with Alan's commit being added) |
Thanks @jiandewang. I will merge this now. (Some lingering CI tests will request a manual override) |
This PR has updates for building MOM6 with changes from mom-ocean/MOM6#1668 My tests show zero-diff.
Latest PR to main. Much of the content is devoted to the open boundary conditions, but there is the usual blend of fixes and features.
Contributors: