
Conversation

marshallward
Collaborator

An Earth Day PR from GFDL 🌎, containing local contributions from the last few months.

This PR is expected to report a regression. Diagnostic checksum testing now includes a model timestamp, which is used to sort the output. This will prevent false regressions caused by reordering of post_data calls, but it will report a regression against main, which does not have the timestamps.

Features/Diagnostics

Bugfix/Refactor

Infrastructure

Contributors:

herrwang0 and others added 30 commits November 30, 2024 15:29
Fix a bug in subroutine write_energy when using DT < 2. Otherwise, the
energy outputs are written at the wrong time steps.

The reason is that time-type division is essentially a floor, so
DT/2 = 0 if DT < 2.
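
A minimal standalone sketch (plain Fortran, not the FMS time_type code) of the
floor behaviour described above: dividing a 1-second step by 2 with integer
arithmetic yields 0, which is what shifted the write_energy output times.

  program floor_division_demo
    implicit none
    integer :: dt_seconds

    dt_seconds = 1                      ! a timestep shorter than 2 seconds
    print *, 'DT/2 =', dt_seconds / 2   ! integer (floor) division gives 0, not 0.5
  end program floor_division_demo
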
  The subroutine compute_global_grid_integrals appeared in both the
MOM_state_initialization and MOM_shared_initialization modules, but was only
being called from the latter.   This commit removes the extra copy in
MOM_state_initialization.  It also removes some unnecessary parentheses in the
copy that is being retained, in part to facilitate the review of this commit.
All answers are bitwise identical, and no publicly visible interfaces are
altered.
In addition to REMAPPING_USE_OM4_SUBCELLS, for ALE remapping, there are
several parameters of the form XXX_REMAPPING_USE_OM4_SUBCELLS, where
XXX identifies the target, and they all currently default to True.

To simplify setting them all to False, which is recommended, the defaults
for the XXX versions are changed to the value of REMAPPING_USE_OM4_SUBCELLS.

Answers are only changed if REMAPPING_USE_OM4_SUBCELLS is set to False
and the default (now False) is used for one or more of the other parameters.
In such cases the original behaviour can be recovered by explicitly
setting the other parameters to True.
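
A hedged sketch of the call pattern this implies ("XXX_REMAPPING_USE_OM4_SUBCELLS"
below is a placeholder for one of the target-specific parameters, and the
description strings are illustrative rather than the exact source text):

  call get_param(param_file, mdl, "REMAPPING_USE_OM4_SUBCELLS", om4_subcells, &
                 "If true, use the OM4 remapping-via-subcells algorithm.", default=.true.)
  call get_param(param_file, mdl, "XXX_REMAPPING_USE_OM4_SUBCELLS", xxx_om4_subcells, &
                 "As REMAPPING_USE_OM4_SUBCELLS, but for this target.", &
                 default=om4_subcells)  ! the default now follows the top-level setting
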
Removed two instances of `fail_if_missing=.true., default=0.` which
are contradictory: a default value is meaningless if the parameter must
be specified.

I encountered this when adding the `defaults=` option to `get_param_real_array()`.
The `default=` optional argument to get_param() only provides a uniform
value to initialize an array of reals. This commit adds the optional
`defaults=` argument that must have the same length as the `values`
argument.

I've also added a few instances of this optional argument:
 - by adding the `initialize_thickness_param()` procedure, selected by
   `THICKNESS_CONFIG = "param"`. The procedure was based on the "uniform"
   method, and uses the parameter `THICKNESS_INIT_VALUES` which defaults
   to uniform values derived from `MAXIMUM_DEPTH`.
 - the setting of MLD_EN_VALS in MOM_diabatic_driver.F90, which was
   previously using a workaround to set defaults of 25, 2500, 250000 J/m2.
 - two vectors of 4 values in user/user_change_diffusivity.F90

There will be some doc file changes, but no answer changes.
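
A hedged sketch of a call using the new argument, based on the MLD_EN_VALS case
mentioned above (the variable name, description string and any omitted arguments
such as units or scaling are illustrative, not the exact source):

  call get_param(param_file, mdl, "MLD_EN_VALS", MLD_En_vals, &
                 "The energy values used to compute MLDs.", units="J m-2", &
                 defaults=(/25., 2500., 250000./))
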
Two latent heat constants are imported directly from FMS, which is built
independently of MOM6.  Previously, it was a safe assumption that both
would be built with double precision, but this is no longer the case
since FMS now supports both single and double precision.  This could
cause conflicts with mixed-precision builds.

This patch converts the values from FMS-precision to MOM-precision.

Single->double should not affect reproducibility since every
single-precision number can be exactly represented in double precision.
Double->single could affect reproducibility, but this is not an issue
since MOM6 does not run in single precision.
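
A minimal standalone sketch of the conversion idea (not the actual MOM6/FMS
constants): promoting a single-precision value to double precision is exact, so
the FMS-precision constant can be safely assigned into a MOM-precision variable
in the single-to-double direction.

  program precision_demo
    use iso_fortran_env, only : real32, real64
    implicit none
    real(real32) :: hlv_fms = 2.5e6_real32  ! stand-in for an FMS latent heat constant
    real(real64) :: hlv_mom

    hlv_mom = real(hlv_fms, kind=real64)    ! single -> double is exactly representable
    print *, hlv_mom
  end program precision_demo
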
* Inline harmonic analysis

Important bug fix:
    1) The Cholesky decomposition was operating on entries below
the main diagonal of FtF, whereas in the accumulator of FtF, only
entries along and above the main diagonal were calculated. In this
revision, I modified HA_accum_FtF so that entries below the main
diagonal are accumulated instead.
    2) In the accumulator of FtSSH, the first entry for the mean
(zero frequency) is moved out of the loop over different tidal
constituents, so that it is not accumulated multiple times within
a single time step.

* Inline harmonic analysis

Another bug fix: initial state added back to the mean state.

* Inline harmonic analysis

Minor update to HA_solver
* Tidal angular frequency has units [rad s-1]

Tidal frequencies are always angular frequencies to simplify applying
sine and cosine.  These have MKS units [rad s-1] but they are all
currently listed as [s-1].

Updated dOxygen comments for variables, e.g. [T-1 ~> s-1] becomes
[rad T-1 ~> rad s-1].  Updated get_param units. e.g. units="s-1"
becomes units="rad s-1".

No answers are changed, but the logged parameter units are different.

There are frequencies in MOM_internal_tides.F90 but these have not
been updated because they may be specified incorrectly.  They are
used as if they are [T-1] but they are calculated as 2PI/period [rad T-1].

  real, allocatable, dimension(:) :: frequency  !< The frequency of each band [T-1 ~> s-1].

  real    :: period             ! A tidal period read from namelist [T ~> s]

  ! The periods of the tidal constituents for internal tides raytracing
  call read_param(param_file, "TIDAL_PERIODS", periods)

  do fr=1,num_freq
    period = US%s_to_T*extract_real(periods, " ,", fr, 0.)
    CS%frequency(fr) = 8.0*atan(1.0)/period
  enddo

All MOM6-examples cases have INTERNAL_TIDES=False and so can't
resolve this issue.

* fixed too-long line
* add diagnostic for Kd_Work related to the Kd_add part

* also corrects some capitalization typos

* fix id init

---------

Co-authored-by: Raphael Dussin <Raphael.Dussin@noaa.gov>
Co-authored-by: Alistair Adcroft <adcroft@users.noreply.github.com>
Changed the code to issue a fatal error message when WIND_CONFIG =
"SCM_ideal_hurr", with the message including instructions on how to recover
mathematically equivalent solutions, and eliminated the subroutine
SCM_idealized_hurricane_wind_forcing() from the Idealized_hurricane module.  The
ocean_only MOM_parameter_doc files have also been modified to reflect that
"SCM_ideal_hurr" is no longer a valid setting for WIND_CONFIG.   All answers are
bitwise identical in any cases that run, but some cases may fail during
initialization with instructions on how to fix them.
  Corrected a bug that caused ePBL_column to set the wrong variable and then use
an uninitialized variable when EPBL_ORIGINAL_PE_CALC is set to false.  This bug
was present when EPBL_ORIGINAL_PE_CALC was first added on Sept. 30, 2016,
but it was not detected because only the default case with EPBL_ORIGINAL_PE_CALC
= True appears to be used or tested.  Any runs that used this code with
debugging compile options would have trapped it immediately.  This will change
answers when EPBL_ORIGINAL_PE_CALC is false.
  Added the new runtime parameter EPBL_MLD_ITER_BUG that can be set to false to
correct buggy logic that gives the wrong bounds for the next iteration  when
USE_MLD_ITERATION is true and successive guesses increase by exactly
EPBL_MLD_TOLERANCE.  By default all answers are bitwise identical, but there is
a new runtime parameter in some MOM_parameter_doc files.
  Added the ability to passively run ePBL_column twice in a diagnostic mode and
then provide diagnostics of the differences in the diffusivities and boundary
layer depths that are generated with the two options.  This is controlled by the
new runtime parameter EPBL_OPTIONS_DIFF, which is an integer that specifies
which options to change or 0 (the default) to disable this capability.
Associated with this are the new diagnostics ePBL_opt_diff_Kd_ePBL,
ePBL_opt_maxdiff_Kd_ePBL and ePBL_opt_diff_h_ML, which only are registered when
the differencing is enabled.  For now, only changes associated with the settings
of EPBL_ORIGINAL_PE_CALC and EPBL_ANSWER_DATE can be evaluated, but this list
will grow as new options are added.

  As a part of these changes, there were some other reforms to the way that
MOM_energetic_PBL handles diagnostics, with the 2-d and 3-d arrays changed from
allocatable arrays with enduring memory commitments in the energetic_PBL_CS type
into simple arrays in energetic_PBL, relying on the fact that compilers will be
smart enough to avoid actually allocating this memory when it is unused, so that
the overall memory requirements of MOM6 are not expanded.  A number of allocate and
deallocate calls were eliminated by these changes.

  In addition, explicit 'units=' descriptors were added to numerous
register_diag_field calls to help identify inconsistent conversion factors.

  Also, the diagnostic of the number of boundary layer depth iterations was
added to the ePBL_column_diags, which enabled the intent of the CS and G
arguments to ePBL_column to be changed to intent(in).  The unnecessary extra
logic associated with the use of the OBL_converged variable in ePBL_column was
eliminated, but the results are unaffected.

  Because the MOM_parameter_doc files were changing anyway, and because this
commit is only a diagnostic change, about 6 spelling errors were corrected in
ePBL parameter descriptions as a part of this commit.

  All answers are bitwise identical, but there is a new runtime parameter and
there may be new diagnostics depending on the choice of a non-default value for
that new parameter.
  Added the new internal subroutine find_Kd_from_PE_chg inside of the
MOM_energetic_PBL module to directly calculate an increment in the diapycnal
diffusivity from an energy input.  This can be used when ePBL does not convert
released mean kinetic energy into turbulent kinetic energy (i.e., if
MKE_TO_TKE_EFFIC = 0.) and is more efficient than the more general iterative
approach.  To preserve old answers, this new option is only enabled for the
surface boundary layer when the new runtime parameter DIRECT_EPBL_MIXING_CALC is
set to true.  This new option can be tested passively by setting
EPBL_OPTIONS_DIFF to 3 in a run that uses ePBL.  By default all answers are
bitwise identical, but there is a new runtime parameter in some of the
MOM_parameter_doc files.
  Add the option to do energetically consistent bottom boundary layer mixing
with the new routine ePBL_BBL_column.  ePBL_BBL_column is closely based on the
surface-focused ePBL mixing in ePBL_column, but without adding convective
instability driven mixing or mean-TKE driven mixing to avoid possible
double-counting.  This new option is enabled by setting the new runtime
parameter EPBL_BBL_EFFIC to be positive.

  If both EPBL_BBL_EFFIC and BBL_EFFIC are set to positive values, there is a
risk of double-counting, but this case is not being trapped for now.

  The changes include the addition of a new mandatory vertvisc_type argument to
the publicly visible routine energetic_PBL.

  When this new ePBL bottom boundary layer mixing option is enabled, there are
several new diagnostics available that are related to bottom boundary layer
mixing.  Several new checksum calls were also added with this new option when
DEBUG = True. The MOM_parameter_doc files are altered by the addition of two new
runtime parameters, and by the correction of several spelling errors in the
descriptions of other ePBL parameters.  By default, all answers are bitwise
identical.
  Added the ability to modify the bottom boundary layer TKE budget to account
for an exponential decay of TKE away from the boundary and the fact that when
the diffusivity is increased at an interface, it causes an increased buoyancy
flux that varies linearly throughout a well mixed bottom boundary layer and
through the layer above.  This new capability is enabled by the new runtime
parameter DECAY_ADJUSTED_BBL_TKE and is implemented via the new internal
function exp_decay_TKE_adjust.

  In addition, this commit adds 9 bottom-boundary layer specific run-time
parameters that are analogous to parameters that set the properties of the
surface-driven mixing, and take their defaults from them, but can now be set
independently for the bottom boundary layer.  The inefficient option to use
bisection to estimate the bottom boundary layer depth was eliminated, as it is
certain that it is not yet being used in any cases, and it should be deprecated
for the estimation of the surface boundary layer.

  By default, all answers are bitwise identical, but there are up to 10 new
runtime parameters that will appear in some MOM_parameter_doc files.
  Added the new optional unscale argument to reproducing_sum() and
reproducing_sum_EFP().  When this is used, the resulting real sum is restored to
the same scaling as the input arrays, but when there is an EFP_type returned it
is the same as it would have been had the input array been rescaled before it
was passed in.  All answers are bitwise identical, but there is a new optional
argument to two publicly visible interfaces.
  Added the new element RZL2_to_kg to the unit_scale_type for convenience when
rescaling masses and other globally integrated variables.  All answers are
bitwise identical, but there is a new element in a transparent type.
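
A hedged sketch of the intended usage (the variable names are illustrative, not
the exact source): the new unscale argument and the new RZL2_to_kg factor can be
combined so that a global sum is returned directly in unscaled units.

  ! tmp_for_sum holds scaled cell masses [R Z L2 ~> kg]; the sum is returned in [kg].
  mass_tot = reproducing_sum(tmp_for_sum, unscale=US%RZL2_to_kg)
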
  Revised write_energy() and accumulate_net_input()  to work more extensively in
dimensionally rescaled variables by using the new unscale arguments to the
reproducing_sum functions. As a result of these changes, 15 rescaling factors
were eliminated or moved toward the end of write_energy().  All answers are
bitwise identical.
  Use reproducing_sum with unscale to calculate the global mass diagnostic,
globally integrated ocean surface area, offline tracer transport residuals and
in calculating the OBC inflow area in tidal_bay_set_OBC_data().  This change allows
for the elimination or replacement of 6 rescaling factors, and in one added instance
the conversion factor and the diagnostic units now occur consistently on the same line.
All answers are bitwise identical.
Co-authored-by: Robert Hallberg <Robert.Hallberg@noaa.gov>
Correct a bug in which G%gridLonT/G%gridLatT were not using global indexing
when outputting extreme surface information.
Previously, the extreme surface message was triggered when `sea_lev` was
smaller than or **equal** to the ocean topography:

sfc_state%sea_lev(i,j) <= -G%bathyT(i,j) - G%Z_ref

The equality is innocuous for a non-zero minimum thickness
(Angstrom/=0), but it can be problematic for true zero thickness.

In addition, there is a fourth criterion in this check:

sfc_state%sea_lev(i,j) + G%bathyT(i,j) + G%Z_ref < CS%bad_val_col_thick

This is supposed to allow a user-specified tolerance (default=0.0),
which should be a more stringent check when the tolerance is larger than
zero. The equality from the first check contradicts this logic.
Fix a bug in which the sea surface height averaged over every dynamic time step
(SSH_inst) was not output correctly.
* Update MOM_wave_interface.F90

The index at the interface has been changed from I-1 -> i+1 and J-1 -> j+1, which helps fix the model error in 3D simulations while keeping halo_size at 1.

* Update MOM_wave_interface.F90

Corrected the index for thickness, which fixed the issue in 3D wave-coupled simulations while keeping the halo_size for thickness at 1.
- The present code only tests BBL_EFFIC when deciding whether to set the bottom TKE and ustar, but this means they are all zero when EPBL_BBL_EFFIC is non-zero and BBL_EFFIC is zero.
- Adds logic to also check EPBL_BBL_EFFIC, thereby allowing non-zero ustar and TKE for EPBL_BBL_EFFIC>0.0
- Fixed BBL_TKE diagnostic in EPBL that was not populated.
- Will change answers when EPBL_BBL_EFFIC>0.0, but won't change answers in any of our existing configurations.
* Move MOM_generic_tracer with no changes

Renames src/tracer/MOM_generic_tracer.F90
to config_src/external/GFDL_ocean_BGC/MOM_generic_tracer.F90

* Remove MOM_generic_tracer

Git rm MOM_generic_tracer.F90

---------

Co-authored-by: Theresa Morrison <Theresa.Morrison@gaea68.ncrc.gov>
  Added missing factors to the conversion arguments in the register_diag_field
calls for the diagnostics of the KPP non-local transport tendency and the net
surface tracer fluxes, and corrected the dimensional scaling that is being
applied to the KPP_Vt2 diagnostic as calculated in KPP_compute_BLD.  All
solutions are bitwise identical, but now output files with these 3 sets of KPP
diagnostics are invariant to dimensional rescaling.  This can be verified with
the visc.nc file generated by the single_column/KPP test case in MOM6-examples.
Whereas previously the diagnostics KPP_Vt2, KPP_QminusSW, KPP_netSalt,
KPP_NLT_dTdt and KPP_NLT_dSdt in that file would change when dimensional
rescaling was applied, now they do not.  No output is changed unless dimensional
rescaling is used.
  The `default=` optional argument to get_param() only provides a uniform value
to initialize an array of integers.  This commit adds an optional `defaults=`
argument to get_param_int_array(), doc_param_int_array() and
log_param_int_array() to allow for the specification of an array of default
values.  These additions are analogous to what had previously been added for
real arrays in PR #760.

  This commit also adds the new internal function int_array_string(), analogous
to real_array_string(), in MOM_document.  This differs slightly from its real
array counterpart in that it only uses the syntax like `3*75` for lists of
integers that are longer than we would use to specify dates and times or pairs
of layout parameters, because "(0, 0)" seems more readily interpretable than
"(2*0)".

  The new defaults argument is now used in the get_param calls for LAYOUT and
IO_LAYOUT, and in setting the tidal reference dates.

  Several spelling errors in comments were also corrected in the files that
were being edited.

  All answers are bitwise identical, but there are minor changes in many
MOM_parameter_doc.layout files and some MOM_parameter_doc.all files.
marshallward and others added 29 commits April 2, 2025 11:21
This fixes the path for the optimized target builds (opt_target) used in
the profiling tests.  Previously they were pointing to the wrong
directories and were never built.

Due to oddities of rule order and overrides, the optimized targets were
using the latest code.  That is, **the perf tests were being run against
themselves**.

The perf and profile rules now correctly compare optimized output to the
optimized target output (typically dev/gfdl).
perf diff output used a relatively small width for comparison (32 chars),
which unfortunately is too short for many function names in Fortran modules.

This number has been increased to 60, and is now set as a variable
within the code.
Problems with the cray-libsci module have caused issues with the Intel C
compilers on Gaea.  This patch unloads the cray-libsci module and
restores access to the C compiler.
  Restore the parentheses to the expressions for PFv in btloop_update_v so that
the model will once again respect rotational symmetry when fused-multiply-adds
are enabled.  These parentheses were in the corresponding expressions until PR
#845 to dev/gfdl when they were inadvertently omitted.   This commit changes
answers (and restores rotational symmetry) when fused-multiply-adds are enabled.
  Replaced the 8 Coriolis parameter arrays ([abcd]zon and [abcd]mer) with merged
3-dimensional arrays (f_4_u and f_4_v), where the extra dimension is of size 4
and varies fastest.  This reduces the number of arguments being passed around
and it could improve performance by increasing the likelihood of cache hits.
Also made the dgeo_de argument to set_up_BT_OBC non-optional, which reduces the
number of calls to this routine and should increase the chances it might be
inlined.  A few more spelling errors were corrected, and some notes added in
comments highlighting potentially suspect reassignments of effective total
gravities at open boundary condition points.  All answers are bitwise identical
and no external interfaces are changed.
  Refactored totalTandS and totalStuff to optionally work with scaled
variables, by adding an optional unit_scale_type argument and an optional
argument specifying the unscaling of thickness to totalTandS, and an optional
unscale argument to totalStuff.  The comments describing the units of 19
variables were modified to reflect the various units that might be used.  All
solutions are bitwise identical, and output is unchanged when dimensional
rescaling is not being used, but the debugging output can now be unaltered by
the use of dimensional rescaling.  There are new optional arguments to two
publicly visible routines.

  This change has been tested via calls to totalTandS added to step_MOM_thermo,
but because totalTandS is only intended for debugging, these testing calls are
commented out.  I am uncertain whether to ultimately retain these comments to
illustrate the use of totalTandS or to delete them before this PR is
merged in, but retaining them for now may help the PR review process.
* Adds new buoyancy flux diagnostics to diabatic_ALE

- Adds the ability to diagnose the implied work and buoyancy flux due to given diffusivities inside of diabatic_ALE
- Adds outputs of N2 after applying diffusion, the implied buoyancy flux, the integrated buoyancy flux, the salinity components of buoyancy flux and N2, the temperature components of buoyancy flux and N2, the ePBL contribution to the buoyancy flux, and the double diffusion contribution to the buoyancy flux.
- Only considers the contribution of KPP to the total diffusive flux; the diagnostics should not be used with KPP until the non-local term and its contribution are properly accounted for.
- Only considers the contribution of convection to the total diffusive flux, since diagnosing its implied work would require accounting for the convective contributions to diffusivity (it would be easily doable following this template in the future, but it is not done here).
- More work is needed to apply these methods to diffusivities that are diagnosed and stored in set_diffusivity.
- This code is not repeated for diabatic_ALE_legacy since the double diffusion component to diffusivity is applied separately, thus complicating the interpretation of these diagnostics in the legacy driver.

* Renaming diagnostics for buoyancy flux dz in diabatic_driver

* Fix bad logic (>0 checks) in MOM_diabatic_driver id_ tests

* Style updates in MOM_diagnose_KdWork

* Correct bug in previous style commit

* Attempt to fix an unclear issue related to MOM_error in MOM_diagnose_KdWork

* Fix some potential diagnostic logic issue in MOM_diabatic_driver.

* Fix scaling and update units for Buoyancy Flux diagnostics

- Change buoyancy flux units to W/m3 (W/m2 for vertical integrated terms).
- Correct issues in unit scaling for Boussinesq and non-Boussinesq code.
- Compute N2 from specific volume for non-Boussinesq mode.

* Update units/scaling for non-Boussinesq in buoyancy flux diagnostic

* Reorder spatial do-loops to k,j,i in MOM_diagnose_KdWork for efficiency.

* Move Kd Work/buoyancy flux diagnostics

- Kd work diagnostic calculations migrated from MOM_diabatic_driver to MOM_diagnose_KdWork
- Kd work diagnostics stored in a new type that contains copies of all requested diffusivities, to facilitate storing and moving data around so that they are accessible at the conclusion of diabatic_driver.
- All Kd arrays are allocatable and won't be allocated unless needed.
- 2d Vertical and volume integral quantities added.
- Many more Kd work diagnostics added including many from MOM_set_diffusivity, still not exhaustive.

* Undo reordering control structures in MOM_set_diffusivity

* Add KdWork diagnostics for legacy driver and add warning about not using in layer mode

* Fix KdWork issue in legacy driver and whitespace

* Reduce line lengths in MOM_diagnose_KdWork

* Missing factor of 0.5 in buoyancy flux

* Addresses failure when KdWork from ePBL diagnostic is on, but ePBL is off

* Remove white space around ' = ' for optional arguments

* Remove all array syntax addition and replace with explicit do-loops

* Fix typo mismatch of bdif_dz diagnostics

---------

Co-authored-by: brandon.reichl <Brandon.Reichl@noaa.gov>
This commit consolidates many of the CI tests into a single workflow.
In principle, it replaces multiple builds of the model with a single
workflow.  In many respects, this resembles our regression suite on
Gaea.

In practice, it is only marginally faster by some metrics, and slower by
others.

* Although we don't rebuild FMS and the models, we do have to set up the
  ubuntu environment on every stage, which has a cost (at least 30s).

  If we can eliminate this overhead somehow (e.g. container?) then this
  method may be much faster.

* MacOS is now slower, but we are running more tests.  Previously, we
  were only doing a small handful of them.

* Setting up the artifacts is delicate (paths, for example), and there
  is some overhead in transfer, but it appears to be small (<10s).

* Cleanup relies on a third-party GitHub Actions "app", which may not be
  supported over the long term.

  Also, a failed app will fail to delete its artifacts.  If these
  accumulate, then we may hit a storage limit.

  Worst case: Artifacts are wiped after 1 day.  This is the minimum
  setting.

* Minor point: As the number of tests increases, the graph scales to fit
  in the same box, and the actual stages become almost unreadable
  without zooming in.  Not really the effect I was hoping for.

Notes on MacOS (aarch64) builds:

* Disable fp-contract

  Seems like aarch64 (ARM) needs -ffp-contract=off to disable FMAs.
  -mno-fma is not supported!

* Optimization is reduced to -O1 for DEBUG-REPRO equivalence.

  This is only an issue in the tidal forcing, and most likely due to
  trigonometric function evaluation.  But I can't yet see any other
  solution.

There are also some very minor changes to the .testing Makefile:

* Log output is now consistently set to 100 lines for all tests.

* Builtin rule flags were replaced with explicit long forms

* Empty SUFFIX: rule was added, to belabor the point.
This is a refactor of AX_FC_REAL8.  The GNU test for -fdefault-real-8
and -fdefault-double-8 is split into two separate tests.  The macro was
also redesigned to avoid duplication of the test program, and to follow
the same pattern as in autoconf's lib/fortran.m4.

The AMD flang compilers require -fdefault-real-8 but not
-fdefault-double-8.  We now test for the first flag before testing for
both.

This patch will slightly improve support for AMD flang compilers.

Co-authored-by: Alistair Adcroft <adcroft@users.noreply.github.com>
USE_DIAG_AS_CHKSUM will replace diagnostic calls with checksum data,
enabling verification of diagnostic output.  Although it is printed in
the order of calculation, there is no information about the model state.

If the post_data calls are reordered without changing the answers, then
current testing will report an error, since it has no way of
distinguishing between a reorder and an actual regression.

This test appends a timestamp to the output.  Time is based on the
`Time_end` of the diag mediator.  Format is "day seconds". This allows
the output to be sorted for regression testing.
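
A hedged sketch of how such a "day seconds" stamp can be formed (this assumes
the FMS time_manager get_time interface; the variable names are illustrative
and this is not the exact MOM6 code):

  use time_manager_mod, only : time_type, get_time
  ...
  integer :: days, seconds
  character(len=32) :: stamp

  call get_time(Time_end, seconds, days)      ! Time_end from the diag mediator
  write(stamp, '(i0,1x,i0)') days, seconds    ! e.g. "5 43200", sortable once prepended
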
This patch uses the new timestamped chksum_diag output for sorting, so
that diagnostics can be verified as order-independent.

Sorting is achieved by copying the appended timestamp as a prepended
timestamp, and then passing the result to sort.

This is still a jury-rigged solution, but it prevents false regressions
from a reordered diagnostic.
* In barotropic solver, add two diagnostics WaveDraguBT and WaveDragvBT
for the 2D linear wave drag accelerations, which include both the traditional
and frequency-filtered forms.

* Add diagnostics of KE sources from decomposed barotropic solver
contribution, which are the anomalous pressure gradient term (KE_BTPF,
including the offset for center differencing in time), the anomalous
Coriolis term (KE_BTCF) and linear wave drag (KE_BTWD, when calculated).

* Add KE source diagnostics of combined PF and CA terms with both
barotropic and baroclinic contributions (PE_to_KE_btbc and
KE_Coradv_btbc)
* Add diagnostics "tides_[uv]" and "sal_[uv]" for momentum acceleration
due to tides and self-attraction and loading (SAL). These terms are
recalculated by taking the gradient of height anomalies.

* Accordingly, two new diagnostics for KE contributions from tidal
forcing ("KE_tides") and SAL ("KE_SAL") are added. These two terms are
components of PE_to_KE.

* The new diagnostics are only added to the finite volume PGF and are not
available in the Montgomery PGF.
  Restore the parentheses to the expressions for Cor_v in btloop_update_v so
that the model will once again respect rotational symmetry when
fused-multiply-adds are enabled.  These parentheses were in the corresponding
expressions until PR #845 to dev/gfdl when they were inadvertently omitted.
This commit changes answers (and restores rotational symmetry) when
fused-multiply-adds are enabled.
  Add code to zero out bt_pgf_u and bt_pgf_v at open boundary condition points,
thereby altering the pressure diagnostics that are derived from these arrays.
At OBCs, the pressure force is not well defined, and in fact the u_accel_bt and
v_accel_bt arrays, upon which the recently added calculations of bt_pgf_u and
bt_pgf_v are based, are immediately replaced by a different OBC-specific
expression for the accelerations at OBC points.  Moreover, at OBC points
bt_pgf_u and bt_pgf_v were being calculated using values of eta_anom that were
projected outward across OBCs, but this means that two different OBCs could be
trying to set these projected values of eta_anom to two different values (e.g., at a convex
corner in OBC segments or at back-to-back OBC segments separated by a single
land point). In such cases the diagnostic becomes indeterminate depending on
which OBC segment is applied first.  Zeroing out this diagnostic removes this
ambiguity.  This change does change (and correct) some diagnostics in energy
budgets at OBC points, but all solutions are bitwise identical and public
interfaces are unaltered.
Recovers the conditional code that sets use_ice_shelf in ALE_regridding_and_remapping()
  Added code to avoid a separate OBC block within btloop_update_u and
btloop_update_v.  This was done by adding static CS%OBCmask_[uv] arrays that are
1 except at OBC points where they are 0, and using them to set BT_force_[uv],
IdxCu, IdyCv, f_4_u and f_4_v to 0 at OBC points.  Conversely, bt_rem_u and
bt_rem_v are being set to 1 at OBC points.  Because all the changes occur by
masking 2-d arrays during
the barotropic setup, there should be a net increase in model speed (if
anything).

  Also ensured that all allocates for arrays in the MOM_barotropic control
structure are paired with appropriate deallocates in barotropic_end.  As a part
of this, 7 arrays (lin_drag_u, lin_drag_v, ubt_IC, vbt_IC, D_u_Cor, D_v_cor and
q_D) that are only allocated and used conditionally were converted from
potentially static memory arrays into arrays that are always allocatable.

  The comments describing the nature and purpose of the Rayleigh_u
and Rayleigh_v arrays were expanded, and commented out code that might use a new
answer date to avoid some unnecessary extra divisions in btstep_find_Cor was
also added.

  All answers are bitwise identical and no public interfaces are changed.
  Moved the code calculating the barotropic pressure gradients into the new
subroutine btloop_find_PF.  This simplifies the code in btloop_update_u and
btloop_update_v, which should make it more likely that these are inlined.
All answers are bitwise identical and no public interfaces are changed.
  Simplified btloop_eta_predictor by moving several optional loops out into the
main body of btstep_timeloop, to increase the chances that this routine will
become a good candidate for inlining, and to take steps toward the reuse of the
code calculating the barotropic fluxes in the predictor and corrector steps. In
addition, several unused variables were eliminated, and the arguments to
btloop_eta_predictor are now documented in the order in which they appear in the
argument list.  All answers are bitwise identical and no public interfaces are
changed.
  Use two separate calls to btloop_find_PF with an eta argument that depends on
BT_project_velocity, thereby eliminating the need for the eta_PF_BT pointer to
the eta array that will be used for the barotropic pressure force.  All answers
are bitwise identical and no public interfaces are changed.
  Refactored the code implementing the barotropic open boundary conditions to
replace the direct references to the segments with simple 2-d arrays in the
BT_OBC type for everything inside of the performance-critical routine
btstep_timeloop.  There are several new or renamed elements inside of the
BT_OBC_type and three new integer parameters enumerating the types of open
boundary conditions that are to be applied at each point, with the sign of
integers in the new u_OBC_type and v_OBC_type arrays indicating the direction of
the OBCs, and 0 indicating no OBC at a cell face.  There is a new routine,
initialize_BT_OBC, that stores the static information about the position and
nature of the various OBCs for use within the barotropic time stepping; this
routine is called from barotropic_init.  Several spelling errors in comments
were also corrected.  All answers are bitwise identical and no public interfaces
are changed.
  Split the loops applying western and eastern open boundary conditions in
apply_u_velocity_OBCs and southern and northern OBCs in apply_v_velocity_OBCs.
These usually cover very different loop ranges, and often there is only one of
the pair of OBCs on any processor, so it should be more efficient to only apply
the OBCs within the loop ranges where they apply (which can be determined
during initialization).  The code still works for arbitrarily located OBC
segments, and all answers are bitwise identical.
  Cleaned up the OBC-related memory allocation and moved it into
initialize_BT_OBC.  Also combined OBC-related halo passes to reduce latency
during the set-up phase of btstep.  Also simplified the code setting the various
gtot values around OBC points.  The OBC argument to btstep_timeloop was replaced
with a new logical variable in the barotropic control structure that is set
during initialization.  All answers are bitwise identical and no public
interfaces are changed.
  Moved the code to calculate the dynamic pressure into the new routine
btloop_add_dyn_PF to simplify btloop_find_PF.  Also revised btcalc to use
the BT_OBC_type to streamline the application of open boundary conditions
in the calculation of frhatu and frhatv.  All answers are bitwise identical
and no public interfaces are changed.
  Refactored btcalc to include the application of open boundary conditions in
the calculation of CS%frhatu and CS%frhatv within the same loops as the
non-OBC calculations, and also to reduce the amount of duplicated code.  This
should increase the model efficiency with open boundary conditions, and have
minimal performance impacts otherwise.  All answers are bitwise identical
and no public interfaces are changed.