Skip to content

Conversation

mjschmidt271
Copy link
Contributor

@mjschmidt271 mjschmidt271 commented Jul 7, 2025

Adds diagnostic output for more tendencies coming out of Microphysics.


This adds some tendencies related to gas-aerosol exchange to the MicrophysDiagnosticArrays struct.

NOTE

Each of these quantities is a 3D field and is not column-integrated (as we see here) or scaled for unit changes (as is done here).

If the user wishes to integrate the quantities over a column, they can request the relevant field (e.g., in output.yaml for a standalone test) with the suffix _vert_sum_dp_weighted appended (see EAMxx docs for more details). That is, to get the 3d field, request mam4_microphys_tendency_nucleation, and if you prefer a column-integrated version, weighted by pseudo_density, then request mam4_microphys_tendency_nucleation_vert_sum_dp_weighted.

[NBFB] for only one optimized standalone test on CUDA, BFB otherwise

add_field<Computed>("mam4_microphys_tendency_renaming", vector3d_num_gas_aerosol_constituents, pow(mol, -1), grid_name);
add_field<Computed>("mam4_microphys_tendency_nucleation", vector3d_num_gas_aerosol_constituents, pow(mol, -1), grid_name);
add_field<Computed>("mam4_microphys_tendency_coagulation", vector3d_num_gas_aerosol_constituents, pow(mol, -1), grid_name);
add_field<Computed>("mam4_microphys_tendency_renaming_cw", vector3d_num_gas_aerosol_constituents, pow(mol, -1), grid_name);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need mam4_microphys_tendency_renaming_cw? I think we only need the interstitial tendencies. Could you also please verify the units? I think the units should be mol/mol or kmol/kmol.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No idea on whether we need the _cw quantity. The confluence page indicates that *_sfgaex2 is desired, and it's my understanding that this is calculated for the cloudwater quantities, as in the variable qqcwgcm_tendaa in mam4_amicphys.hpp. That was based on this bit I found in the fortran code

modal_aero/modal_aero_amicphys.F90:223

character(len=8) :: suffix_q_coltendaa(nqtendaa) = &
     (/ '_sfgaex1', '_sfgaex2', '_sfnnuc1', '_sfcoag1' /)
character(len=8) :: suffix_qqcw_coltendaa(nqqcwtendaa) = '_sfgaex2'

That said, I wanted to include it, just in case, so I could delete rather than have to add code later. Happy to get rid of it if you want

Copy link
Contributor Author

@mjschmidt271 mjschmidt271 Jul 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for the units, it was unclear to me whether the units vary based on which proccess provides the quantity--e.g., one unit for coagulation and another for rename OR units vary based on species, in which case there's not much we can do unless we want to split the diagnostic arrays into even smaller species-level variables.

And this is just based on modal_aero_amicphys.F90:423

// qgcmN and qqcwgcmN (N=1:4) are grid-cell mean tracer mixing ratios
    // (TMRs, mol/mol or #/kmol)
    //    N=1 - before gas-phase chemistry
    //    N=2 - before cloud chemistry
    //    N=3 - incoming values (before gas-aerosol exchange, newnuc, coag)
    //    N=4 - outgoing values (after  gas-aerosol exchange, newnuc, coag)
    // qgcm   (ncnst) : grid cell mean, interstitial constituents (unit does not
    //                  matter)
    // qqcwgcm(ncnst): grid cell mean, cloud-borne  constituents (unit
    //                 does not matter)

In chatting with @mahf708 and @TaufiqHassan, it seemed like the best option would be to got with mol^-1 and account for the numerator and/or factor of 1e3 in post-processing.

I'm certainly open to other solutions, though

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right on both points. Let's keep the _cw species.
Units depend on the species (# mixing ratio vs. volume mixing ratio). We can keep the units as mol^-1. This array is intended to be used to extract individual species, and there we can be more specific about the units.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, got it. That makes sense to me, then!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the field is a "collection" of subfields with different units (much like a FieldGroup), you can also set units to nondimensional and attach an "invalid" string to it. That way, in the output file you will clearly see that the units of this field should NOT be considered.

Once you get the field from the AD, you can also attach additional (and arbitrary) meta data to the field that is intended for IO, like a string explaining this. E.g., in initialize_impl:

using stratts_t = std::map<std::string,std::string>;
auto& f = get_field_out("mam4_microphys_tendency_renaming_cw");
auto& io_stratts = f.get_header().get_extra_data<stratts_t>("io: string attributes");
io_stratts["doc"] = "This is a 'collection' of eterogeneous fields. Each species has different units";

IO will take care to transfer this to the file (even if there are layers of remapping in the IO sequence).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! @mjschmidt271: Could you please add this for the units of these diagnostics variables?

@mjschmidt271 mjschmidt271 changed the title rework diagnostic outputs Adds diagnostic output for *_sfgaex1, *_sfgaex2, *_sfnnuc, *_sfcoag Jul 7, 2025
@mjschmidt271 mjschmidt271 marked this pull request as ready for review July 7, 2025 23:35
Copy link
Contributor

@singhbalwinder singhbalwinder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, Mike!

@singhbalwinder
Copy link
Contributor

I am not sure why mam4_aero_microphys_standalone_baseline_cmp is failing (cuda/opt)....

Copy link
Contributor

@mahf708 mahf708 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd have spelled out "cw" more verbosely, but that's very minor. Looks good!

@singhbalwinder
Copy link
Contributor

We can replace "cw" with "cloud_borne".

@mahf708
Copy link
Contributor

mahf708 commented Jul 8, 2025

We can replace "cw" with "cloud_borne".

yeah, see, to me, "cloud-water" (I think that's what cw is for, right?) and "cloud-borne" aren't interchangeable, so I am often confused when I see "cloud water" --- I think of cloud water as an actual concrete thing that's wholly separate. Anything can be cloud borne (or in-cloud), but only one thing can be "cloud water" ...

@mjschmidt271 mjschmidt271 force-pushed the mjs/eamxx/gas-tend-diags branch from 8394394 to ceafe76 Compare July 9, 2025 02:39
@singhbalwinder
Copy link
Contributor

I looked into the lines that are causing the diffs in the CUDA-optimized build. If we comment out this block, the code is BFB w.r.t the baselines. Although this block is never actually executed (all the if conditions evaluate to false), its presence seems to affect results, likely due to compiler optimizations (e.g., -O3). The RMS differences are very small. I plan to revisit this code in the future to replace the raw arrays with Kokkos views.

Given that the optimized build for CPU/OpenMP is BFB and all other tests (optimized and release) are BFB, I think we should go ahead and merge it. We have seen similar behavior in the past, where optimized builds for some compilers are NBFB even for CPU builds, while the debug builds are BFB.

@rljacob
Copy link
Member

rljacob commented Jul 9, 2025

What happens if you lower optimization to -O2 and leave the block in?

@singhbalwinder
Copy link
Contributor

I was also curious about it, but it will require new baselines, as baselines should also be at -O2. I can run this experiment and report back.

@mahf708
Copy link
Contributor

mahf708 commented Jul 9, 2025

@bartgol @tcclevenger any advice on above?

@bartgol
Copy link
Contributor

bartgol commented Jul 9, 2025

@singhbalwinder did you try to put some print statements to verify that those code blocks are indeed never executed?

@singhbalwinder
Copy link
Contributor

singhbalwinder commented Jul 9, 2025

Yes, I put the prints in and outside of that block to verify that the code in that block is never executed in this test. If we uncomment just a single if condition, I see the diffs in bc_a4, mom_a4, and num_a4 variables only) :

if (diag_arrays.gas_aero_exchange_condensation.size()) {
      for (int i = 0; i < gas_pcnst; ++i) {
        diag_arrays.gas_aero_exchange_condensation(i, kk) =
            gas_aero_exchange_condensation[i];
      }
    }
//(rest of the if statements in that block are commented out)

If I then comment out the assignment line (diag_arrays.gas_aero_exchange_condensation(i, kk) = gas_aero_exchange_condensation[i];) above, the diffs disappear.

I see these diffs with O3, O2, and O1 optimizations. The diffs are very small (relative diff: ~3.5840E-12, absolute diff: ~ 4.0171E-20) and they appear for only this one test. The rest of the tests are passing.

EDIT: My O1 tests were using O3, so I am redoing my O1 test.

@overfelt
Copy link
Contributor

overfelt commented Jul 9, 2025

@singhbalwinder , This seems like a bug that Oscar and then I worked on a few months ago. In that case it was the array allocation itself that caused problems. When the use of the array was commented out the diff went away because the CUDA compiler would not allocate an array when it was not used in a significant way. I could see the allocation because the size of the executable changed by the byte size of the array. Never figured it out. Oscar changed some code for another reason and the diff went away. When you replace the fixed sized arrays with Kokkos views in a later merge request, it will fix itself.

@bartgol
Copy link
Contributor

bartgol commented Jul 9, 2025

That's quite mysterious. Perhaps when you comment the code, the compiler is able to detect that the static arrays are not needed, and does not create them. When uncommented, the compiler cannot do such assumption, and allocates them. These arrays may end up causing a large kernel memory usage, as they are allocated per thread. This may somehow hint at some race condition on some scratch memory, something that may be hard to track down. Out of curiosity, are the results BFB across consecutive runs?

I think in general we should avoid to allocate non-trivial arrays on a per-thread basis, as it's done here. I don't know what gas_pcnst is, but I assume it's ~40? Having 5 of these arrays means ~200 doubles per thread, which is 1.6KB. This can contribute to limit the size of a block on CUDA, as the amount of shared memory is limited (e.g., A100 has 164KB / SM, so it could only accommodate ~100 threads/block, 10x less than the max allowed, and that's only counting these 5 arrays).

Edit: sorry, I mixed up local and shared memory. But the bottom line still holds: we need to be careful with how much stack memory we request on a per-thread basis.

@singhbalwinder
Copy link
Contributor

Thanks @bartgol and @overfelt. I removed the static arrays and used views directly. The code has definitely improved, but the baseline comparison failure persists for the CUDA opt build.

@singhbalwinder singhbalwinder changed the title Adds diagnostic output for *_sfgaex1, *_sfgaex2, *_sfnnuc, *_sfcoag EAMxx: Adds diagnostic output for *_sfgaex1, *_sfgaex2, *_sfnnuc, *_sfcoag Jul 11, 2025
@singhbalwinder singhbalwinder self-assigned this Jul 11, 2025
@singhbalwinder singhbalwinder added EAMxx Issues related to EAMxx MAM4xx MAM4xx related changes labels Jul 11, 2025
@singhbalwinder
Copy link
Contributor

I repeated this test many times, and the results are BFB reproducible. So, this change is not causing any nondeterminism.

Comment on lines 166 to 176
void get_field_set_io_docstring(const std::string &fname) {
using str_atts_t = std::map<std::string,std::string>;
auto &f = get_field_out(fname);
auto &io_str_atts = f.get_header().get_extra_data<str_atts_t>("io: string attributes");
io_str_atts["doc"] = "Collection of heterogeneous quantities--species have different units";
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's add the docs of these as long_name attributes following https://docs.e3sm.org/E3SM/EAMxx/user/io_metadata/ instead of the manual string attires.

@bartgol I think the long name is more appropriate for this, what do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the io_str_atts["doc"] be the long name for these diagnostics?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think adding these docs is important, but if you are going to do it, you might as well do it properly following the link above. The long name is meant to be descriptive of the quantity, including caveats about it. So, I would add a a few-word description of each quantity, then add the caveat you're adding here.

If Luca says otherwise, that's fine. The worry is, without a central location, these docs strings may become unwiedly... and that's precisely why we made the io_metadata workflow

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can modify this to include the "long name". I will wait for @bartgol to chime in before making this change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought long name was only supposed to be a descriptive name, so I thought adding an extra metadata attribute for "notes" was ok. If you guys think long name is the place to include also caveats/notes and other miscellanea info, that's fine with me.

@singhbalwinder singhbalwinder force-pushed the mjs/eamxx/gas-tend-diags branch from 3393b59 to 6340b01 Compare July 16, 2025 03:32
// Diagnostics: tendencies due to gas phase chemistry [mixed units: kg/kg/s or #/kg/s]
add_field<Computed>("mam4_microphysics_tendency_gas_phase_chemistry", vector3d_num_gas_aerosol_constituents, nondim, grid_name);

// Diagnostics: tendencies due to aqueous chemistry [mixed units: kg/kg/s or #/kg/s]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is basically perfect for a long name (you just need to add "mam4 microphys" or something to the start and it will be 🍑

@singhbalwinder
Copy link
Contributor

We have addressed all the reviewers' comments, and this PR is ready to merge. We have one test failing. I think we can still merge as the diff is very small and it is just in the optimized build. All other optimized and debug tests are BFB.

Comment on lines +166 to +176
void add_io_docstring_to_fields_with_mixed_units(const std::map<std::string, std::string> &flds) {
using str_atts_t = std::map<std::string,std::string>;
for (const auto &pair : flds) {
// Get the field, and add a docstring to its string attributes
// This is used to document that the field contains heterogeneous
// quantities, i.e., species have different units.
auto &f = get_field_out(pair.first);
auto &io_str_atts = f.get_header().get_extra_data<str_atts_t>("io: string attributes");
io_str_atts["doc"] = pair.second;
}
}
Copy link
Contributor

@mahf708 mahf708 Jul 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think this is not a good idea. This will be forgotten and having a non-standard "doc" entry in the netcdf files isn't going to be all that illuminating or helpful, frankly. I suggest two things:

  1. For now: Add these strings (you can keep them exactly as they are) to the io_metadata csv file in the link I shared under long_name column
  2. For later: Add a src/diagnostic impl to parse out these fields into the more regular fields (col, lev) instead of the non-obvious fields (col, something_no_user_will_be_ever_able_to_figure_out_readily, lev)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, these fields are there only to expose these quantities. Ideally, we should have fields with names that make sense (e.g., so4_mam4_gas_phase_tendency) and extract specific tracers/species from these fields.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But up to you if you wanna do this right away, or merge and do later. I am not picky about these doc strings (so I am happy to pretend they are not there ;))

Copy link
Contributor

@mahf708 mahf708 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to worry about the roundoff diff, I'd say merge and monitor for a few days to ensure we are not hitting non-determinism.

@singhbalwinder singhbalwinder added the non-BFB PR makes roundoff changes to answers. label Jul 18, 2025
singhbalwinder added a commit that referenced this pull request Jul 18, 2025
EAMxx: Adds diagnostic output for *_sfgaex1, *_sfgaex2, *_sfnnuc, *_sfcoag

Adds diagnostic output for more tendencies coming out of Microphysics.

This adds some tendencies related to gas-aerosol exchange to the
MicrophysDiagnosticArrays struct.

NOTE
Each of these quantities is a 3D field and is not column-integrated
(as we see here) or scaled for unit changes (as is done here).

If the user wishes to integrate the quantities over a column, they
can request the relevant field (e.g., in output.yaml for a standalone
test) with the suffix _vert_sum_dp_weighted appended (see EAMxx docs
for more details). That is, to get the 3d field, request
mam4_microphys_tendency_nucleation, and if you prefer a
column-integrated version, weighted by pseudo_density, then request
mam4_microphys_tendency_nucleation_vert_sum_dp_weighted.

[NBFB] for only one optimized standalone test on CUDA, BFB otherwise

* mjs/eamxx/gas-tend-diags:
  Adds long name and message about mixed units
  Update MAM4xx submodule to include view2d definition
  Clang for MAM4xx codes only
  Use 2D to fetch diagnostics rather than subviewing
  Adds mam4xx updates
  A working version for only one diag-needs cleanup
  microphys -> microphysics
  Working version of io docstring
  Code may not build - Moves the io doc string call to initialize impl
  add metadata strings to diag fields and units
  Update MAM4xx: changes for to parallel_for
  Removes arrays and use views directly in MAM4xx submodule
  disable output of new diagnostics
  switch _cw to _cloud_borne
  rework diagnostic outputs
@singhbalwinder singhbalwinder merged commit ab379b8 into master Jul 18, 2025
17 of 18 checks passed
@singhbalwinder singhbalwinder deleted the mjs/eamxx/gas-tend-diags branch July 18, 2025 13:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
EAMxx Issues related to EAMxx MAM4xx MAM4xx related changes non-BFB PR makes roundoff changes to answers.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants