EAMxx: Fix theta for surface coupling export bug #7434

whannah1 · 2025-06-11T16:21:58Z

This updates the value of potential temperature exported to the surface models to be consistent with how the coupler calculates fluxes.

Fixes #7407

[non-BFB]

mahf708

Will do a clean-up PR later related to other names in this file (as discussed with @tcclevenger). Outside scope of PR. For now, this is good to go and will achieve the desired outcome. Let's not merge until @oksanaguba approves.

rljacob · 2025-06-11T16:35:48Z

Can you provide any more detail about how this PR changes the climate?

oksanaguba · 2025-06-11T16:38:39Z

I am puzzled by the choice of p0 in formula Exner=(p/p0)^kappa. AMS uses p0=1e5, https://glossary.ametsoc.org/wiki/Potential_temperature , why would we use something else? Is it going back to the idea that Exner ~= 1 at the bottom mid level? Why is it important for it to be 1? (considering that no matter what const we choose, it won't be 1 over a lot of the domain anyway).

Separately, are we sure the proposed code is consistent with coupler's code? I found Fortran definition of Exner

          phys_state(lchnk)%exner (i,k) = (phys_state(lchnk)%pint(i,pver+1) &
                                          /phys_state(lchnk)%pmid(i,k))**cappa

which I am not sure is used in eamxx. What does the coupler do in case of eamxx runs?

oksanaguba · 2025-06-11T16:39:42Z

Separately, IIRC shoc and p3 will use p0=1e5, which makes the proposed definition inconsistent with physics?

bogensch · 2025-06-11T16:51:54Z

This is the Exner function that EAMxx uses. It also uses p0=1e5. The hybrid_ref_pmid_bot term in Walter's PR is NOT p0, it is p.

E3SM/components/eamxx/src/share/util/eamxx_common_physics_functions_impl.hpp

Lines 85 to 97 in f7f3997

    
           template<typename DeviceT> 
        
           template<typename ScalarT> 
        
           KOKKOS_INLINE_FUNCTION 
        
           ScalarT PhysicsFunctions<DeviceT>::exner_function(const ScalarT& pressure) 
        
           { 
        
             using C = scream::physics::Constants<Real>; 
        
             static constexpr auto p0 = C::P0; 
        
             static constexpr auto rd = C::RD; 
        
             static constexpr auto inv_cp = C::INV_CP; 
        
             return pow( pressure/p0, rd*inv_cp ); 
        
           }

oksanaguba · 2025-06-11T16:56:12Z

Ok, thanks, while i am grepping more, could you possibly please provide a better description in the PR, with formulas instead of words? Also, if this is consistent with the coupler, which code in the coupler are we talking about? thanks.

bogensch · 2025-06-11T16:57:01Z

It is important that Exner be 1 at the surface (that is the bottom interface level) because that is what the coupler assumes. The coupler does not have an Exner function but it assumes that the sea surface temperature (SST) absolute temperature is identical to the potential temperature. The only way to ensure this is for Exner to be 1 at the surface. Again, the potential temperature value we are passing to the coupler is NOT the surface value, but the mid-point directly above the surface. Thus Exner should be ~1 at this level and use a formulation that evaluates exactly to 1 at the surface.

mahf708

Note we need to fix the tests (spoke to Walter separately). Also, address Rob's and Oksana's concerns... requesting changes and changing to draft to get this to run once in the CI.

oksanaguba · 2025-06-11T17:03:20Z

Re The coupler does not have an Exner function but it assumes that the sea surface temperature (SST) absolute temperature is identical to the potential temperature. -- would you please point me towards the code?

Separately, if the fix for Exner=1 is needed at the bottom, isn't it easier just to use that explicitly instead to trying to guess how to make it 1?

bogensch · 2025-06-11T17:15:45Z

Again the value we are actually evaluating is NOT at the surface. It is the mid-point directly above. Thus we need to calculate it, we are not guessing it.

E3SM/share/util/shr_flux_mod.F90

Line 686 in f7f3997

real(R8) ,intent(in) :: thbot(nMax) ! atm potential T (K)

E3SM/share/util/shr_flux_mod.F90

Line 699 in f7f3997

real(R8) ,intent(in) :: ts (nMax) ! ocn temperature (K)

E3SM/share/util/shr_flux_mod.F90

Line 821 in f7f3997

    
           delth = thbot(n) - ts(n)                     ! Pot. temp. difference with surface (K)

The above line shows the crucial line where it takes potential temperature that we are passing to the coupler from the lowest mid-point level (again, NOT the surface) difference with SST (ts). The only way this computation is valid since SST (ts) is temperature is assuming that absolute temperature = potential temperature EXACTLY for this variable. This is from the shr_flux_atmOcn_UA scheme that EAMxx uses (though this is identical in the shr_flux_atmOcn scheme).

oksanaguba · 2025-06-11T17:48:18Z

I see no issues with the code above -- variable ts (which the comment says comes from the ocean) is theta_s, as you said, because at the sea level p=p0 (not sure it is true everywhere, but it is a different question). So clearly the original code computes delta(theta), not delta(T), with the usual definition of Exner (p/p0)^kappa. Also, their paper states it is delta(theta) https://journals.ametsoc.org/view/journals/clim/11/10/1520-0442_1998_011_2628_iobaaf_2.0.co_2.xml .

I believe that you may see better climo with some changes in this code, the whole surf fluxes code may use much more attention wrt tuning, it is just not clear that defining Exner differently is the proper explanation of the changes. Also, it seems that EAM code (unintentionally?) made an inconsistent change in Exner definition. If EAM climo is the target here, would it make sense to make the same change as EAM for now instead?

whannah1 · 2025-06-11T18:06:44Z

variable ts (which the comment says comes from the ocean) is theta_s, as you said, because at the sea level p=p0

@oksanaguba That is not a correct statement

at sea level the "real" pressure is still variable in time and space - not equal to p0

whannah1 · 2025-06-11T18:09:50Z

Can you provide any more detail about how this PR changes the climate?

@rljacob We only have an ne256 test run by Hassan with a different, but functionally identical, version of this fix. I think those results are posted on confluence, but we should probably run a new 1-yr ne256 test.

oksanaguba · 2025-06-11T18:20:21Z

at sea level the "real" pressure is still variable in time and space - not equal to p0 -- isnt it then better to change ts -> ts/Exner instead?

whannah1 · 2025-06-11T18:28:29Z

at sea level the "real" pressure is still variable in time and space - not equal to p0 -- isnt it then better to change ts -> ts/Exner instead?

Maybe, but I think any way we change this code would require changing EAM. Peter and I talked about this and weren't sure which option made more sense. If the change to EAMv3 were non-BFB (not sure if this is avoidable or not) then we would have to undertake an extensive effort to validate it for both LR and HR configurations.

oksanaguba · 2025-06-11T19:38:19Z

thanks, then could you instead adopt what EAM is doing? It is using Exner=(p/p_bot)^kappa for that code for thbot computation. it also alters surf coupling scheme, but this way we would remember that both EAM and eamxx use the same modification.

whannah1 · 2025-06-11T22:24:38Z

thanks, then could you instead adopt what EAM is doing? It is using Exner=(p/p_bot)^kappa for that code for thbot computation. it also alters surf coupling scheme, but this way we would remember that both EAM and eamxx use the same modification.

That's exactly what Naser's previous PR was meant to facilitate - but I thought you strongly argued against that approach?

oksanaguba · 2025-06-11T22:51:03Z

thanks for the explanation, yes, let's proceed with Naser's original PR https://github.com/E3SM-Project/E3SM/pull/7411/files ; maybe, let's put an explanation in it about what's going on.

whannah1 · 2025-06-12T14:28:50Z

thanks for the explanation, yes, let's proceed with Naser's original PR https://github.com/E3SM-Project/E3SM/pull/7411/files ; maybe, let's put an explanation in it about what's going on.

I was actually convinced by your argument that Naser's approach is bad and that this approach is better, so I don't think we should switch back. Can you explain in more detail why you now think it's ok to allow the value of P0 to change in these rare circumstances?

Are you arguing that Naser's approach was easier to understand? Because I feel the current approach is more transparent. To be clear, both could be described as "hacks" to accommodate the quirk of the coupler flux calculations - we just need to decide which hack is "better".

oksanaguba · 2025-06-12T23:02:45Z

I like Naser's approach because it is the same as in EAM v3. Both approaches, this one and EAM's, are tunings of the surf. coupling schemes. They both do not fix anything, because to fix surf. coupling, we should convert ts into theta_surf variable. This is how the original formulation in papers is done. So both approaches act as a change in some coefficient in surf. coupling parameterizations. If so, imo it is better to stick to the same approach in eamxx as in v3 instead of 2 different one.

Separately, if you are concerned about preserving v3 code, but would like to change ts into theta_s, it probably can be done with pragmas, so that v3 runs with the old version of the code, but the new version runs in scream.

oksanaguba · 2025-06-13T02:04:00Z

Were yesterday's results from Hassan done with Naser's fix? did anyone compare climo from both (if it was presented yesterday, i missed that)?

mahf708 · 2025-06-13T14:00:49Z

I don't think any legit evaluation of this has happened in global runs; I think it was done with a constant value as a preliminary test.

So, here's my opinion: We can go back to the EAM-style hack. I don't think my PR is the correct approach though. At first, I got the impression that it was customary to define the exner function with different references (p0 vs p_ref vs something else). I now agree with Oksana that is just wrong. There's obviously a problem in how all of this was happening in EAM/CAM, etc., including the coupler requiring the odd specs.

I see two paths to resolving this:

Preferably, fix this problem at the root, no matter how annoying and invasive. The essential part of this is to ensure consistent definitions across all corners of atm, including how the coupler interfaces with it. This will necessitate changes to the delth = thbot(n) - ts(n) line eventually.
Less preferably, implement workarounds and hacks like in EAM. In that case, I propose we just inline calculation the stuff that needs to be calculated, see below for an example.

    if (export_source(idx_Sa_ptem)==FROM_MODEL) {
-      Sa_ptem(i) = PF::calculate_theta_from_T(s_T_mid_i(num_levs-1), s_p_mid_i(num_levs-1));
+      // HACK AHEAD: add lengthy note explaining the rationale
+      Sa_ptem(i) = s_T_mid_i(num_levs-1) / pow(s_p_mid_i(num_levs-1)/s_p_int_i(num_levs), PC::RD*C::INV_CP);

whannah1 · 2025-06-13T14:10:51Z

If we do stick with the hack approach (which I'm ok with) it seems there will be a non-negligible difference between the two possible approaches because of the difference in pressures

Naser's PR replacing P0 => s_p_mid_i(num_levs-1) vs s_p_int_i(num_levs)
This PR replacing P => p_ref_mid(num_levs-1) vs P0

The former allows exner of the lowest layer to vary in space, while the latter effectively makes exner a constant everywhere.

I'm struggling to form an opinion on whether the impact of this choice is worth testing... In the global mean it's probably negligible, but perhaps locally it could make a notable difference, especially at higher latitudes where the pressure gradients are stronger. Both still address the issue that Peter identified.

oksanaguba · 2025-06-13T17:00:30Z

Do you guys want to decide on which fix to implement using climo or using arguments about concepts? It seems there are 3 options: 1) Naser's original, 2) Walter's, and 3) trying to replace ts (without affecting v3). Imo climo is the most important argument, esp. if strong biases can be eliminated. If not using climo, I would have a slight preference towards ts fix as it is most conceptually correct. Glad we were able to discuss this all in detail, thanks.

bogensch · 2025-06-16T21:04:00Z

For what it’s worth, I ran a DPxx case that’s particularly sensitive to this bug—specifically the CGILS S12 location (a coupled stratocumulus case, or “stratus” for the lay person 😉). I tested three versions: one with the current master (i.e., with the bug), one with Walter’s fix (this PR), and one with Naser’s fix (from the previously aborted PR). As expected, the results from Walter’s and Naser’s fixes are nearly identical.

Personally, I don’t think we need to spend too much time agonizing over which fix to go with. But if you put a gun to my head, I’d lean toward Naser’s PR—for many of the reasons Oksana already laid out above.

Diags from my runs are available here. At first glance, the profile plots look bit-for-bit identical, but there are some subtle differences—check out the 1D time series plots for those (just to prove I'm not accidentally plotting the same run twice!).

whannah1 · 2025-06-17T15:32:55Z

As expected, the results from Walter’s and Naser’s fixes are nearly identical.

This makes sense, glad you checked though.

Personally, I don’t think we need to spend too much time agonizing over which fix to go with. But if you put a gun to my head, I’d lean toward Naser’s PR—for many of the reasons Oksana already laid out above.

I think I'm leaning this way as well, but I don't know how to decide when the difference in the results is negligible.

whannah1 · 2025-06-17T18:19:13Z

Closing this branch to start fresh with a new PR that implement a simplified version of Naser's approach.
see PR #7441

whannah1 added 2 commits June 11, 2025 11:15

modify theta calculation for coupling exporter

de38e78

remove Spack to simplify bug fix

c7ffa37

whannah1 requested review from bogensch and mahf708 June 11, 2025 16:21

whannah1 assigned brhillman Jun 11, 2025

whannah1 added bug non-BFB PR makes roundoff changes to answers. EAMxx Issues related to EAMxx labels Jun 11, 2025

mahf708 requested review from oksanaguba and tcclevenger June 11, 2025 16:23

mahf708 added the CC PR is climate changing label Jun 11, 2025

mahf708 approved these changes Jun 11, 2025

View reviewed changes

mahf708 added bug fix PR and removed bug labels Jun 11, 2025

mahf708 requested changes Jun 11, 2025

View reviewed changes

mahf708 marked this pull request as draft June 11, 2025 17:02

move hybrid_ref_pmid_bot into kernel

54d0883

update surface_coupling test

f0d4d14

fix surface coupling test

28fd66d

whannah1 closed this Jun 17, 2025

whannah1 deleted the whannah/eamxx/fix-theta-cpl-bug branch June 17, 2025 18:20

EAMxx: Fix theta for surface coupling export bug #7434

EAMxx: Fix theta for surface coupling export bug #7434

Uh oh!

Conversation

whannah1 commented Jun 11, 2025 • edited by mahf708 Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mahf708 left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rljacob commented Jun 11, 2025

Uh oh!

oksanaguba commented Jun 11, 2025

Uh oh!

oksanaguba commented Jun 11, 2025

Uh oh!

bogensch commented Jun 11, 2025

Uh oh!

oksanaguba commented Jun 11, 2025

Uh oh!

bogensch commented Jun 11, 2025

Uh oh!

mahf708 left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

oksanaguba commented Jun 11, 2025

Uh oh!

bogensch commented Jun 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

oksanaguba commented Jun 11, 2025

Uh oh!

whannah1 commented Jun 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

whannah1 commented Jun 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

oksanaguba commented Jun 11, 2025

Uh oh!

whannah1 commented Jun 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

oksanaguba commented Jun 11, 2025

Uh oh!

whannah1 commented Jun 11, 2025

Uh oh!

oksanaguba commented Jun 11, 2025

Uh oh!

whannah1 commented Jun 12, 2025

Uh oh!

oksanaguba commented Jun 12, 2025

Uh oh!

oksanaguba commented Jun 13, 2025

Uh oh!

mahf708 commented Jun 13, 2025

Uh oh!

whannah1 commented Jun 13, 2025

Uh oh!

oksanaguba commented Jun 13, 2025

Uh oh!

bogensch commented Jun 16, 2025

Uh oh!

whannah1 commented Jun 17, 2025

Uh oh!

whannah1 commented Jun 17, 2025

Uh oh!

Uh oh!

whannah1 commented Jun 11, 2025 •

edited by mahf708

Loading

mahf708 left a comment •

edited

Loading

mahf708 left a comment •

edited

Loading

bogensch commented Jun 11, 2025 •

edited

Loading

whannah1 commented Jun 11, 2025 •

edited

Loading

whannah1 commented Jun 11, 2025 •

edited

Loading

whannah1 commented Jun 11, 2025 •

edited

Loading