Skip to content

Conversation

whannah1
Copy link
Contributor

@whannah1 whannah1 commented Jun 11, 2025

This updates the value of potential temperature exported to the surface models to be consistent with how the coupler calculates fluxes.

Fixes #7407

[non-BFB]

@whannah1 whannah1 requested review from bogensch and mahf708 June 11, 2025 16:21
@whannah1 whannah1 added bug non-BFB PR makes roundoff changes to answers. EAMxx Issues related to EAMxx labels Jun 11, 2025
@mahf708 mahf708 added the CC PR is climate changing label Jun 11, 2025
Copy link
Contributor

@mahf708 mahf708 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do a clean-up PR later related to other names in this file (as discussed with @tcclevenger). Outside scope of PR. For now, this is good to go and will achieve the desired outcome. Let's not merge until @oksanaguba approves.

@mahf708 mahf708 added bug fix PR and removed bug labels Jun 11, 2025
@rljacob
Copy link
Member

rljacob commented Jun 11, 2025

Can you provide any more detail about how this PR changes the climate?

@oksanaguba
Copy link
Contributor

I am puzzled by the choice of p0 in formula Exner=(p/p0)^kappa. AMS uses p0=1e5, https://glossary.ametsoc.org/wiki/Potential_temperature , why would we use something else? Is it going back to the idea that Exner ~= 1 at the bottom mid level? Why is it important for it to be 1? (considering that no matter what const we choose, it won't be 1 over a lot of the domain anyway).

Separately, are we sure the proposed code is consistent with coupler's code? I found Fortran definition of Exner

          phys_state(lchnk)%exner (i,k) = (phys_state(lchnk)%pint(i,pver+1) &
                                          /phys_state(lchnk)%pmid(i,k))**cappa

which I am not sure is used in eamxx. What does the coupler do in case of eamxx runs?

@oksanaguba
Copy link
Contributor

Separately, IIRC shoc and p3 will use p0=1e5, which makes the proposed definition inconsistent with physics?

@bogensch
Copy link
Contributor

This is the Exner function that EAMxx uses. It also uses p0=1e5. The hybrid_ref_pmid_bot term in Walter's PR is NOT p0, it is p.

template<typename DeviceT>
template<typename ScalarT>
KOKKOS_INLINE_FUNCTION
ScalarT PhysicsFunctions<DeviceT>::exner_function(const ScalarT& pressure)
{
using C = scream::physics::Constants<Real>;
static constexpr auto p0 = C::P0;
static constexpr auto rd = C::RD;
static constexpr auto inv_cp = C::INV_CP;
return pow( pressure/p0, rd*inv_cp );
}

@oksanaguba
Copy link
Contributor

Ok, thanks, while i am grepping more, could you possibly please provide a better description in the PR, with formulas instead of words? Also, if this is consistent with the coupler, which code in the coupler are we talking about? thanks.

@bogensch
Copy link
Contributor

It is important that Exner be 1 at the surface (that is the bottom interface level) because that is what the coupler assumes. The coupler does not have an Exner function but it assumes that the sea surface temperature (SST) absolute temperature is identical to the potential temperature. The only way to ensure this is for Exner to be 1 at the surface. Again, the potential temperature value we are passing to the coupler is NOT the surface value, but the mid-point directly above the surface. Thus Exner should be ~1 at this level and use a formulation that evaluates exactly to 1 at the surface.

Copy link
Contributor

@mahf708 mahf708 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note we need to fix the tests (spoke to Walter separately). Also, address Rob's and Oksana's concerns... requesting changes and changing to draft to get this to run once in the CI.

@mahf708 mahf708 marked this pull request as draft June 11, 2025 17:02
@oksanaguba
Copy link
Contributor

Re The coupler does not have an Exner function but it assumes that the sea surface temperature (SST) absolute temperature is identical to the potential temperature. -- would you please point me towards the code?

Separately, if the fix for Exner=1 is needed at the bottom, isn't it easier just to use that explicitly instead to trying to guess how to make it 1?

@bogensch
Copy link
Contributor

bogensch commented Jun 11, 2025

Again the value we are actually evaluating is NOT at the surface. It is the mid-point directly above. Thus we need to calculate it, we are not guessing it.

real(R8) ,intent(in) :: thbot(nMax) ! atm potential T (K)

real(R8) ,intent(in) :: ts (nMax) ! ocn temperature (K)

delth = thbot(n) - ts(n) ! Pot. temp. difference with surface (K)

The above line shows the crucial line where it takes potential temperature that we are passing to the coupler from the lowest mid-point level (again, NOT the surface) difference with SST (ts). The only way this computation is valid since SST (ts) is temperature is assuming that absolute temperature = potential temperature EXACTLY for this variable. This is from the shr_flux_atmOcn_UA scheme that EAMxx uses (though this is identical in the shr_flux_atmOcn scheme).

@oksanaguba
Copy link
Contributor

I see no issues with the code above -- variable ts (which the comment says comes from the ocean) is theta_s, as you said, because at the sea level p=p0 (not sure it is true everywhere, but it is a different question). So clearly the original code computes delta(theta), not delta(T), with the usual definition of Exner (p/p0)^kappa. Also, their paper states it is delta(theta) https://journals.ametsoc.org/view/journals/clim/11/10/1520-0442_1998_011_2628_iobaaf_2.0.co_2.xml .

I believe that you may see better climo with some changes in this code, the whole surf fluxes code may use much more attention wrt tuning, it is just not clear that defining Exner differently is the proper explanation of the changes. Also, it seems that EAM code (unintentionally?) made an inconsistent change in Exner definition. If EAM climo is the target here, would it make sense to make the same change as EAM for now instead?

@whannah1
Copy link
Contributor Author

whannah1 commented Jun 11, 2025

variable ts (which the comment says comes from the ocean) is theta_s, as you said, because at the sea level p=p0

@oksanaguba That is not a correct statement

at sea level the "real" pressure is still variable in time and space - not equal to p0

@whannah1
Copy link
Contributor Author

whannah1 commented Jun 11, 2025

Can you provide any more detail about how this PR changes the climate?

@rljacob We only have an ne256 test run by Hassan with a different, but functionally identical, version of this fix. I think those results are posted on confluence, but we should probably run a new 1-yr ne256 test.

@oksanaguba
Copy link
Contributor

at sea level the "real" pressure is still variable in time and space - not equal to p0 -- isnt it then better to change ts -> ts/Exner instead?

@whannah1
Copy link
Contributor Author

whannah1 commented Jun 11, 2025

at sea level the "real" pressure is still variable in time and space - not equal to p0 -- isnt it then better to change ts -> ts/Exner instead?

Maybe, but I think any way we change this code would require changing EAM. Peter and I talked about this and weren't sure which option made more sense. If the change to EAMv3 were non-BFB (not sure if this is avoidable or not) then we would have to undertake an extensive effort to validate it for both LR and HR configurations.

@oksanaguba
Copy link
Contributor

thanks, then could you instead adopt what EAM is doing? It is using Exner=(p/p_bot)^kappa for that code for thbot computation. it also alters surf coupling scheme, but this way we would remember that both EAM and eamxx use the same modification.

@whannah1
Copy link
Contributor Author

thanks, then could you instead adopt what EAM is doing? It is using Exner=(p/p_bot)^kappa for that code for thbot computation. it also alters surf coupling scheme, but this way we would remember that both EAM and eamxx use the same modification.

That's exactly what Naser's previous PR was meant to facilitate - but I thought you strongly argued against that approach?

@oksanaguba
Copy link
Contributor

thanks for the explanation, yes, let's proceed with Naser's original PR https://github.com/E3SM-Project/E3SM/pull/7411/files ; maybe, let's put an explanation in it about what's going on.

@whannah1
Copy link
Contributor Author

thanks for the explanation, yes, let's proceed with Naser's original PR https://github.com/E3SM-Project/E3SM/pull/7411/files ; maybe, let's put an explanation in it about what's going on.

I was actually convinced by your argument that Naser's approach is bad and that this approach is better, so I don't think we should switch back. Can you explain in more detail why you now think it's ok to allow the value of P0 to change in these rare circumstances?

Are you arguing that Naser's approach was easier to understand? Because I feel the current approach is more transparent. To be clear, both could be described as "hacks" to accommodate the quirk of the coupler flux calculations - we just need to decide which hack is "better".

@oksanaguba
Copy link
Contributor

I like Naser's approach because it is the same as in EAM v3. Both approaches, this one and EAM's, are tunings of the surf. coupling schemes. They both do not fix anything, because to fix surf. coupling, we should convert ts into theta_surf variable. This is how the original formulation in papers is done. So both approaches act as a change in some coefficient in surf. coupling parameterizations. If so, imo it is better to stick to the same approach in eamxx as in v3 instead of 2 different one.

Separately, if you are concerned about preserving v3 code, but would like to change ts into theta_s, it probably can be done with pragmas, so that v3 runs with the old version of the code, but the new version runs in scream.

@oksanaguba
Copy link
Contributor

Were yesterday's results from Hassan done with Naser's fix? did anyone compare climo from both (if it was presented yesterday, i missed that)?

@mahf708
Copy link
Contributor

mahf708 commented Jun 13, 2025

I don't think any legit evaluation of this has happened in global runs; I think it was done with a constant value as a preliminary test.

So, here's my opinion: We can go back to the EAM-style hack. I don't think my PR is the correct approach though. At first, I got the impression that it was customary to define the exner function with different references (p0 vs p_ref vs something else). I now agree with Oksana that is just wrong. There's obviously a problem in how all of this was happening in EAM/CAM, etc., including the coupler requiring the odd specs.

I see two paths to resolving this:

  • Preferably, fix this problem at the root, no matter how annoying and invasive. The essential part of this is to ensure consistent definitions across all corners of atm, including how the coupler interfaces with it. This will necessitate changes to the delth = thbot(n) - ts(n) line eventually.
  • Less preferably, implement workarounds and hacks like in EAM. In that case, I propose we just inline calculation the stuff that needs to be calculated, see below for an example.
    if (export_source(idx_Sa_ptem)==FROM_MODEL) {
-      Sa_ptem(i) = PF::calculate_theta_from_T(s_T_mid_i(num_levs-1), s_p_mid_i(num_levs-1));
+      // HACK AHEAD: add lengthy note explaining the rationale
+      Sa_ptem(i) = s_T_mid_i(num_levs-1) / pow(s_p_mid_i(num_levs-1)/s_p_int_i(num_levs), PC::RD*C::INV_CP);

@whannah1
Copy link
Contributor Author

If we do stick with the hack approach (which I'm ok with) it seems there will be a non-negligible difference between the two possible approaches because of the difference in pressures

  1. Naser's PR replacing P0 => s_p_mid_i(num_levs-1) vs s_p_int_i(num_levs)
  2. This PR replacing P => p_ref_mid(num_levs-1) vs P0

The former allows exner of the lowest layer to vary in space, while the latter effectively makes exner a constant everywhere.

I'm struggling to form an opinion on whether the impact of this choice is worth testing... In the global mean it's probably negligible, but perhaps locally it could make a notable difference, especially at higher latitudes where the pressure gradients are stronger. Both still address the issue that Peter identified.

@oksanaguba
Copy link
Contributor

Do you guys want to decide on which fix to implement using climo or using arguments about concepts? It seems there are 3 options: 1) Naser's original, 2) Walter's, and 3) trying to replace ts (without affecting v3). Imo climo is the most important argument, esp. if strong biases can be eliminated. If not using climo, I would have a slight preference towards ts fix as it is most conceptually correct. Glad we were able to discuss this all in detail, thanks.

@bogensch
Copy link
Contributor

For what it’s worth, I ran a DPxx case that’s particularly sensitive to this bug—specifically the CGILS S12 location (a coupled stratocumulus case, or “stratus” for the lay person 😉). I tested three versions: one with the current master (i.e., with the bug), one with Walter’s fix (this PR), and one with Naser’s fix (from the previously aborted PR). As expected, the results from Walter’s and Naser’s fixes are nearly identical.

Personally, I don’t think we need to spend too much time agonizing over which fix to go with. But if you put a gun to my head, I’d lean toward Naser’s PR—for many of the reasons Oksana already laid out above.

Diags from my runs are available here. At first glance, the profile plots look bit-for-bit identical, but there are some subtle differences—check out the 1D time series plots for those (just to prove I'm not accidentally plotting the same run twice!).

@whannah1
Copy link
Contributor Author

As expected, the results from Walter’s and Naser’s fixes are nearly identical.

This makes sense, glad you checked though.

Personally, I don’t think we need to spend too much time agonizing over which fix to go with. But if you put a gun to my head, I’d lean toward Naser’s PR—for many of the reasons Oksana already laid out above.

I think I'm leaning this way as well, but I don't know how to decide when the difference in the results is negligible.

@whannah1
Copy link
Contributor Author

Closing this branch to start fresh with a new PR that implement a simplified version of Naser's approach.
see PR #7441

@whannah1 whannah1 closed this Jun 17, 2025
@whannah1 whannah1 deleted the whannah/eamxx/fix-theta-cpl-bug branch June 17, 2025 18:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug fix PR CC PR is climate changing EAMxx Issues related to EAMxx non-BFB PR makes roundoff changes to answers.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

EAMxx: Passing surface coupler inconsistent potential temperature
6 participants