-
Notifications
You must be signed in to change notification settings - Fork 434
EAMxx: add zonal average diagnostic and update IO for custom CMP-like tag names #7261
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks quite useful. I have one question: do you really care about having a new FieldTag (BIN
) or do you just care about the name it gets in the output nc file?
If you are ok with using the existing CMP
(with a different string attached, like "bin"), I can quickly add a PR (or push to your branch, whatever you prever), so that we can do that.
Edit: I'm not opposed to add a new tag BIN
, since it carries a different meaning than CMP
. If we keep BIN
though, we need to be a bit more careful with the LayoutType calculation, since it may impact downstream remappers. If we piggy back on CMP
+"bin", we can remove the LayoutType issue. Either way works for me.
@@ -74,3 +74,6 @@ CreateDiagTest(atm_backtend "atm_backtend_test.cpp") | |||
|
|||
# Test horizontal averaging | |||
CreateDiagTest(horiz_avg "horiz_avg_test.cpp") | |||
|
|||
# Test zonal averaging | |||
CreateDiagTest(zonal_avg "zonal_avg_test.cpp") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the non-local nature of the diag, we should prob test for 2+ ranks as well:
MPI_RANKS 1 ${SCREAM_TEST_MAX_RANKS})
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be addressed in b610ba4
// TODO: doing cuda-aware MPI allreduce would be ~10% faster | ||
Kokkos::fence(); | ||
result.sync_to_host(); | ||
comm->all_reduce(result.template get_internal_view_data<Real, Host>(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just so we're clear, so far this diagnostic will NOT be PE-invariant. That said, @oksanaguba is working on implementing bridges to the repro_sum f90 module, so we can revisit this later. The horiz_avg diag already suffers from the same issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense: the horiz_avg diag is the one I modeled this diag after. I'll keep an eye out for those bridges!
I'm thinking your CMP+"bin" suggestion is the way to go for now... I suppose if there ends up being a lot of uses of CMP+"bin" in other diagnostics, that could be a good time to introduce BIN. Thanks for the quick review! |
I have already worked through something similar as a warm up for trying to introduce a tag, so while I appreciate the offer, I think I can hammer that out quickly! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. I would consider moving the compute_zonal_avg
fcn to the field_utils.hpp
method. Feel free to disregard though.
|
||
class ZonalAvgDiag : public AtmosphereDiagnostic { | ||
|
||
// TODO: comment this, noting it's a utility function that could exist elsewhere |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You may consider moving this fcn into share/field/field_utils.hpp
. That's where we also have horiz_contraction
, and other utilities. Not sure if we will ever use it outside of this diag class, but since it has no dependency on the class internals (it's a static method), we may as well put it out there...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ended up change it to a local function in #7481 per the suggestions in a few spots of this PR (e.g. #7261 (comment))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two things (one easy to fix, the other speculative and more involved)
-
This is currently not requestable, right? How would one get the zonal average of T_mid? You will need to add the appropriate regex in components/eamxx/src/share/io/eamxx_io_utils.cpp
-
Is there urgency to make use of this diagnostic impl? If not, let's reconsider the overarching design. Obviously, calculating the zonal average on a structured lat–lon grid is much easier and a much more straightforward reduction. We have been thinking about adding support for structured grids for remapping purposes (e.g., see #7198). If there's no hurry to do this implementation, I'd rather us consider doing this via a remap to an appropriate (tbd, customizable) lat–lon grid, then doing a simple reduction of the resulting remapped field on the LON dim. Generally speaking, most people end up remapping the output to structured grids anyway, and we might as well support that. If we go this route, we will not need the bin dimensionality. Ultimately, while workable, I think a zonal average on unstructured (i.e., not lat–lat) data is a conceptually flawed reduction (imho).
Please fix number 1 and let us know what you think about number 2
I think remapping to a lat-lon grid, and then doing a reduction would be conceptually the same thing though. Not only that, it would require two rounds of MPI, one for the sparse mat-vec for pg2->latlon, and one to globally reduce the zonal band. |
@mahf708 , that is correct. I did add the regex in ed34800, but there is now a conflict with |
Kinda yes, I think we all prefer rebasing :) |
ed34800
to
8bd8be0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd still like to hear your thoughts about my point 2 above.
Also, to double check, have you tested how this looks like once outputted in a netcdf file to make sure you're happy with how it looks? No worries if not
[edit: undraft/draft to get the ci to run]
8bd8be0
to
773ff5b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you'd like to address the linter, feel free to do so, but if I were you, I would only address (i.e., i'd leave the older files for now)
components/eamxx/src/diagnostics/tests/zonal_avg_test.cpp
components/eamxx/src/diagnostics/zonal_avg.hpp
components/eamxx/src/diagnostics/zonal_avg.cpp
Also, would you be willing to rebase to make this into a few coherent commits?
components/eamxx/tests/multi-process/dynamics_physics/homme_shoc_cld_p3_rrtmgp_pg2/output.yaml
Outdated
Show resolved
Hide resolved
using ESU = ekat::ExeSpaceUtils<typename KT::ExeSpace>; | ||
switch(result_layout.rank()) | ||
{ | ||
case 1: { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think we should add support for rank-0 fields as well? If one wants to do a zonal avg of a horiz slice of a 2d quantity, it' be much better to do the slice first, then the zonal avg, since on a 2d field we actually would get coalesced access on GPU in this diagnostic (all other ranks suffer from bad GPU access, but there's not much to do about it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I got confused. The output field will ALWAYS have rank>=1, since it has the number of bins as a dimension. Hence, disregard my comment.
|
||
namespace scream { | ||
|
||
void ZonalAvgDiag::compute_zonal_sum(const Field &result, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this diagnostic could be made much more perf by pre-computing the list of cols that belong to each lat bin. Then, at runtime do
pfor(nbins) {
auto ibin = policy.league_rank
preduce(MDPolicy(bin_size,dim1,dim2,..),[&](int idx, idim, jdim,...,Real& accum){
auto icol = bin_cols(idx)
accum += f(icol,idim,jdim,...);
},bin(ibin,idim1,idim2,..));
}
Notice that this would maintain coalesced access on GPU, without adding a race condition in the output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As it is now, the non-coalesced access, combined with the recalculation of what's in each bin every time, probably gives mildly bad perf on GPU.
We can integrate as is, if you want, but we should make a priority to fix this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay. I'm unclear on how the parallel reduce can be over the non-column dimensions when there is a separate zonal average for each of the non-column dimensions. Should perhaps the pfor
in your suggestion use a MDPolicy
over the bins and non-column dimensions? Or maybe I'm missing something straightforward here?
Perhaps a less abstract version of the question, is that I dont see how the parallel reduce would be able to write to bin(ibin, idim, jdim,...)
when idim
and jdim
are local to the lambda function for the reduction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My snippet is wrong, since MDPolicy only works for outermost loops. We'd have to use a TeamVectorRange, and do some index arithmetics manually. But MDP vs TVR is not the point anyways.
The point was to have this pseudocode:
loop_over_num_bins
par_reduce (over bin_size)
icol = get-current-index-within-bin
pfor (over any remaining dim)
accum(<other-dims>) += f(icol,<other-dims>)
I am not entierly sure we can achieve this, since it would require accum to potentially be an Ndim array. But this 3-level hierarchy would give coalesced access on GPU. I have to check with kk devs if this can be achieved.
Regardless, I think it's fine to merge without worrying about this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, that makes sense. I appreciate the explanation.
I have been working on pre-computing which columns are in each zonal band (so each thread team only iterates across the columns in the respective zonal band). Assuming this is still worth adding to the PR (instead of the current approach of having each thread team iterate over all the columns) is there an appropriate data structure to store the column indices? I have it implemented as a simple vector of vectors... but I assume there's a better way and am still coming up to speed with memory management in Kokkos.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can ignore my last request: I'm now seeing a Field can have an int**
view... which seems to be the way to go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First, I think you can merge, and then improve. But of course, improving and then merging is perfectly fine.
As for the index indirection, you could use a 2d view along with a 1d small view:
bin_size(num_bins)
: a 1d view wherebin_size(ibin)
is the number of cols in binibin
.col_lids(num_bins,max_bin_sz)
: a 2d wherecol_lids(ibin,k)
is the local id of the k-th col in this bin.
This would require a 2 phases setup: a first pass to count the size of each bin (and create bin_size), and a second pass to get the lids of each col in each bin. This would have a bit more storage than a vector-of-vectors (or in this case a view-of-views), but considering all the other storage requirements in eamxx, it may not be too bad.
A view of views would also work though. We do use that in Hommexx's boundary exchange structures, so you can def go down that route. Just be sure to cleanup all the inner views when the diag is destroyed, as the outer view destructor does not properly destroy the inner ones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestions! For now, I'll address the linting requests and rebase requests so this can be merged, with the plan to add those improvements in the next PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took a pass at this in #7481. I'm open to any and all suggestions as I'm just getting my head around Kokkos in general.
…ieldLayout::add to avoid duplication of code between append and prepend
…iagnostic for EAMxx
…m_shoc_cld_p3_rrtgmp_pg2
@cjvogl, code doesn't compile; let us know if you'd like help fixing stuff (pretty ez to test locally or on pm-gpu if you'd like, see somewhere under https://docs.e3sm.org/E3SM/EAMxx/developer/dev_quickstart). Shouldn't be too hard to fix, but obvz can be pretty tedious if you don't have a setup working on a variety of machines, etc. otherwise, looks good 🙌 |
// - the first dimension for field, weight, and lat is for the columns (COL) | ||
// - the first dimension for result is for the zonal bins (CMP,"bin") | ||
// - field and result must be the same dimension, up to 3 | ||
static void compute_zonal_sum(const Field &result, const Field &field, const Field &weight, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need to be a method at all? Can it just be a local function in the cpp file and be done with it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in #7481
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to address the dim name issue that Naser identified. After that, I think this is good to be merged. We can address the issue of how to handle tags that don't have a fixed length associated with them at a later stage. I have to think about how to best do it.
346bff5
to
23a74df
Compare
|
@@ -140,7 +140,7 @@ create_diagnostic (const std::string& diag_field_name, | |||
std::regex vert_layer ("(z|geopotential|height)_(mid|int)$"); | |||
std::regex horiz_avg ("([A-Za-z0-9_]+)_horiz_avg$"); | |||
std::regex vert_contract ("([A-Za-z0-9_]+)_vert_(avg|sum)(_((dp|dz)_weighted))?$"); | |||
std::regex zonal_avg (R"(([A-Za-z0-9_]+)_zonal_avg_with_(\d+)_bins$)"); | |||
std::regex zonal_avg (R"(([A-Za-z0-9_]+)_zonal_avg(_with_(\d+)_bins)?$)"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any last-minute thoughts about the naming here? I am not sure how I feel about "with X bins"
Maybe X_zonal_avg_in_Y_lats?
Also, we don't make the "lat" available and instead we rely on users to figure it out. Tha'ts ok, but we should likely add a helper string in the output ... lemme know what you think and I will address it (here or later)
Edit: I think it may make sense to rename "binN" to "latN" and then we can endow this dim with its actual values (the index is there automatically). But can do later, not necessarily now...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would vote for "bandN" or "zoneN". "latN" seems to suggest a single latitude value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On a different note: why is _with_
part of the captured group? I don't think using a default number of bins is a great idea, since ppl may forget to set that. I'd rather error out right away, and have users fix the diag name to specify the number of bins.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you want to do a "default" zonal decomp, maybe we should use more "meaningful" (and not uniform) bounds, to get something like 2x polar zones, 2x temperate zones, and 1-2 tropical zones?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just followed the description of the PR. I am fine with forcing a no-default and erroring out. So, let's see what @cjvogl prefers and we can tweak it.
If you want to do a "default" zonal decomp, maybe we should use more "meaningful"
(and not uniform) bounds, to get something like 2x polar zones, 2x temperate zones, and 1-2 tropical zones?
can you elaborate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, doing 180 bins as default seems very arbitrary, and also maybe too little of a size for bin 1 and 180... I was thinking maybe something like
- polar N: [66,90]
- temperate N: [35,66.5]
- sub-tropical N: [23.5,35]
- tropical: [-23.5,23.5]
- sub-tropical S: [-35,-23.5]
- temperate S: [-66.5,-35]
- polar S: [-90,-66.5]
They are still somewhat arbitrary, but probably more linked to relevant regions of analysis?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's often the case people tend to look at zonal means like designed here, e.g., e3sm_diags zonal plots and zonal-annual plots. The default of 180 (one-lat-deg spacing) is pretty reasonable if we want to keep a default. However, I am fine with getting rid of the defaults, and forcing users to specify one.
What you're listing (polar, mid-lat, subtrop, trop, etc.) is a derivative of the more general zonal average stuff, but it is more tightly related to horiz avg. I would consider adding those regions as special cases of horiz_avg instead of here...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would vote for "bandN" or "zoneN". "latN" seems to suggest a single latitude value.
The motivation behind bin is that one can think of a zonal average as a histogram, and thus a name like "bin" could be reused if someone wants to have a histogram in a different dimension than along latitude. As a concrete example, we intend to introduce a diagnostic where one can output a histogram in the vertical direction to reduce the output size from a high resolution run.
The default of 180 (one-lat-deg spacing) is pretty reasonable if we want to keep a default.
The one-lat-deg spacing was the original idea, but I was not beholden to it (or any default).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense. But the [-90,-89) band (e.g.) will have much fewer cols than the [-1,0) band, so is the histogram meaningful there? But that's beyond the scope, i am ok with bin as a name. I will integrate this as soon as testing is done.
…7261) Adds a zonal average average diagnostic (zonal_avg), where the given field is averaged in the zonal direction over uniform latitude ranges. The user can specify how many ranges to use via the num_lat_vals parameter. The diagnostic uses the CMP tag with name "bin", so I/O is updated to specifying dimension name binN where N is the number of bins. [BFB]
adds a zonal average average diagnostic (
zonal_avg
), where the given field is averaged in the zonal direction over uniform latitude ranges. The user can specify how many ranges to use via thenum_lat_vals
parameter.The diagnostic uses the CMP tag with name "bin", so I/O is updated to specifying dimension name
binN
whereN
is the number of bins.[BFB]