Skip to content

Conversation

mahf708
Copy link
Contributor

@mahf708 mahf708 commented Aug 4, 2025

add some binary arithmetics for EAMxx diags

[bfb]

@mahf708 mahf708 requested a review from bartgol August 4, 2025 15:01
@mahf708 mahf708 added the EAMxx Issues related to EAMxx label Aug 4, 2025
Copy link
Contributor

@bartgol bartgol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is needed soon-ish, then this is ok. I think long term we may want (or maybe not, up for debate) a more broad/robust approach. In particular, I think people may want to compute semi-arbitrary expressions, in which case we may replace this with a more generic "expression-parser" kind of diagnostic (perhaps incorporating some existing TPL).

I also recommend that this be merged after the output field aliasing PR, so we can rename fields in the nc file and avoid +-*/ symbols in var names.

Copy link
Contributor

@bartgol bartgol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is needed soon-ish, then this is ok. I think long term we may want (or maybe not, up for debate) a more broad/robust approach. In particular, I think people may want to compute semi-arbitrary expressions, in which case we may replace this with a more generic "expression-parser" kind of diagnostic (perhaps incorporating some existing TPL).

I also recommend that this be merged after the output field aliasing PR, so we can rename fields in the nc file and avoid +-*/ symbols in var names.

@mahf708 mahf708 force-pushed the mahf708/eamxx/some-binary-arithmetic-diags branch from 70f9b6d to 644ffe4 Compare August 5, 2025 14:26
@mahf708 mahf708 added the BFB PR leaves answers BFB label Aug 5, 2025
Copy link

github-actions bot commented Aug 5, 2025

PR Preview Action v1.6.2

🚀 View preview at
https://E3SM-Project.github.io/E3SM/pr-preview/pr-7573/

Built to branch gh-pages at 2025-08-14 16:08 UTC.
Preview will be ready when the GitHub Pages deployment is complete.

@mahf708 mahf708 force-pushed the mahf708/eamxx/some-binary-arithmetic-diags branch 2 times, most recently from 531025f to 0d02d71 Compare August 11, 2025 17:54

The resulting diagnostic field will have units determined by the operation.

## Configuration
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that it is available, should we remind the user about aliasing, and "encourage" to use it for binary-op diags?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, I would keep the docs for each feature separate. I will write an overarching docs page for tips to use them together, but I think each page should be isolated to what it is offering as a standalone feature

@mahf708 mahf708 force-pushed the mahf708/eamxx/some-binary-arithmetic-diags branch from 0d02d71 to 516b710 Compare August 11, 2025 18:53
std::regex field_at_l (R"(([A-Za-z0-9_.+-]+)_at_(lev_(\d+)|model_(top|bot))$)");
std::regex field_at_p (R"(([A-Za-z0-9_.+-]+)_at_(\d+(\.\d+)?)(hPa|mb|Pa)$)");
std::regex field_at_h (R"(([A-Za-z0-9_.+-]+)_at_(\d+(\.\d+)?)(m)_above_(sealevel|surface)$)");
// Start with a generic for a field name allowing for all letters, all numbers, dash, dot, plus, minus, product, and division
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bartgol pls re-review this file

// Start with a generic for a field name allowing for all letters, all numbers, dash, dot, plus, minus, product, and division
// Escaping all the special ones just in case
std::string generic_field = "([A-Za-z0-9_.+\\-\\*\\÷]+)";
std::regex field_at_l (R"()" + generic_field + R"(_at_(lev_(\d+)|model_(top|bot))$)");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the leading "()" for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To ensure the std::string gets converted to the proper stuff? Would it work without it? I didn't test ... that's why I wanted you to review this before merging 😅

Copy link
Contributor

@bartgol bartgol Aug 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::regex constructor takes a string, and generic_field is already a std::string (otherwise the + operator would not work). The reason why we need the R in other strings is to prevent stuff like \n to be escaped. And speaking of, you can just do

std::string generic_field = "([A-Za-z0-9_.+-*÷]+)"; 

no need to escape characaters inside []. And since you don't have fancy chars, no need for R. Btw, why ÷ instead of /? It's not a standard char, so maybe not what ppl would use...

Copy link
Contributor Author

@mahf708 mahf708 Aug 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Not escaping * and/or ÷ (or maybe it was having stuff after un-escaped -) caused issues in other tests (io_remap, io_diags, etc.). I can bring it back to see the errors if you want...
  • / is not supported in netcdf writing; ÷ is

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uhm, ÷ is not ASCII though, which may be undesirable...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should replace symbols with "words" in var names. E.g.,

  • yaml request f_1+f_2 saved as f_1_plus_f_2
  • yaml request f_1-f_2 saved as f_1_minus_f_2
  • yaml request f_1*f_2 saved as f_1_times_f_2
  • yaml request f_1/f_2 saved as f_1_over_f_2

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • yaml request f_1+f_2 saved as f_1_plus_f_2
  • yaml request f_1-f_2 saved as f_1_minus_f_2
  • yaml request f_1*f_2 saved as f_1_times_f_2
  • yaml request f_1/f_2 saved as f_1_over_f_2

I am not a fan of changing what the request is.

I am okay with f_1_plus_f_2, etc. --- should I just switch that? I don't wanna overcomplicate stuff, so we should just pick one or another. The plus, minus, times, over terminology is ok. We can also do, addop, subop, prodop, divop etc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you want the name in the yaml to match the name in the nc file? That's reasonable. I don't have a strong pref between addop and plus, so you can pick there. But I think sticking with words may be best long term (I am already fearing some postproc script choking on a + or -...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, unlesss someone asks for the alias feature (also remember, in the aliasing feature, we document the name in the attrs). Already, will ping you on slack for one last call on plus vs addop

## Caveats
- Strictly speaking, multiple operations will take place in order.
Copy link
Contributor

@AaronDonahue AaronDonahue Aug 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we note that we don't support parentheses? So no _(_qc_+_qv_)_*_p_mid

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Second question, do we have a test for this case? I may have missed it, but I think the test just does binary ops for 2 vars. It seems like 3 vars is a good edge case, and sense you mention it as an example it would be good to test it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, both are good suggestions. I can include them or in a follow-up PR (that will combine docs for all features).

I will also need to clarify very clearly the order of ops by giving a few examples...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AaronDonahue, I will add the composability test later. The composability comes from the parsing (in eamxx IO utils), not the diag impl itself. This PR is mostly about the diag impl. I added some notes, but I will strengthen the testing later.

mathematical operation precedence; it is simply about the parser
which processes the expression left-to-right.)
- In fact, `p_mid_times_qc_plus_qv` will fail because of units!
- We only support existing fields in the Field Manager, e.g.,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe

We only support existing fields in the Field Manager (and other diagnostics). e.g. ...

Copy link
Contributor

@bartgol bartgol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to go. I had a docs comment, but I think you're going to work on a diags (docs) overhaul, so it can wait.

A thought: it'd be nice if we allowed one of the two operand to be a known constant. E.g., one could request f1_times_rho_w or something like that. But we can think about this later.

Copy link
Contributor

@AaronDonahue AaronDonahue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

bartgol added a commit that referenced this pull request Aug 14, 2025
#7573)

Add some binary arithmetics for EAMxx diags

[BFB]
@bartgol bartgol merged commit 67e4d35 into master Aug 14, 2025
20 checks passed
@bartgol bartgol deleted the mahf708/eamxx/some-binary-arithmetic-diags branch August 14, 2025 22:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
BFB PR leaves answers BFB EAMxx Issues related to EAMxx
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants