-
Notifications
You must be signed in to change notification settings - Fork 435
EAMxx: add binary ops diags #7573
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is needed soon-ish, then this is ok. I think long term we may want (or maybe not, up for debate) a more broad/robust approach. In particular, I think people may want to compute semi-arbitrary expressions, in which case we may replace this with a more generic "expression-parser" kind of diagnostic (perhaps incorporating some existing TPL).
I also recommend that this be merged after the output field aliasing PR, so we can rename fields in the nc file and avoid +-*/ symbols in var names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is needed soon-ish, then this is ok. I think long term we may want (or maybe not, up for debate) a more broad/robust approach. In particular, I think people may want to compute semi-arbitrary expressions, in which case we may replace this with a more generic "expression-parser" kind of diagnostic (perhaps incorporating some existing TPL).
I also recommend that this be merged after the output field aliasing PR, so we can rename fields in the nc file and avoid +-*/ symbols in var names.
70f9b6d
to
644ffe4
Compare
|
531025f
to
0d02d71
Compare
|
||
The resulting diagnostic field will have units determined by the operation. | ||
|
||
## Configuration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that it is available, should we remind the user about aliasing, and "encourage" to use it for binary-op diags?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, I would keep the docs for each feature separate. I will write an overarching docs page for tips to use them together, but I think each page should be isolated to what it is offering as a standalone feature
0d02d71
to
516b710
Compare
std::regex field_at_l (R"(([A-Za-z0-9_.+-]+)_at_(lev_(\d+)|model_(top|bot))$)"); | ||
std::regex field_at_p (R"(([A-Za-z0-9_.+-]+)_at_(\d+(\.\d+)?)(hPa|mb|Pa)$)"); | ||
std::regex field_at_h (R"(([A-Za-z0-9_.+-]+)_at_(\d+(\.\d+)?)(m)_above_(sealevel|surface)$)"); | ||
// Start with a generic for a field name allowing for all letters, all numbers, dash, dot, plus, minus, product, and division |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bartgol pls re-review this file
// Start with a generic for a field name allowing for all letters, all numbers, dash, dot, plus, minus, product, and division | ||
// Escaping all the special ones just in case | ||
std::string generic_field = "([A-Za-z0-9_.+\\-\\*\\÷]+)"; | ||
std::regex field_at_l (R"()" + generic_field + R"(_at_(lev_(\d+)|model_(top|bot))$)"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the leading "()" for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To ensure the std::string gets converted to the proper stuff? Would it work without it? I didn't test ... that's why I wanted you to review this before merging 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::regex constructor takes a string, and generic_field is already a std::string (otherwise the + operator would not work). The reason why we need the R in other strings is to prevent stuff like \n
to be escaped. And speaking of, you can just do
std::string generic_field = "([A-Za-z0-9_.+-*÷]+)";
no need to escape characaters inside []
. And since you don't have fancy chars, no need for R. Btw, why ÷
instead of /
? It's not a standard char, so maybe not what ppl would use...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Not escaping
*
and/or÷
(or maybe it was having stuff after un-escaped-
) caused issues in other tests (io_remap, io_diags, etc.). I can bring it back to see the errors if you want... /
is not supported in netcdf writing;÷
is
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uhm, ÷
is not ASCII though, which may be undesirable...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should replace symbols with "words" in var names. E.g.,
- yaml request
f_1+f_2
saved asf_1_plus_f_2
- yaml request
f_1-f_2
saved asf_1_minus_f_2
- yaml request
f_1*f_2
saved asf_1_times_f_2
- yaml request
f_1/f_2
saved asf_1_over_f_2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- yaml request
f_1+f_2
saved asf_1_plus_f_2
- yaml request
f_1-f_2
saved asf_1_minus_f_2
- yaml request
f_1*f_2
saved asf_1_times_f_2
- yaml request
f_1/f_2
saved asf_1_over_f_2
I am not a fan of changing what the request is.
I am okay with f_1_plus_f_2
, etc. --- should I just switch that? I don't wanna overcomplicate stuff, so we should just pick one or another. The plus, minus, times, over terminology is ok. We can also do, addop, subop, prodop, divop etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you want the name in the yaml to match the name in the nc file? That's reasonable. I don't have a strong pref between addop and plus, so you can pick there. But I think sticking with words may be best long term (I am already fearing some postproc script choking on a + or -...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, unlesss someone asks for the alias feature (also remember, in the aliasing feature, we document the name in the attrs). Already, will ping you on slack for one last call on plus vs addop
## Caveats | ||
- Strictly speaking, multiple operations will take place in order. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we note that we don't support parentheses? So no _(_qc_+_qv_)_*_p_mid
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Second question, do we have a test for this case? I may have missed it, but I think the test just does binary ops for 2 vars. It seems like 3 vars is a good edge case, and sense you mention it as an example it would be good to test it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, both are good suggestions. I can include them or in a follow-up PR (that will combine docs for all features).
I will also need to clarify very clearly the order of ops by giving a few examples...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AaronDonahue, I will add the composability test later. The composability comes from the parsing (in eamxx IO utils), not the diag impl itself. This PR is mostly about the diag impl. I added some notes, but I will strengthen the testing later.
mathematical operation precedence; it is simply about the parser | ||
which processes the expression left-to-right.) | ||
- In fact, `p_mid_times_qc_plus_qv` will fail because of units! | ||
- We only support existing fields in the Field Manager, e.g., |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe
We only support existing fields in the Field Manager (and other diagnostics). e.g. ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to go. I had a docs comment, but I think you're going to work on a diags (docs) overhaul, so it can wait.
A thought: it'd be nice if we allowed one of the two operand to be a known constant. E.g., one could request f1_times_rho_w
or something like that. But we can think about this later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great!
#7573) Add some binary arithmetics for EAMxx diags [BFB]
add some binary arithmetics for EAMxx diags
[bfb]