-
Notifications
You must be signed in to change notification settings - Fork 4
DAS-2278 - Handle spatial subsetting in products with all fills in lat/lon coordinates #26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Joey, here's a bunch of comments, I think you're on the right track and like a lot of this. feel free to respond inline or reach out and we can talk through anything. I'll approve the analysis ticket now. and leave this as a comment on the draft pr
hoss/coordinate_utilities.py
Outdated
column_dimensions = [ | ||
col_row_to_xy(geotranform, i, 0) for i in range(lat_arr.shape[1]) | ||
] | ||
row_dimensions = [col_row_to_xy(geotranform, 0, i) for i in range(lat_arr.shape[0])] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably a massive nit. but can you use different temp vars in your comprehensions? I would use i, and j and follow the standard conventions, or I'd probably just use row and col.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should go look and see if this was in my code too :awkwardsockmonkey:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also just so I understand, this routine doesn't need lat_arr, an array of the latitudes, it just needs the shape of that variable? is there a way to get that without reading the whole array?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There isn't a current way I'm aware of to get the shape of the variable without reading the whole array. DAS-2287 is addressing this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to use 'row' and 'col' in 7d5e34c
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good
hoss/coordinate_utilities.py
Outdated
# pull out dimension values | ||
x_values = np.array([x for x, y in column_dimensions], dtype=np.float64) | ||
y_values = np.array([y for x, y in row_dimensions], dtype=np.float64) | ||
projected_y, projected_x = tuple(projected_dimension_names) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing you copied this from the other code, but why not use * notation?
of if you know that projected_dimension_names is alway 2 values, just do direct unpacking
projected_y, projected_x = projected_dimension_names
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, updated in 7d5e34c
hoss/hoss_config.json
Outdated
"Mission": "SMAP", | ||
"ShortNamePath": "SPL3FT(P|P_E)", | ||
"ShortNamePath": "SPL3FTP", | ||
"VariablePattern": "(?i).*polar.*" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may need to be
"VariablePattern": "(?i).*polar.*" | |
"VariablePattern": "(?i).*Polar.*" |
Well, I look dumb, the (?i) is the case insensitive flag..
That aside, if this is only for SPL3FTP, the group is defined as Freeze_Thaw_Retrieval_Data_Polar
and I'm assuming (from looking at the file) the variables don't have the name polar except in their fully qualified path.
e.g.
Variable full name: Freeze_Thaw_Retrieval_Data_Polar/open_water_body_fraction
So why not just be explicit and make that
"VariablePattern": "Freeze_Thaw_Retrieval_Data_Polar.*"
Would that work and be clearer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree this would make it more clear. I've updated this in 7d5e34c. A leading slash is required so I decided to use:
"VariablePattern": "/Freeze_Thaw_Retrieval_Data_Polar/.*"
hoss/hoss_config.json
Outdated
"Applicability": { | ||
"Mission": "SMAP", | ||
"ShortNamePath": "SPL3FTP_E", | ||
"VariablePattern": "(?i).*polar.*" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment as before
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated in 7d5e34c
|
||
if cf_attributes: | ||
crs = CRS.from_cf(cf_attributes) | ||
return cf_attributes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was looking at this and going to complain that the complexity is a little out of hand, but I see that none of it was you... So this seem fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a thought for optimization if grid_mapping_variable.attributes and varinfo.get_missing_variable_attributes() always return value
if grid_mapping_variable is not None:
return grid_mapping_variable.attributes
check for any overrides
return varinfo.get_missing_variable_attributes(grid_mapping)
Remove the if cf_attributes:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine now.
hoss/projection_utilities.py
Outdated
def get_variable_crs(cf_attributes: str) -> CRS: | ||
"""Create a `pyproj.CRS` object from the grid mapping variable metadata | ||
attributes. | ||
""" | ||
return CRS.from_cf(cf_attributes) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if instead, you left this signature as it was, and moved a call to get_grid_mapping_attributes into here, then you wouldn't need to make two calls below and you've still exposed a function to get the cf_attributes.
And then actually, you could have another get_master_geotransform() function you could use down near L259 where you are switching how you get your dimension arrays.
let me know if that makes sense or sounds stupid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated in 1b23ebd
hoss/spatial.py
Outdated
crs, | ||
projected_dimension_names, | ||
) | ||
if 'master_geotransform' in grid_mapping_attributes: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we talked in a huddle about this. I'm seeing that it's used specifically for getting ranges, and this is sorta at the level of thought process in the function, so I don't think it buys you anything to bury this in another function just to hide this if statement.
I do still think you might have a dedicated get_master_geotransform() function that would hide all calls to get_grid_mapping_attributes from this level of abstraction (like I mentioned above) Also why not primary_geotransform
instead of master_geotransform
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to implement a dedicated get_master_geotransform() function as you suggest. As for master_geotransform
vs primary_geotransform
I went with the naming suggested by @D-Auty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI - "master geotransform" here refers to the notion that the geotransform for a "whole earth" grid - e.g., what is defined in the GPD files, and for the EASE-GRID standard grids. A geotransform as a granule attribute by itself could be considered redundant with the dimension variables - which provide the "coordinate" in meters for the data arrays, but is specific to the granule and the array sizes. A "master geotransform" would be a collection level attribute, applicable across many granules of different sizes (e.g. tiles) and likely, many collections even. Hopefully the reference of master geotransform avoids the confusion with the specific extents of the granule itself. It seemed that master was a better reference than primary in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated in 1b23ebd
"Applicability": { | ||
"Mission": "ICESat2", | ||
"ShortNamePath": "ATL0[3-9]|ATL1[023]", | ||
"ShortNamePath": "ATL03", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for making this change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the updates, Joey.
Description
These changes provides an alternative way to create the dimension scales that are used for spatial subsetting. For SMAP L3 Polar grid variables in (SPL3SMP_E, SPL3FTP, SPL3FTP_E, SPL3FTA) where creating dimension scales from its associated coordinate variables was currently failing, a "master geotransform" attribute has been added to its grid mapping configuration entry. Function changes were made to use this attribute when it is available to create the dimension scales.
Jira Issue ID
DAS-2278
Local Test Steps
Checkout this branch, build the docker images and run the test image.
Ensure the tests pass
Follow the instructions in the associated das-collections-harmony-testing PR and ensure the notebooks succeed.
PR Acceptance Checklist
CHANGELOG.md
updated to include high level summary of PR changes.docker/service_version.txt
updated if publishing a release.