Skip to content

Conversation

w-k-jones
Copy link
Member

@w-k-jones w-k-jones commented Mar 26, 2025

This PR has expanded to cover a number of remaining issues with iris/xarray compatibility in RC_v1.6.0:

Datetime conversion:

Feature times are always converted to cftime if not using min_distance parameter with 2d data, and add_coordinates is called multiple times. This was likely caused by a merge conflict, when add_coordinates was moved to before distance filtering the preserve_iris_datetime_types parameter was missing.

https://github.com/tobac-project/tobac/blame/f05da214511da70e0ee54dcdec892d08f7f8f6be/tobac/feature_detection.py#L1423

This PR fixes this bug, removes the duplicate add_coordinates call and adds test for various datetime outputs.

I also fixed test_feature_detection_coords, as it turned out that this wasn't actually testing anything as the input dataset had no coords...

Coordinate naming:

Implements the previously unused use_standard_name parameter for coordinate interpolation in feature detection. Now the default behaviour is to use standard names for iris cube input and variable names for xarray dataarrays, but either can be chosen by the user by setting use_standard_name to True or False.

This resulted in a couple of issues, as a number of utils/analysis functions in tobac currently expect dataframe columns to always have standard names (e.g. projection_x_coordinate). To avoid this, I added new functionality to find horizontal coordinate names from dataframes in the same manner as the existing find_dataframe_vertical_coord

Internal utils

At some point during RC_v1.6.0, tobac.utils.general_utils was added to the branch along with tobac.utils.basic. These contained both duplicate functions to each other, as well as duplicate, outdated versions of functions now in other modules. I have resolved this issue, and also split basic into two modules coordinates and label_props to clarify what these module are used for

  • Have you followed our guidelines in CONTRIBUTING.md?
  • Have you self-reviewed your code and corrected any misspellings?
  • Have you written documentation that is easy to understand?
  • Have you written descriptive commit messages?
  • Have you added NumPy docstrings for newly added functions?
  • Have you formatted your code using black?
  • If you have introduced a new functionality, have you added adequate unit tests?
  • Have all tests passed in your local clone?
  • If you have introduced a new functionality, have you added an example notebook?
  • Have you kept your pull request small and limited so that it is easy to review?
  • Have the newest changes from this branch been merged?

@w-k-jones w-k-jones added bug Code that is failing or producing the wrong result Small Change A minor change which should be quick to address and review xarray transition Part of the transition to xarray support labels Mar 26, 2025
@w-k-jones w-k-jones added this to the Version 1.6 milestone Mar 26, 2025
@w-k-jones w-k-jones self-assigned this Mar 26, 2025
Copy link

codecov bot commented Mar 26, 2025

Codecov Report

Attention: Patch coverage is 99.54955% with 1 line in your changes missing coverage. Please review.

Project coverage is 63.52%. Comparing base (f05da21) to head (d231cb3).

Files with missing lines Patch % Lines
tobac/utils/internal/coordinates.py 98.48% 1 Missing ⚠️
Additional details and impacted files
@@              Coverage Diff              @@
##           RC_v1.6.0     #489      +/-   ##
=============================================
+ Coverage      59.14%   63.52%   +4.38%     
=============================================
  Files             25       27       +2     
  Lines           3892     3836      -56     
=============================================
+ Hits            2302     2437     +135     
+ Misses          1590     1399     -191     
Flag Coverage Δ
unittests 63.52% <99.54%> (+4.38%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link

github-actions bot commented Mar 26, 2025

Linting results by Pylint:

Your code has been rated at 8.21/10 (previous run: 8.21/10, +0.00)
The linting score is an indicator that reflects how well your code version follows Pylint’s coding standards and quality metrics with respect to the RC_v1.6.0 branch.
A decrease usually indicates your new code does not fully meet style guidelines or has potential errors.

@w-k-jones
Copy link
Member Author

w-k-jones commented Mar 26, 2025

@freemansw1 @JuliaKukulies @kelcyno This fixes the bug with datetime conversions in feature detection that I mentioned earlier, hopefully it should be a quick change to review :)

edit: Fixing the datetime issues found in #490 ended up being a bigger job. I've created two new utils submodules:

  • datetime handles datetime conversion functions used to transform between numpy, pydatetime, pandas and cftime datetimes which should make these issues easier to handle in future
  • generators includes (for now) a single generator used to iterate through the time slices of a dataarray and find the matching features. It replaces the two different implementations previously in segmentation and bulk statistics, and should be more resilient to different datetime formats

In addition, datetime conversions are fully handled in the iris to xarray conversion decorator now, rather than being sorted by the coordinate interpolation function, which helps ensure that time coordinates remain in the same format as the input

@w-k-jones w-k-jones removed the Small Change A minor change which should be quick to address and review label Mar 28, 2025
…and set the default to be True for iris and False for xarray
@w-k-jones w-k-jones marked this pull request as ready for review March 30, 2025 17:12
@w-k-jones
Copy link
Member Author

Right, I think all the issues that have cropped up have been resolved and that this is now ready for review. Hopefully this should make the v1.6 release more stable and make future development easier

Copy link
Member

@freemansw1 freemansw1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fantastic work, @w-k-jones. Honestly, that datetime utility would be extremely useful for the broader community, not even just tobac internally.

… and hdim_2 had to be provided even if only calculating PBCs over one dimension
@w-k-jones w-k-jones mentioned this pull request Apr 7, 2025
11 tasks
@w-k-jones w-k-jones requested a review from fsenf April 11, 2025 14:29
@freemansw1
Copy link
Member

Happy with the new changes!

@fsenf
Copy link
Member

fsenf commented Apr 15, 2025

@w-k-jones : Great development!

From the first impression I noticed that some docstrings are missing:

  1. could you add short descriptive texts to your new tests consistent with the other tests.
  2. complete numpydoc docstrings are needed in the datetime.py module.

Could you update everything accordingly? Thanks!

@w-k-jones
Copy link
Member Author

Thanks for flagging this @fsenf , I'll work on fixing those asap

@fsenf
Copy link
Member

fsenf commented Apr 17, 2025

@w-k-jones : Do you need support for finishing the docstrings?

@w-k-jones
Copy link
Member Author

@fsenf I should have addressed your points now

Copy link
Member

@fsenf fsenf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great achievement! I am happy with the PR!

@w-k-jones : Please go ahead and merge.

@w-k-jones w-k-jones merged commit 4de8d3f into tobac-project:RC_v1.6.0 Apr 22, 2025
24 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Code that is failing or producing the wrong result xarray transition Part of the transition to xarray support

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants