Use pooch #513

Merged
merged 33 commits into from
Jul 15, 2025
33 commits
fdef8e9
adopt new test data management system, name helper function yangtze, …
Zeitsperre Jun 9, 2025
7677b95
add testing data registry, move common testing functions to testing.h…
Zeitsperre Jun 9, 2025
7cc651e
update notebook testdata fetching mechanism
Zeitsperre Jun 9, 2025
efbeb53
WIP - adjust tests to new fetch mechanics, no more tests package
Zeitsperre Jun 9, 2025
69ac067
test adjustments, add pooch
Zeitsperre Jun 10, 2025
677bcf7
update pre-commit, fix pydantic regression
Zeitsperre Jun 10, 2025
f1c3886
rebuild dataclass
Zeitsperre Jun 10, 2025
a5f0954
try adding caching
Zeitsperre Jun 10, 2025
1d726d3
use proper location
Zeitsperre Jun 10, 2025
c28490c
add retry logic
Zeitsperre Jun 10, 2025
cc7c156
separate cache folders
Zeitsperre Jun 10, 2025
cb51dad
explicit caching
Zeitsperre Jun 10, 2025
ca035f2
add testdata-version.yml checker
Zeitsperre Jun 10, 2025
7c1fef8
prefetch testing data, better caching
Zeitsperre Jun 11, 2025
6298228
move caching to tox
Zeitsperre Jun 11, 2025
ea627b2
Merge branch 'master' into use-pooch
Zeitsperre Jun 11, 2025
0ee4add
add prefetch
Zeitsperre Jun 11, 2025
7dfcf43
avoid race condition when downloading registry.txt
Zeitsperre Jun 12, 2025
0233553
handle case where raven binary is installed via conda and test is run…
Zeitsperre Jun 12, 2025
f9bd93b
audit coveralls connection
Zeitsperre Jun 12, 2025
1688545
support force_download option and use locking more
Zeitsperre Jun 12, 2025
6be0289
quotation marks
Zeitsperre Jun 12, 2025
9c064ab
typo fix
Zeitsperre Jun 12, 2025
0d33c72
egress policy audit
Zeitsperre Jun 12, 2025
fee9fb8
update CHANGELOG.rst, fix some typos and remove unneeded imports
Zeitsperre Jun 12, 2025
3e6d96d
use new raven-testdata tag, adjust registry.txt, adjust test expecata…
Zeitsperre Jun 12, 2025
9cc2b6c
update notebooks to be more user friendly, add missing registry entri…
Zeitsperre Jun 12, 2025
934a5b0
use regex, hide yangtze_kwargs
Zeitsperre Jun 13, 2025
2ba6afd
move convert functions, fix test
Zeitsperre Jun 13, 2025
55f0e9e
use get_file
Zeitsperre Jun 23, 2025
9b62782
Merge branch 'master' into use-pooch
Zeitsperre Jun 23, 2025
e383cf7
Merge branch 'master' into use-pooch
Zeitsperre Jul 2, 2025
dc35082
Merge branch 'master' into use-pooch
Zeitsperre Jul 8, 2025
1 change: 1 addition & 0 deletions .flake8
@@ -17,6 +17,7 @@ ignore =
D,
E,
F,
RST210,
W503
per-file-ignores =
rst-roles =
83 changes: 45 additions & 38 deletions .github/workflows/main.yml
@@ -7,7 +7,7 @@ on:
pull_request:

env:
RAVEN_TESTING_DATA_BRANCH: master
RAVEN_TESTDATA_BRANCH: v2025.6.12

concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
@@ -70,27 +70,12 @@ jobs:
- name: Harden Runner
uses: step-security/harden-runner@6c439dc8bdf85cadbbce9ed30d1c7b959517bc49 # v2.12.2
with:
egress-policy: block
allowed-endpoints: >
api.github.com:443
azure.archive.ubuntu.com:80
coveralls.io:443
esm.ubuntu.com:443
files.pythonhosted.org:443
github.com:443
motd.ubuntu.com:443
objects.githubusercontent.com:443
packages.microsoft.com:443
pavics.ouranos.ca:443
pypi.org:443
raw.githubusercontent.com:443
test.opendap.org:80

disable-sudo: false
egress-policy: audit
- name: Checkout Repository
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
with:
persist-credentials: false

- name: Set up Python${{ matrix.python-version }}
uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
with:
@@ -117,14 +102,32 @@ jobs:
- name: Install CI libraries
run: |
python3 -m pip install --require-hashes -r CI/requirements_ci.txt

- name: Environment caching (macOS)
if: matrix.os == 'macos-latest'
uses: actions/cache@5a3ec84eff668545956fd18022155c47e93e2684 # v4.2.3
with:
path: |
.tox
~/Library/Caches/raven-testdata
key: ${{ hashFiles('src/ravenpy/testing/registry.txt') }}-${{ env.RAVEN_TESTDATA_BRANCH }}-${{ matrix.os }}
- name: Environment caching (Ubuntu)
if: matrix.os == 'ubuntu-latest'
uses: actions/cache@5a3ec84eff668545956fd18022155c47e93e2684 # v4.2.3
with:
path: |
.tox
~/.cache/raven-testdata
key: ${{ hashFiles('src/ravenpy/testing/registry.txt') }}-${{ env.RAVEN_TESTDATA_BRANCH }}-${{ matrix.os }}

- name: Test with tox and report coverage
run: |
if [ "${{ matrix.tox-env }}" != "false" ]; then
python3 -m tox -e ${{ matrix.tox-env }}
python3 -m tox -e ${{ matrix.tox-env }}-prefetch
elif [ "${{ matrix.python-version }}" != "3.13" ]; then
python3 -m tox -e py${{ matrix.python-version }}-coverage
python3 -m tox -e py${{ matrix.python-version }}-prefetch-coverage
else
python3 -m tox -e py${{ matrix.python-version }}
python3 -m tox -e py${{ matrix.python-version }}-prefetch
fi
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
@@ -151,18 +154,7 @@ jobs:
uses: step-security/harden-runner@6c439dc8bdf85cadbbce9ed30d1c7b959517bc49 # v2.12.2
with:
disable-sudo: true
egress-policy: block
allowed-endpoints: >
api.github.com:443
conda.anaconda.org:443
coveralls.io:443
files.pythonhosted.org:443
github.com:443
objects.githubusercontent.com:443
pavics.ouranos.ca:443
pypi.org:443
raw.githubusercontent.com:443
test.opendap.org:80
egress-policy: audit
- name: Checkout Repository
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
with:
@@ -187,6 +179,25 @@ jobs:
run: |
micromamba list
python -m pip check || true
- name: Cache test data (macOS)
if: matrix.os == 'macos-latest'
uses: actions/cache@5a3ec84eff668545956fd18022155c47e93e2684 # v4.2.3
with:
path: |
~/Library/Caches/raven-testdata
key: ${{ hashFiles('src/ravenpy/testing/registry.txt') }}-${{ env.RAVEN_TESTDATA_BRANCH }}-conda-${{ matrix.os }}
- name: Cache test data (Ubuntu)
if: matrix.os == 'ubuntu-latest'
uses: actions/cache@5a3ec84eff668545956fd18022155c47e93e2684 # v4.2.3
with:
path: |
~/.cache/raven-testdata
key: ${{ hashFiles('src/ravenpy/testing/registry.txt') }}-${{ env.RAVEN_TESTDATA_BRANCH }}-conda-${{ matrix.os }}

- name: Prefetch RavenPy test data
run: |
python -c "import ravenpy.testing.utils as rtu; rtu.populate_testing_data()"

- name: Test RavenPy
run: |
python -m pytest --numprocesses=logical --cov=src/ravenpy --cov-report=lcov
@@ -207,11 +218,7 @@ jobs:
uses: step-security/harden-runner@6c439dc8bdf85cadbbce9ed30d1c7b959517bc49 # v2.12.2
with:
disable-sudo: true
egress-policy: block
allowed-endpoints: >
coveralls.io:443
github.com:443
objects.githubusercontent.com:443
egress-policy: audit
- name: Coveralls Finished
uses: coverallsapp/github-action@648a8eb78e6d50909eff900e4ec85cab4524a45b # v2.3.6
with:
Expand Down
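The caching steps in this workflow key the cache on three things: a hash of `src/ravenpy/testing/registry.txt`, the `RAVEN_TESTDATA_BRANCH` tag, and the runner OS. A change to any of them should produce a new key and therefore a cache miss. The sketch below illustrates that idea only; the function name `cache_key` is hypothetical, and it assumes a plain SHA-256 of the registry text, which will not reproduce the digest that GitHub's `hashFiles` computes.

```python
import hashlib


def cache_key(registry_text: str, testdata_branch: str, runner_os: str) -> str:
    """Combine a registry digest, the test-data tag, and the OS into one key.

    Hypothetical helper: this only mirrors the *shape* of the workflow key,
    not the exact value produced by GitHub Actions.
    """
    digest = hashlib.sha256(registry_text.encode("utf-8")).hexdigest()
    return f"{digest}-{testdata_branch}-{runner_os}"


if __name__ == "__main__":
    registry = "example.nc 0123abcd\n"  # placeholder registry content
    before = cache_key(registry, "v2025.6.12", "ubuntu-latest")
    after = cache_key(registry + "extra.nc 4567ef01\n", "v2025.6.12", "ubuntu-latest")
    # Adding a registry entry changes the digest, so the two keys differ.
    print(before != after)
```

The conda job in the diff adds a `-conda-` segment to the same pattern, which keeps its cache separate from the tox job's cache.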
95 changes: 95 additions & 0 deletions .github/workflows/testdata-version.yml
@@ -0,0 +1,95 @@
name: Verify Testing Data

on:
pull_request:
types:
- opened
- reopened
- synchronize
paths:
- .github/workflows/main.yml

permissions:
contents: read

jobs:
use-latest-tag:
name: Check Latest raven-testdata Tag
runs-on: ubuntu-latest
if: |
(github.event.pull_request.head.repo.full_name == github.event.pull_request.base.repo.full_name)
permissions:
pull-requests: write
steps:
- name: Harden Runner
uses: step-security/harden-runner@0634a2670c59f64b4a01f0f96f84700a4088b9f0 # v2.12.0
with:
disable-sudo: true
egress-policy: block
allowed-endpoints: >
api.github.com:443
github.com:443
- name: Checkout Repository
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
with:
persist-credentials: false
- name: Find raven-testdata Tag and CI Testing Branch
run: |
RAVEN_TESTDATA_TAG="$( \
git -c 'versionsort.suffix=-' \
ls-remote --exit-code --refs --sort='version:refname' --tags https://github.com/Ouranosinc/raven-testdata '*.*.*' \
| tail --lines=1 \
| cut --delimiter='/' --fields=3)"
echo "RAVEN_TESTDATA_TAG=${RAVEN_TESTDATA_TAG}" >> $GITHUB_ENV
RAVEN_TESTDATA_BRANCH="$(grep -E "RAVEN_TESTDATA_BRANCH" .github/workflows/main.yml | cut -d ' ' -f4)"
echo "RAVEN_TESTDATA_BRANCH=${RAVEN_TESTDATA_BRANCH}" >> $GITHUB_ENV
- name: Report Versions Found
run: |
echo "Latest raven-testdata tag: ${RAVEN_TESTDATA_TAG}"
echo "Tag for raven-testdata in CI: ${RAVEN_TESTDATA_BRANCH}"
env:
RAVEN_TESTDATA_TAG: ${{ env.RAVEN_TESTDATA_TAG }}
RAVEN_TESTDATA_BRANCH: ${{ env.RAVEN_TESTDATA_BRANCH }}
- name: Find Comment
uses: peter-evans/find-comment@3eae4d37986fb5a8592848f6a574fdf654e61f9e # v3.1.0
id: fc
with:
issue-number: ${{ github.event.pull_request.number }}
comment-author: 'github-actions[bot]'
body-includes: It appears that this Pull Request modifies the `main.yml` workflow.
- name: Compare Versions
if: ${{( env.RAVEN_TESTDATA_TAG != env.RAVEN_TESTDATA_BRANCH )}}
uses: actions/github-script@60a0d83039c74a4aee543508d2ffcb1c3799cdea # v7.0.1
with:
script: |
core.setFailed('Configured `raven-testdata` tag is not `latest`.')
- name: Update Failure Comment
if: ${{ failure() }}
uses: peter-evans/create-or-update-comment@71345be0265236311c031f5c7866368bd1eff043 # v4.0.0
with:
comment-id: ${{ steps.fc.outputs.comment-id }}
issue-number: ${{ github.event.pull_request.number }}
body: |
> [!WARNING]
> It appears that this Pull Request modifies the `main.yml` workflow.

On inspection, it seems that the `RAVEN_TESTDATA_BRANCH` environment variable is set to a tag that is not the latest in the `Ouranosinc/raven-testdata` repository.

This value must match the most recent tag (`${{ env.RAVEN_TESTDATA_TAG }}`) in order to merge this Pull Request.

If this PR depends on changes in a new testing dataset branch, be sure to tag a new version of `Ouranosinc/raven-testdata` once your changes have been merged to its `main` branch.
edit-mode: replace
- name: Update Success Comment
if: ${{ success() }}
uses: peter-evans/create-or-update-comment@71345be0265236311c031f5c7866368bd1eff043 # v4.0.0
with:
comment-id: ${{ steps.fc.outputs.comment-id }}
issue-number: ${{ github.event.pull_request.number }}
body: |
> [!NOTE]
> It appears that this Pull Request modifies the `main.yml` workflow.

On inspection, the `RAVEN_TESTDATA_BRANCH` environment variable is set to the most recent tag (`${{ env.RAVEN_TESTDATA_TAG }}`).

No further action is required.
edit-mode: replace
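The check performed by this workflow can be summarised as: find the highest-versioned tag in `Ouranosinc/raven-testdata`, read the `RAVEN_TESTDATA_BRANCH` value from `main.yml`, and fail if the two differ. The following is a minimal Python sketch of that comparison, not the workflow's own implementation. All function names are hypothetical, and the numeric sort is a simplification of git's `version:refname` ordering (it ignores pre-release suffixes, for example).

```python
from __future__ import annotations

import re


def version_key(tag: str) -> tuple[int, ...]:
    """Turn a tag such as 'v2025.6.12' into (2025, 6, 12) for numeric sorting."""
    return tuple(int(part) for part in re.findall(r"\d+", tag))


def latest_tag(tags: list[str]) -> str:
    """Return the tag with the highest numeric version."""
    return max(tags, key=version_key)


def configured_tag(workflow_text: str) -> str | None:
    """Extract the RAVEN_TESTDATA_BRANCH value from the workflow text, if present."""
    match = re.search(
        r"^[ \t]*RAVEN_TESTDATA_BRANCH:[ \t]*(\S+)", workflow_text, flags=re.MULTILINE
    )
    return match.group(1) if match else None


def is_up_to_date(tags: list[str], workflow_text: str) -> bool:
    """True when the configured tag equals the most recent tag."""
    return configured_tag(workflow_text) == latest_tag(tags)


if __name__ == "__main__":
    workflow = "env:\n  RAVEN_TESTDATA_BRANCH: v2025.6.12\n"
    print(is_up_to_date(["v2025.6.9", "v2025.6.12"], workflow))
```

Sorting numerically matters here: a plain string sort would rank `v2025.6.9` above `v2025.6.12`.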
7 changes: 4 additions & 3 deletions .pre-commit-config.yaml
@@ -7,6 +7,7 @@ repos:
hooks:
- id: pyupgrade
args: [ '--py39-plus' ]
exclude: ^tests/conftest\.py$
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v5.0.0
hooks:
@@ -52,7 +53,7 @@ repos:
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.12.2
hooks:
- id: ruff
- id: ruff-check
args: [ '--fix' ]
# - id: ruff-format
- repo: https://github.com/pycqa/flake8
@@ -74,11 +75,11 @@ repos:
hooks:
- id: nbqa-pyupgrade
args: [ '--py39-plus' ]
additional_dependencies: [ 'pyupgrade==3.19.1' ]
additional_dependencies: [ 'pyupgrade==3.20.0' ]
- id: nbqa-black
additional_dependencies: [ 'black==25.1.0' ]
- id: nbqa-isort
additional_dependencies: [ 'isort==6.0.0' ]
additional_dependencies: [ 'isort==6.0.1' ]
- repo: https://github.com/kynan/nbstripout
rev: 0.8.1
hooks:
11 changes: 10 additions & 1 deletion CHANGELOG.rst
@@ -7,7 +7,15 @@ v0.18.3 (unreleased)

New features
^^^^^^^^^^^^
* Added `parsers.parse_rv` to extract a Command value from an RV file.
* Added `parsers.parse_rv` to extract a Command value from an RV file. (PR #503)
* New module `ravenpy.testing` has been added to provide utility functions and support for testing and testing data management. (PR #513)

Breaking changes
^^^^^^^^^^^^^^^^
* `ravenpy` now requires `pooch>=1.8.0` for downloading and caching remote testing data. (PR #513)
* `ravenpy.utilities.testdata` has been refactored to new module `ravenpy.testing`. The `publish_release_notes` function is now located in `ravenpy.utilities.publishing`. (PR #513)
* The `ravenpy.testing.utils` module now provides a `yangtze()` class for fetching and caching the `raven-testdata` testing data. A convenience function (`get_file`) replaces the previous `get_local_testdata`. (PR #513)
* The `ravenpy.testing.utils.open_dataset` function no longer supports OPeNDAP URLs or local file paths. Instead, it uses the `yangtze()` class to fetch datasets from the testing data repository or the local cache. Users should now use `xarray.open_dataset()` directly for OPeNDAP URLs or local files. (PR #513)

Bug fixes
^^^^^^^^^
@@ -18,6 +26,7 @@ Bug fixes
Internal changes
^^^^^^^^^^^^^^^^
* `ravenpy` now requires `xclim>=0.57.0` and `xsdba` (v0.4.0+). (PR #511)
* The `tests` folder no longer contains an `__init__.py` file and is no longer treated as a package. `pytest` fixtures from `emulators.py` are now directly imported into `conftest.py` for use in tests, and existing `pytest` fixtures have been modified to use the new `yangtze()` class for fetching testing data. (PR #513)

v0.18.2 (2025-05-05)
--------------------
Expand Down
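The changelog entries above describe a `pooch`-based fetcher driven by a registry file. The sketch below shows roughly how such a setup can be wired together. It is not the code from `ravenpy.testing.utils`: the helper names `parse_registry` and `build_fetcher` are hypothetical, the base URL layout is an assumption, and the registry is assumed to hold one `filename hash` pair per line. `pooch` is imported lazily so that the parser can be used without it.

```python
from __future__ import annotations


def parse_registry(text: str) -> dict[str, str]:
    """Parse 'filename hash' lines, skipping blank lines and '#' comments."""
    registry: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, file_hash = line.split()[:2]
        registry[name] = file_hash
    return registry


def build_fetcher(registry: dict[str, str], branch: str = "v2025.6.12"):
    """Create a pooch fetcher for the given registry.

    Assumption: files are served from raw.githubusercontent.com under a
    'data/' folder of the tagged repository; the real layout may differ.
    """
    import pooch  # lazy import: only needed when a fetcher is actually built

    return pooch.create(
        path=pooch.os_cache("raven-testdata"),
        base_url=(
            "https://raw.githubusercontent.com/"
            f"Ouranosinc/raven-testdata/{branch}/data/"
        ),
        registry=registry,
        retry_if_failed=3,  # retry transient download failures
    )


if __name__ == "__main__":
    sample = "# name hash\nexample/input.nc sha256:0123abcd\n"
    print(parse_registry(sample))
```

With a fetcher built this way, `fetcher.fetch("some/file.nc")` downloads the file on first use, verifies its hash against the registry, and returns the cached local path on later calls. The `pooch.os_cache("raven-testdata")` location corresponds to the `~/.cache/raven-testdata` and `~/Library/Caches/raven-testdata` folders cached in the CI workflow.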
@@ -37,10 +37,10 @@
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"import rasterio\n",
"import rioxarray as rio\n",
"from birdy import WPSClient\n",
"\n",
"from ravenpy.utilities.testdata import get_file\n",
"# Utility that simplifies working with test data hosted on GitHub\n",
"from ravenpy.testing.utils import get_file\n",
"\n",
"# This is the URL of the Geoserver that will perform the computations for us.\n",
"url = os.environ.get(\n",
@@ -74,8 +74,7 @@
"\"\"\"\n",
"feature_url = \"input.geojson\"\n",
"\"\"\"\n",
"# However, to keep things tidy, we have also prepared a version that can be accessed easily for\n",
"# demonstration purposes:\n",
"# However, to keep things tidy, we have also prepared a version that can be accessed easily for demonstration purposes:\n",
"feature_url = get_file(\"notebook_inputs/input.geojson\")\n",
"df = gpd.read_file(feature_url)\n",
"display(df)\n",
3 changes: 2 additions & 1 deletion docs/notebooks/03_Extracting_forcing_data.ipynb
@@ -31,7 +31,8 @@
"import xarray as xr\n",
"from clisops.core import subset\n",
"\n",
"from ravenpy.utilities.testdata import get_file"
"# Utility that simplifies working with test data hosted on GitHub\n",
"from ravenpy.testing.utils import get_file"
]
},
{
4 changes: 3 additions & 1 deletion docs/notebooks/04_Emulating_hydrological_models.ipynb
@@ -43,7 +43,9 @@
"from pathlib import Path\n",
"\n",
"from ravenpy.config import commands as rc\n",
"from ravenpy.utilities.testdata import get_file"
"\n",
"# Utility that simplifies fetching and caching test data hosted on GitHub\n",
"from ravenpy.testing.utils import get_file"
]
},
{
6 changes: 3 additions & 3 deletions docs/notebooks/05_Advanced_RavenPy_configuration.ipynb
@@ -26,8 +26,8 @@
"metadata": {},
"outputs": [],
"source": [
"# Utility that simplifies getting data hosted on the remote PAVICS-Hydro data server.\n",
"from ravenpy.utilities.testdata import get_file"
"# Utility that simplifies fetching and caching data hosted on GitHub\n",
"from ravenpy.testing.utils import get_file"
]
},
{
@@ -216,7 +216,7 @@
"\n",
"# Observed weather data for the Salmon river. We extracted this using Tutorial Notebook 03 and the\n",
"# salmon_river.geojson file as the contour.\n",
"ts = get_file(\"notebook_inputs/ERA5_weather_data_Salmon.nc\")\n",
"ts = yangtze.fetch(\"notebook_inputs/ERA5_weather_data_Salmon.nc\")\n",
"\n",
"# Set alternate variable names in the timeseries data file\n",
"alt_names = {\n",
7 changes: 4 additions & 3 deletions docs/notebooks/06_Raven_calibration.ipynb
@@ -33,6 +33,9 @@
"\n",
"from ravenpy.config import commands as rc\n",
"from ravenpy.config import emulators\n",
"\n",
"# Utility that simplifies working with test data hosted on GitHub\n",
"from ravenpy.testing.utils import get_file\n",
"from ravenpy.utilities.calibration import SpotSetup"
]
},
@@ -52,9 +55,7 @@
"metadata": {},
"outputs": [],
"source": [
"from ravenpy.utilities.testdata import get_file\n",
"\n",
"# We get the netCDF for testing on a server. You can replace the getfile method by a string containing the path to your own netCDF\n",
"# We get the netCDF for testing on a server. You can replace the yangtze method with a string containing the absolute or relative path to your own netCDF\n",
"nc_file = get_file(\n",
" \"raven-gr4j-cemaneige/Salmon-River-Near-Prince-George_meteo_daily.nc\"\n",
")\n",
4 changes: 3 additions & 1 deletion docs/notebooks/07_Making_and_using_hotstart_files.ipynb
@@ -41,7 +41,9 @@
"# Import the GR4JCN model\n",
"from ravenpy.config import commands as rc\n",
"from ravenpy.config import emulators\n",
"from ravenpy.utilities.testdata import get_file"
"\n",
"# Utility that simplifies working with test data hosted on GitHub\n",
"from ravenpy.testing.utils import get_file"
]
},
{