Commit f424ad6

Merge branch 'main' into to-iris
2 parents 3b961d6 + b9c14d9 commit f424ad6

24 files changed: +943 / -77 lines changed

.github/ISSUE_TEMPLATE/bug_report.md

Lines changed: 0 additions & 1 deletion
@@ -17,7 +17,6 @@ Here's a quick checklist in what to include:
 - [ ] Include a detailed description of the bug or suggestion
 - [ ] Output of `intake_esm.show_versions()`
 - [ ] Minimal, self-contained copy-pastable example that generates the issue if possible. Please be concise with code posted. See guidelines below on how to provide a good bug report:
-
 - [Minimal Complete Verifiable Examples](https://stackoverflow.com/help/mcve)
 - [Craft Minimal Bug Reports](http://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports)
 

.github/workflows/ci.yaml

Lines changed: 1 addition & 1 deletion
@@ -52,7 +52,7 @@ jobs:
 python -m pytest
 
 - name: Upload code coverage to Codecov
-uses: codecov/codecov-action@v5.4.2
+uses: codecov/codecov-action@v5.4.3
 with:
 file: ./coverage.xml
 flags: unittests

.pre-commit-config.yaml

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@ repos:
 - id: mixed-line-ending
 
 - repo: https://github.com/astral-sh/ruff-pre-commit
-rev: "v0.11.8"
+rev: "v0.12.7"
 hooks:
 - id: ruff
 args: ["--fix"]

CHANGELOG.md

Lines changed: 75 additions & 1 deletion
@@ -1,5 +1,80 @@
 # Changelog
 
+[Full Changelog](https://github.com/intake/intake-esm/compare/v2025.2.3...v2025.7.9)
+
+## v2025.7.9
+
+### New features added
+
+- Support Python313, add setuptools to requirements by @Zeitsperre in https://github.com/intake/intake-esm/pull/707
+- Load with polars by @charles-turner-1 in https://github.com/intake/intake-esm/pull/709
+- Add interactive view of catalog by @charles-turner-1 in https://github.com/intake/intake-esm/pull/723
+
+### Bugs fixed
+
+- Fixed bug where pyarrow conversions were causing string accessor to fail in search by @charles-turner-1 in https://github.com/intake/intake-esm/pull/718
+
+### Maintenance and upkeep improvements
+
+- Update default argument for decode_timedelta by @charles-turner-1 in https://github.com/intake/intake-esm/pull/706
+- 697- Fix segfault by @charles-turner-1 in https://github.com/intake/intake-esm/pull/712
+- Fix broken `require_all_on` example in docs by @charles-turner-1 in https://github.com/intake/intake-esm/pull/720
+- Improve test performance by @charles-turner-1 in https://github.com/intake/intake-esm/pull/719
+
+### Other merged PRs
+
+- Bump codecov/codecov-action from 5.3.1 to 5.4.0 in the actions group by @dependabot in https://github.com/intake/intake-esm/pull/703
+- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in https://github.com/intake/intake-esm/pull/704
+- Remove setuptools runtime requirements by @Zeitsperre in https://github.com/intake/intake-esm/pull/708
+- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in https://github.com/intake/intake-esm/pull/713
+- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in https://github.com/intake/intake-esm/pull/716
+- Bump codecov/codecov-action from 5.4.0 to 5.4.2 in the actions group by @dependabot in https://github.com/intake/intake-esm/pull/714
+- Bump codecov/codecov-action from 5.4.2 to 5.4.3 in the actions group by @dependabot in https://github.com/intake/intake-esm/pull/721
+- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in https://github.com/intake/intake-esm/pull/722
+- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in https://github.com/intake/intake-esm/pull/726
+
+### New Contributors
+
+- @Zeitsperre made their first contribution in https://github.com/intake/intake-esm/pull/707
+
+## v2025.2.3
+
+[Full Changelog](https://github.com/intake/intake-esm/compare/v2024.2.6...v2025.2.3)
+
+### New features added
+
+- Changed behaviour of `source._open_dataset` to include coordinate variables by @charles-turner-1 in https://github.com/intake/intake-esm/pull/681
+- feat: Support being able to use in-memory ESMCatalogModel instances by @lewisjared in https://github.com/intake/intake-esm/pull/690
+
+### Bugs fixed
+
+- Fix #684 Merge results from main and derived catalogs by @rbeucher in https://github.com/intake/intake-esm/pull/685
+- Fix how storage_options is passed to get_mapper by @garciampred in https://github.com/intake/intake-esm/pull/678
+
+### Maintenance and upkeep improvements
+
+- Upgrade to intake Take2 by @charles-turner-1 in https://github.com/intake/intake-esm/pull/683
+- Update dependencies and improve import handling for xarray version compatibility by @andersy005 in https://github.com/intake/intake-esm/pull/696
+- Update pyproject.toml and readthedocs.yml by @andersy005 in https://github.com/intake/intake-esm/pull/695
+- Add flaky decorator for tests with remote resources to reduce cold start failure rate by @charles-turner-1 in https://github.com/intake/intake-esm/pull/700
+- Update CONTRIBUTING guidelines to e.g. refer to new default branch by @sadielbartholomew in https://github.com/intake/intake-esm/pull/674
+
+### Other merged PRs
+
+- Bump the actions group with 16 updates by @dependabot in https://github.com/intake/intake-esm/pull/658, https://github.com/intake/intake-esm/pull/654, https://github.com/intake/intake-esm/pull/661, https://github.com/intake/intake-esm/pull/671, https://github.com/intake/intake-esm/pull/686, https://github.com/intake/intake-esm/pull/691, https://github.com/intake/intake-esm/pull/693, https://github.com/intake/intake-esm/pull/698
+- Update PyPI workflow to build and upload intake-esm artifacts by @andersy005 in https://github.com/intake/intake-esm/pull/659
+- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in https://github.com/intake/intake-esm/pull/662, https://github.com/intake/intake-esm/pull/664, https://github.com/intake/intake-esm/pull/669, https://github.com/intake/intake-esm/pull/672, https://github.com/intake/intake-esm/pull/675, https://github.com/intake/intake-esm/pull/677, https://github.com/intake/intake-esm/pull/682, https://github.com/intake/intake-esm/pull/687, https://github.com/intake/intake-esm/pull/692, https://github.com/intake/intake-esm/pull/694
+- Bump codecov/codecov-action from 4.1.1 to 4.4.1 in the actions group by @dependabot in https://github.com/intake/intake-esm/pull/663,https://github.com/intake/intake-esm/pull/668
+- Bump pypa/gh-action-pypi-publish from 1.9.0 to 1.10.2 in the actions group by @dependabot in https://github.com/intake/intake-esm/pull/676, https://github.com/intake/intake-esm/pull/680
+
+### New Contributors
+
+- @sadielbartholomew made their first contribution in https://github.com/intake/intake-esm/pull/674
+- @garciampred made their first contribution in https://github.com/intake/intake-esm/pull/678
+- @charles-turner-1 made their first contribution in https://github.com/intake/intake-esm/pull/681
+- @rbeucher made their first contribution in https://github.com/intake/intake-esm/pull/685
+- @lewisjared made their first contribution in https://github.com/intake/intake-esm/pull/690
+
 ## v2024.2.6
 
 ([full changelog](https://github.com/intake/intake-esm/compare/v2023.10.27...d96efea14b348e346d5c2a1490e91c9ae1e2c709))
@@ -684,7 +759,6 @@
 
 - Fix CESM-LE ice component peculiarities that caused intake-esm to load data improperly.
 The fix separates variables for `ice` component into two separate components:
-
 - `ice_sh`: for southern hemisphere
 - `ice_nh`: for northern hemisphere
 

README.md

Lines changed: 1 addition & 1 deletion
@@ -64,7 +64,7 @@ providing necessary functionality for searching, discovering, data access/loadin
 ...: )
 
 In [6]: cat_subset
-Out[6]: <GOOGLE-CMIP6 catalog with 4 dataset(s) from 261 asset(s)>
+Out[6]: <GOOGLE-CMIP6 catalog with 2 dataset(s) from 67 asset(s)>
 ```
 
 - Access: when the user is satisfied with the results of their query, they can load data assets (netCDF and/or Zarr stores) into xarray datasets:

ci/environment-docs.yml

Lines changed: 2 additions & 1 deletion
@@ -10,6 +10,7 @@ dependencies:
 - fsspec >=2024.12
 - gcsfs >=2024.12
 - intake >=2.0
+- itables
 - jupyterlab
 - matplotlib
 - myst-nb
@@ -29,7 +30,7 @@ dependencies:
 - watermark
 - xarray-datatree >=0.0.9
 - xarray >=2024.10
-- zarr >=2.12,<3.0
+- zarr <3.0|>=3.0.10
 - furo >=2022.09.15
 - pip:
 - git+https://github.com/ncar-xdev/ecgtools

ci/environment-upstream-dev.yml

Lines changed: 2 additions & 1 deletion
@@ -12,6 +12,7 @@ dependencies:
 - gcsfs >=2024.12
 - h5netcdf >=0.8.1
 - ipython
+- itables
 - matplotlib
 - netcdf4 >=1.5.5,!=1.6.1
 - pandas >=2.1.0
@@ -36,7 +37,7 @@ dependencies:
 - scipy
 - xarray-datatree
 - xgcm
-- zarr >=2.10,<3.0
+- zarr <3.0|>=3.0.10
 - pip:
 - git+https://github.com/intake/intake.git
 - git+https://github.com/pydata/xarray.git

ci/environment.yml

Lines changed: 2 additions & 1 deletion
@@ -14,6 +14,7 @@ dependencies:
 - h5netcdf >=0.8.1
 - intake >=2.0
 - ipython
+- itables
 - matplotlib
 - netcdf4 >=1.5.5,!=1.6.1
 - pandas >=2.1.0
@@ -34,5 +35,5 @@ dependencies:
 - scipy
 - xarray >=2024.10
 - xarray-datatree
-- zarr >=2.12,<3.0
+- zarr <3.0|>=3.0.10
 # - pytest-icdiff
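
The new pin `zarr <3.0|>=3.0.10` uses `|` as a logical OR in the conda version spec, so it accepts any 2.x release or 3.0.10 and later, and skips 3.0.0 through 3.0.9. A rough sketch of that rule, assuming plain `X.Y.Z` version strings (the `parse` and `satisfies` helpers are hypothetical and ignore pre-release tags and conda's full version ordering):

```python
def parse(version):
    """Turn 'X.Y.Z' into a tuple of ints, padded to three parts."""
    parts = [int(p) for p in version.split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


def satisfies(version):
    """Return True if version matches `<3.0` OR `>=3.0.10`."""
    v = parse(version)
    return v < parse("3.0") or v >= parse("3.0.10")


for candidate in ["2.18.3", "3.0.0", "3.0.9", "3.0.10", "3.1.0"]:
    print(candidate, satisfies(candidate))
```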

docs/source/how-to/use-catalogs-with-assets-containing-multiple-variables.md

Lines changed: 12 additions & 4 deletions
@@ -12,7 +12,7 @@ kernelspec:
 By default, `intake-esm` assumes that the data assets (files) contain a single variable (e.g. `temperature`, `precipitation`, etc..). If you have multiple variables in your data files, intake-esm requires the following:
 
 - the `variable_column` of the catalog must contain iterables (list, tuple, set) of values (e.g. `['temperature', 'precipitation']`).
-- the user must provide converters with appropriate functions for parsing values in the `variable_column` (and/or any other column with iterables) into iterables when loading the catalog. There are two ways to do this with the `open_esm_datastore` function: either pass the converter functions directly through the `read_csv_kwargs` argument, or specify the columns in `columns_with_iterables` parameter. The latter is a shortcut for the former. Both are demonstrated below.
+- the user must provide converters with appropriate functions for parsing values in the `variable_column` (and/or any other column with iterables) into iterables when loading the catalog. There are two ways to do this with the `open_esm_datastore` function: either pass the converter functions directly through the `read_kwargs` argument, or specify the columns in `columns_with_iterables` parameter. The latter is a shortcut for the former. Both are demonstrated below.
 
 ## Inspect the catalog
 
@@ -40,7 +40,7 @@ dask.config.set(scheduler='single-threaded')
 
 cat = intake.open_esm_datastore(
 "multi-variable-catalog.json",
-read_csv_kwargs={"converters": {"variable": ast.literal_eval}},
+read_kwargs={"converters": {"variable": ast.literal_eval}},
 )
 cat
 ```
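
To illustrate what the `ast.literal_eval` converter in the hunk above does to a `variable` cell, here is a minimal standard-library sketch. It is not how intake-esm reads catalogs; the CSV text and the `read_rows` helper are made up for illustration and only mimic the idea of a per-column converter.

```python
import ast
import csv
import io

# Made-up catalog rows: the `variable` column stores a Python-style list
# as a string, as a multi-variable asset catalog would.
CSV_TEXT = '''path,variable
a.nc,"['O2', 'SiO3']"
b.nc,"['temperature']"
'''


def read_rows(text, converters):
    """Read CSV rows, applying a converter function to selected columns."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        for column, convert in converters.items():
            row[column] = convert(row[column])
        rows.append(row)
    return rows


rows = read_rows(CSV_TEXT, converters={"variable": ast.literal_eval})
print(rows[0]["variable"])  # now a real list rather than a string
```

Without the converter, each `variable` value would stay a plain string such as `"['O2', 'SiO3']"`, which is why the documentation asks for converters (or `columns_with_iterables`) when assets hold several variables.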
@@ -63,13 +63,21 @@ cat.esmcat.has_multiple_variable_assets
 
 ## Search for datasets
 
-The search functionatilty works in the same way:
+The search functionality works in the same way:
 
 ```{code-cell} ipython3
 cat_subset =cat.search(variable=["O2", "SiO3"])
 cat_subset.df
 ```
 
+### Interactively search the catalog
+
+We can also use the `interactive` attribute of a catalog to interactively search the catalog. This will not save any searches, but allows you to explore the catalog in a quick and intuitive way.
+
+```{code-cell} ipython3
+cat.interactive
+```
+
 ## Load assets into xarray datasets
 
 When loading the data files into xarray datasets, `intake-esm` will load only **data variables** that were requested. For example, if a data file contains ten data variables and the user requests for two variables, intake-esm will load the two requested variables plus necessary coordinates information.
@@ -81,7 +89,7 @@ dsets
 
 ## Why does `intake.open_esm_datastore` need the `columns_with_iterables` parameter?
 
-Why does intake `intake.open_esm_datastore` need the `columns_with_iterables` argument when we can achieve the same functionality with just `read_csv_kwargs`? Intake facilitates writing YAML descriptions of catalogs that can be opened with `intake.open_catalog`. These YAML descriptions include the information required to open the catalog: things like the catalog driver (`intake_esm.core.esm_datastore` in our case) and the arguments to pass to the driver to open the catalog. They can be included as entries in other catalogs enabling features like [catalog nesting](https://intake.readthedocs.io/en/latest/catalog.html#catalog-nesting). However, intake does not support Python function arguments like those we provided to `read_csv_kwargs` above so if we want a functional intake YAML description of an intake-esm catalog with multi-variable assets we need to use the `columns_with_iterables` argument instead. You can return an intake YAML description of an `esm_datastore` as follows:
+Why does intake `intake.open_esm_datastore` need the `columns_with_iterables` argument when we can achieve the same functionality with just `read_kwargs`? Intake facilitates writing YAML descriptions of catalogs that can be opened with `intake.open_catalog`. These YAML descriptions include the information required to open the catalog: things like the catalog driver (`intake_esm.core.esm_datastore` in our case) and the arguments to pass to the driver to open the catalog. They can be included as entries in other catalogs enabling features like [catalog nesting](https://intake.readthedocs.io/en/latest/catalog.html#catalog-nesting). However, intake does not support Python function arguments like those we provided to `read_kwargs` above so if we want a functional intake YAML description of an intake-esm catalog with multi-variable assets we need to use the `columns_with_iterables` argument instead. You can return an intake YAML description of an `esm_datastore` as follows:
 
 ```{code-cell} ipython3
 cat.name = "my-esm-catalog"

docs/source/reference/esm-catalog-spec.md

Lines changed: 38 additions & 5 deletions
@@ -85,11 +85,44 @@ The column names can optionally be associated with a controlled vocabulary, such
 
 An assets object describes the columns in the CSV file relevant for opening the actual data files.
 
-| Element | Type | Description |
-| ------------------ | ------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| column_name | string | **REQUIRED.** The name of the column containing the path to the asset. Must be in the header of the CSV file. |
-| format | string | The data format. Valid values are `netcdf`, `zarr`, `opendap` or `reference` ([`kerchunk`](https://github.com/fsspec/kerchunk) reference files). If specified, it means that all data in the catalog is the same type. |
-| format_column_name | string | The column name which contains the data format, allowing for variable data types in one catalog. Mutually exclusive with `format`. |
+| Element | Type | Description |
+| ------------------ | ------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| column_name | string | **REQUIRED.** The name of the column containing the path to the asset. Must be in the header of the CSV file. |
+| format | string | The data format. Valid values are `netcdf`, `zarr`, `zarr2`, `zarr3`, `opendap` or `reference` ([`kerchunk`](https://github.com/fsspec/kerchunk) reference files). If specified, it means that all data in the catalog is the same type. |
+| format_column_name | string | The column name which contains the data format, allowing for variable data types in one catalog. Mutually exclusive with `format`. |
+
+````{note}
+Zarr v3 is built on asynchronous operations, and requires `xarray_open_kwargs` to contain the following dictionary fragment:
+```python
+xarray_open_kwargs = {
+"storage_options" : {
+"remote_options" : {
+"async": true,
+...
+},
+...
+},
+...
+}
+```
+
+In contrast, Zarr v2 is synchronous and instead requires:
+
+```python
+xarray_open_kwargs = {
+"storage_options" : {
+"remote_options" : {
+"async": false,
+...
+},
+...
+},
+...
+}
+```
+````
+
+If `zarr2` or `zarr3` is specified in the `format` field, the `async` flag will be set automatically. If you specify `zarr` as the format, you must set the `async` flag manually in the `xarray_open_kwargs`.
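
The rule in the added paragraph can be sketched as a small function. This is an illustration of the documented behaviour under stated assumptions, not intake-esm's actual code: the `with_async_flag` helper is hypothetical, and the flag is written with Python booleans (`True`/`False`).

```python
import copy


def with_async_flag(fmt, xarray_open_kwargs=None):
    """Return a copy of xarray_open_kwargs with the `async` flag filled in.

    Hypothetical helper: `zarr3` sets async to True, `zarr2` sets it to
    False, and any other format (including plain `zarr`) is left as given.
    """
    kwargs = copy.deepcopy(xarray_open_kwargs or {})
    flags = {"zarr2": False, "zarr3": True}
    if fmt in flags:
        storage = kwargs.setdefault("storage_options", {})
        remote = storage.setdefault("remote_options", {})
        remote["async"] = flags[fmt]
    return kwargs


print(with_async_flag("zarr3"))
print(with_async_flag("zarr2"))
# Plain "zarr": nothing is inferred, so the caller's own setting is kept.
manual = {"storage_options": {"remote_options": {"async": False}}}
print(with_async_flag("zarr", manual))
```

Working on a deep copy keeps the caller's dictionary untouched, which matters if the same kwargs are reused for catalogs in different formats.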
 
 ### Aggregation Control Object
 