Commit 31985a7

Update ESM Catalog Specification
1 parent 448ea68 commit 31985a7

1 file changed: +17 -17 lines changed

docs/source/reference/esm-catalog-spec.md

Lines changed: 17 additions & 17 deletions
@@ -43,8 +43,8 @@ The descriptor is a single json file, inspired by the [STAC spec](https://github
 ### Catalog

 The collection points to a single catalog.
-A catalog is a CSV file.
-The meaning of the columns in the csv file is defined by the parent collection.
+A catalog is a CSV or parquet file.
+The meaning of the columns in the csv/parquet file is defined by the parent collection.

 ```
 activity_id,source_id,path
@@ -65,29 +65,29 @@ They should be either [URIs](https://en.wikipedia.org/wiki/Uniform_Resource_Iden
 | id | string | **REQUIRED.** Identifier for the catalog. |
 | title | string | A short descriptive one-line title for the catalog. |
 | description | string | **REQUIRED.** Detailed multi-line description to fully explain the catalog. [CommonMark 0.28](http://commonmark.org/) syntax MAY be used for rich text representation. |
-| catalog_file | string | **REQUIRED.** Path to a the CSV file with the catalog contents. |
-| catalog_dict | array | If specified, it is mutually exclusive with `catalog_file`. An array of dictionaries that represents the data that would otherwise be in the csv. |
+| catalog_file | string | **REQUIRED.** Path to a the CSV/parquet file with the catalog contents. |
+| catalog_dict | array | If specified, it is mutually exclusive with `catalog_file`. An array of dictionaries that represents the data that would otherwise be in the csv/parquet. |
 | attributes | [[Attribute Object](#attribute-object)] | **REQUIRED.** A list of attribute columns in the data set. |
-| assets | [Assets Object](#assets-object) | **REQUIRED.** Description of how the assets (data files) are referenced in the CSV catalog file. |
+| assets | [Assets Object](#assets-object) | **REQUIRED.** Description of how the assets (data files) are referenced in the CSV/parquet catalog file. |
 | aggregation_control | [Aggregation Control Object](#aggregation-control-object) | **OPTIONAL.** Description of how to support aggregation of multiple assets into a single xarray data set. |

 ### Attribute Object

-An attribute object describes a column in the catalog CSV file.
+An attribute object describes a column in the catalog CSV/parquet file.
 The column names can optionally be associated with a controlled vocabulary, such as the [CMIP6 CVs](https://github.com/WCRP-CMIP/CMIP6_CVs), which explain how to interpret the attribute values.

-| Element | Type | Description |
-| ----------- | ------ | -------------------------------------------------------------------------------------- |
-| column_name | string | **REQUIRED.** The name of the attribute column. Must be in the header of the CSV file. |
-| vocabulary | string | Link to the controlled vocabulary for the attribute in the format of a URL. |
+| Element | Type | Description |
+| ----------- | ------ | ---------------------------------------------------------------------------------------------- |
+| column_name | string | **REQUIRED.** The name of the attribute column. Must be in the header of the CSV/parquet file. |
+| vocabulary | string | Link to the controlled vocabulary for the attribute in the format of a URL. |

 ### Assets Object

-An assets object describes the columns in the CSV file relevant for opening the actual data files.
+An assets object describes the columns in the CSV/parquet file relevant for opening the actual data files.

 | Element | Type | Description |
 | ------------------ | ------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| column_name | string | **REQUIRED.** The name of the column containing the path to the asset. Must be in the header of the CSV file. |
+| column_name | string | **REQUIRED.** The name of the column containing the path to the asset. Must be in the header of the CSV/parquet file. |
 | format | string | The data format. Valid values are `netcdf`, `zarr`, `zarr2`, `zarr3`, `opendap` or `reference` ([`kerchunk`](https://github.com/fsspec/kerchunk) reference files). If specified, it means that all data in the catalog is the same type. |
 | format_column_name | string | The column name which contains the data format, allowing for variable data types in one catalog. Mutually exclusive with `format`. |

@@ -128,11 +128,11 @@ If `zarr2` or `zarr3` is specified in the `format` field, the `async` flag will

 An aggregation control object defines neccessary information to use when aggregating multiple assets into a single xarray data set.

-| Element | Type | Description |
-| -------------------- | ------------------------------------------- | --------------------------------------------------------------------------------------- |
-| variable_column_name | string | **REQUIRED.** Name of the attribute column in csv file that contains the variable name. |
-| groupby_attrs | array | Column names (attributes) that define data sets that can be aggegrated. |
-| aggregations | [[Aggregation Object](#aggregation-object)] | **OPTIONAL.** List of aggregations to apply to query results |
+| Element | Type | Description |
+| -------------------- | ------------------------------------------- | ----------------------------------------------------------------------------------------------- |
+| variable_column_name | string | **REQUIRED.** Name of the attribute column in csv/parquet file that contains the variable name. |
+| groupby_attrs | array | Column names (attributes) that define data sets that can be aggegrated. |
+| aggregations | [[Aggregation Object](#aggregation-object)] | **OPTIONAL.** List of aggregations to apply to query results |

 ### Aggregation Object

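The constraints touched by this diff (columns must appear in the CSV/parquet header, `catalog_file` and `catalog_dict` are mutually exclusive, `format` and `format_column_name` are mutually exclusive) can be sketched as a small check. This is only an illustration, not how any particular library implements the spec: `read_header` and `check_descriptor` are hypothetical helper names, and picking the reader from a `.parquet` suffix is an assumption, since the spec text above does not say how the file type is detected.

```python
import csv
from pathlib import Path


def read_header(catalog_path: str) -> list[str]:
    """Return the column names of a CSV or parquet catalog file.

    Assumption: a ".parquet" suffix means parquet; anything else is
    treated as plain (uncompressed) CSV.
    """
    if Path(catalog_path).suffix == ".parquet":
        # Imported lazily so CSV-only use does not require pyarrow.
        import pyarrow.parquet as pq

        return list(pq.read_schema(catalog_path).names)
    with open(catalog_path, newline="") as f:
        return next(csv.reader(f))


def check_descriptor(descriptor: dict, header: list[str]) -> list[str]:
    """Return a list of problems found; an empty list means none."""
    problems = []

    # catalog_file and catalog_dict are mutually exclusive.
    if ("catalog_file" in descriptor) == ("catalog_dict" in descriptor):
        problems.append("expected exactly one of catalog_file and catalog_dict")

    # Every attribute column must be in the catalog header.
    for attr in descriptor.get("attributes", []):
        if attr["column_name"] not in header:
            problems.append(f"attribute column {attr['column_name']!r} not in header")

    # The asset path column must be in the header as well.
    assets = descriptor.get("assets", {})
    if assets.get("column_name") not in header:
        problems.append(f"asset column {assets.get('column_name')!r} not in header")

    # format and format_column_name are mutually exclusive.
    if "format" in assets and "format_column_name" in assets:
        problems.append("format and format_column_name are mutually exclusive")

    # aggregation_control is optional; check its variable column if present.
    agg = descriptor.get("aggregation_control")
    if agg is not None and agg.get("variable_column_name") not in header:
        problems.append("variable_column_name not in header")

    return problems
```

With the header from the example catalog (`activity_id,source_id,path`), a descriptor whose attributes and assets reference only those columns yields an empty list; pointing `assets.column_name` at a column that is not in the header yields one entry describing the mismatch.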