Home
This wiki documents the Kerchunk Study. We studied Kerchunk using a few sample NASA Earthdata HDF5 files. We also studied the feasibility of converting between Kerchunk and OPeNDAP Hyrax DMR++.
Kerchunk is a derived work based on RFC 7233 (2014), HDF4 File Content Map Writer (2016), and Cloudydap (2017).
Thus, it has many similarities with OPeNDAP Hyrax DMR++, which the Cloudydap project produced. Both rely on HDF5 API calls to get the byte offset and length of the stored data. Kerchunk obtains this information through high-level h5py Python calls.
Although the basic idea is the same, there are a few differences between them. The following table summarizes the key differences.
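As a rough illustration of how offset/length information ties back to RFC 7233, the sketch below turns Kerchunk-style reference entries into HTTP `Range` header values. It assumes the version 1 reference layout, in which each chunk key maps to a `[url, offset, length]` list. The URL, the variable name `temperature`, and all numbers are hypothetical and not taken from the study.

```python
import json

# Hypothetical Kerchunk-style reference set (version 1 layout).
# The URL, variable name, offsets, and lengths are made up for illustration.
refs = {
    "version": 1,
    "refs": {
        ".zgroup": json.dumps({"zarr_format": 2}),
        "temperature/0.0": ["https://example.com/sample.h5", 4096, 1024],
        "temperature/0.1": ["https://example.com/sample.h5", 5120, 2048],
    },
}


def range_header(entry):
    """Return the HTTP Range header value for a [url, offset, length] entry."""
    _url, offset, length = entry
    # RFC 7233 byte ranges are inclusive on both ends.
    return f"bytes={offset}-{offset + length - 1}"


# Only list-valued entries point at bytes in a remote file;
# string-valued entries hold inline metadata.
headers = {
    key: range_header(value)
    for key, value in refs["refs"].items()
    if isinstance(value, list)
}
print(headers)
```

A client could send each header value with a GET request for the referenced URL to fetch only the bytes of that chunk.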
| Workflow | Kerchunk | OPeNDAP |
|---|---|---|
| source | HDF5/netCDF/grib/fits | HDF5 |
| language | Python | C/C++ |
| conversion | zarr | dmr++ |
| output | json | xml |
| aggregation | fsspec+multizarr API | NcML |
| subchunk | Yes | n/a |
Kerchunk is still under active development. It has some issues with NASA HDF5/netCDF-4 data products. See Kerchunk for the details.
Reading NASA data through (nc)Zarr/xarray also has some interoperability issues. For example, an xarray-based DataTree reports an error when reading a NASA HDF5 data product. See DataTree for the details. Unidata ncZarr cannot read Kerchunk files.
The dmrpp_module can serve NASA HDF5 data products robustly. However, the pydap client has some issues. See DMR++ for the details.
We studied the feasibility of converting DMR++ to Kerchunk. See DMR++ to Kerchunk for the details.
We also studied the feasibility of converting Kerchunk to DMR++. See Kerchunk to DMR++ for the details.
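To give a feel for what such a conversion involves, here is a minimal sketch that formats one Kerchunk chunk reference as a DMR++-style chunk element. It is not the converter used in the study. It assumes that the chunk element carries `offset`, `nBytes`, and `chunkPositionInArray` attributes and that the position is given in array element coordinates. The helper name `chunk_element`, the chunk shape, and the sample values are hypothetical.

```python
def chunk_element(key, entry, chunk_shape):
    """Format one Kerchunk chunk reference as a DMR++-style chunk element string."""
    _url, offset, length = entry
    # Kerchunk keys end in chunk indices such as "0.1". The position attribute
    # is assumed to hold the coordinates of the chunk's first element, so each
    # index is scaled by the chunk shape.
    indices = [int(i) for i in key.rsplit("/", 1)[1].split(".")]
    position = ",".join(str(i * n) for i, n in zip(indices, chunk_shape))
    return (
        f'<dmrpp:chunk offset="{offset}" nBytes="{length}" '
        f'chunkPositionInArray="[{position}]"/>'
    )


# Hypothetical reference entry and chunk shape, for illustration only.
element = chunk_element(
    "temperature/0.1",
    ["https://example.com/sample.h5", 5120, 2048],
    (16, 32),
)
print(element)
```

A real converter would also have to carry over the variable metadata, compression filters, and data URL, which this sketch leaves out.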