Skip to content

Chunked array leads to hole in data #238

@saschahofmann

Description

@saschahofmann

I encountered an issue where a certain combination of chunksize and final resolution is creating a different output than loaded data or a different chunking schema. I struggled to create minimal example but together with the attached netcdf the problem should be reproducible by running this:

from odc.geo.geobox import GeoBox
from odc.geo.xr import xr_reproject
from pyproj import CRS
from matplotlib import pyplot as plt

CORDEX_GRID = GeoBox.from_bbox(
    (-44.6, 21.9, 65, 72.6), CRS.from_epsg(4326), resolution=0.1
)

def attrs_to_crs(attrs: dict):
    """Convert GRIB attrs to proj4 CRS."""
    latitude_of_projection_origin = attrs["GRIB_LaDInDegrees"]
    standard_parallel = attrs["GRIB_LaDInDegrees"]
    longitude_of_central_meridian = attrs["GRIB_LoVInDegrees"]
    return CRS.from_proj4(
        f"+proj=lcc +lat_1={standard_parallel} +lat_0={latitude_of_projection_origin} +lon_0={longitude_of_central_meridian} +x_0=0 +y_0=0"
    )

def add_x_y(da: xr.DataArray):
    """Adds x-y coordinates in meters and centered at 0,0"""
    dx = float(da.attrs["GRIB_DxInMetres"])
    dy = float(da.attrs["GRIB_DyInMetres"])
    return da.assign_coords(
        x=(da.x - da.x.max() / 2) * dx, y=(da.y - da.y.max() / 2) * dy
    )

def regrid_cerra(da: xr.DataArray):
    da = add_x_y(da.assign_coords(longitude=((da.longitude + 180) % 360) - 180))
    da = da.odc.assign_crs(attrs_to_crs(da.attrs))

    return xr_reproject(
        da, CORDEX_GRID, resampling='bilinear', dst_nodata=None
    )

da = xr.open_dataset('pr.nc').pr
fig, (ax1, ax2) = plt.subplots(1,2, figsize=(20,8))

regrid_cerra(da).plot(ax=ax1)
regrid_cerra(da.chunk(x=256, y=256)).plot(ax=ax2)

The output should look like this:

Image

You can see that the right plot contains a small hole of NaNs. Other combinations of chunk sizes dont produce this and the in-memory example on the left also doesnt have this artifact.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions