Bug: ValidationError crashes with AttributeError due to incorrect variable reference in validate.py #45

@ram-from-tvl

Description

The validate() function in validate.py has a bug in how it builds its error message. When a dataset lacks the required dimensions, the function crashes with an AttributeError instead of raising a helpful ValidationError.

To Reproduce

Steps to reproduce the behavior:

  1. Create or use a zarr dataset that lacks the required x_geostationary and y_geostationary dimensions
  2. Call the validate() function on this dataset: validate(src="path/to/invalid/dataset.zarr")
  3. The function attempts to validate the dimensions
  4. See error: AttributeError: 'DataArray' object has no attribute 'data_vars'

Expected behavior

The function should raise a clear ValidationError with a properly formatted message showing:

  • The actual file path being validated
  • The expected dimensions that are missing
  • The actual dimensions found in the dataset

Additional context

Location: validate.py, lines 51-56

Root cause: The error message has three issues:

  1. Missing f prefix on the string literals, so {src} and the dimensions expression are emitted literally instead of being interpolated
  2. Wrong variable reference (ds.data_vars['data'].dims should be da.dims)
  3. Incorrect attribute access (da is a DataArray, not a Dataset, so it has no data_vars attribute)
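The first point is easy to miss because the message is built from adjacent string literals, and an f prefix applies only to the literal it is attached to. The snippet below is a minimal illustration of that behaviour; the `dims` tuple is a made-up placeholder, not taken from validate.py.

```python
src = "path/to/invalid/dataset.zarr"
dims = ("time", "x", "y")  # hypothetical dimensions, for illustration only

# No f prefix: both placeholders stay as literal text.
plain = "Cannot validate dataset at path {src}. " "Got: {list(dims)}"

# f prefix on the first literal only: the second placeholder is still literal.
partial = f"Cannot validate dataset at path {src}. " "Got: {list(dims)}"

# f prefix on every literal that contains a placeholder: both are interpolated.
full = f"Cannot validate dataset at path {src}. " f"Got: {list(dims)}"

print(plain)
print(partial)
print(full)
```

Each fragment of the message that contains a placeholder therefore needs its own f prefix.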

Current problematic code:

raise ValidationError(
    "Cannot validate dataset at path {src}. "
    "Expected dimensions ['x_geostationary', 'y_geostationary'] not present. "
    "Got: {list(ds.data_vars['data'].dims)}",
)
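One possible fix, sketched under assumptions: the `ValidationError` class below is a stand-in for the project's own exception, and `check_required_dims` is a hypothetical helper introduced here so the example runs without xarray. In validate.py itself the same message could be raised in place, passing `da.dims` where this sketch takes `dims`.

```python
class ValidationError(Exception):
    """Stand-in for the project's ValidationError (assumed to subclass Exception)."""


REQUIRED_DIMS = ("x_geostationary", "y_geostationary")


def check_required_dims(src: str, dims: tuple) -> None:
    """Raise ValidationError if any required dimension is missing from dims.

    Hypothetical helper: `dims` plays the role of `da.dims` in validate.py.
    """
    missing = [d for d in REQUIRED_DIMS if d not in dims]
    if missing:
        # Every literal that contains a placeholder carries its own f prefix.
        raise ValidationError(
            f"Cannot validate dataset at path {src}. "
            f"Expected dimensions {list(REQUIRED_DIMS)} not present. "
            f"Got: {list(dims)}"
        )


if __name__ == "__main__":
    try:
        check_required_dims("path/to/invalid/dataset.zarr", ("time", "x", "y"))
    except ValidationError as err:
        print(err)
```

With this shape, the message reports the path being validated, the expected dimensions, and the dimensions actually found, which matches the expected behaviour described above.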

Labels: bug
