Usage of ancient files for testing breaks with recent h5py/libhdf5 #108

@kmuehlbauer

While working on #102 @bnlawrence and I found a glitch when we recreated test files with recent h5py/libhdf5.

See #102 (comment) for details.

Citing here for visibility:

If we recreate the file now, the following line of code:

attrs['vlen_str_array'] = [b'Hello', b'World!']

translates to this HDF5 dump:

ATTRIBUTE "vlen_str_array" {
   DATATYPE  H5T_STRING {
      STRSIZE H5T_VARIABLE;
      STRPAD H5T_STR_NULLTERM;
      CSET H5T_CSET_ASCII;
      CTYPE H5T_C_S1;
   }
   DATASPACE  SIMPLE { ( 2 ) / ( 2 ) }
   DATA {
   (0): "Hello", "World!"
   }
}

whereas the old file translates to:

ATTRIBUTE "vlen_str_array" {
   DATATYPE  H5T_STRING {
      STRSIZE 6;
      STRPAD H5T_STR_NULLPAD;
      CSET H5T_CSET_ASCII;
      CTYPE H5T_C_S1;
   }
   DATASPACE  SIMPLE { ( 2 ) / ( 2 ) }
   DATA {
   (0): "Hello\000", "World!"
   }
}
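The same difference can be checked from Python without `h5dump`. The following is only a sketch, assuming a reasonably recent h5py that provides `h5py.check_string_dtype` and `AttributeManager.get_id`; the helper names `string_storage` and `write_fixed` are made up for illustration. It also assumes that passing a NumPy `S` array keeps the fixed-length layout seen in the old file.

```python
try:
    import h5py
    import numpy as np
except ImportError:  # optional dependencies for this sketch
    h5py = np = None


def string_storage(path, name):
    """Return 'variable', the fixed size in bytes, or None for non-strings."""
    with h5py.File(path, "r") as f:
        # The low-level attribute id exposes the stored dtype,
        # including whether the string is variable- or fixed-length.
        info = h5py.check_string_dtype(f.attrs.get_id(name).dtype)
    if info is None:
        return None
    return "variable" if info.length is None else info.length


def write_fixed(path, name, values):
    """Store byte strings as a fixed-length NumPy 'S' array (hypothetical helper)."""
    with h5py.File(path, "w") as f:
        # dtype="S" lets NumPy pick the longest item as the item size
        # (6 bytes for [b"Hello", b"World!"]).
        f.attrs[name] = np.array(values, dtype="S")
```

Calling `string_storage` on the old and on the recreated file should report a fixed size for the former and `'variable'` for the latter, if the dumps above are representative.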

It might make sense to move away from the current approach of testing against pre-generated files (which are committed to Git) and instead create the test files on the fly when the test suite runs. This is what upstream/downstream libraries like xarray and h5netcdf are doing. Thoughts?
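A rough sketch of what generating the file at test time could look like, assuming pytest's built-in `tmp_path` fixture. The function names and the file name are hypothetical and not taken from the existing test suite; a real test would presumably read the file back with the library under test rather than with h5py.

```python
import pathlib

try:
    import h5py
except ImportError:  # optional dependency for this sketch
    h5py = None


def make_attr_file(directory):
    """Create the test file with whatever h5py/libhdf5 is installed."""
    path = pathlib.Path(directory) / "attr_datatypes.hdf5"  # hypothetical name
    with h5py.File(path, "w") as f:
        f.attrs["vlen_str_array"] = [b"Hello", b"World!"]
    return path


def test_vlen_str_array(tmp_path):
    # tmp_path is supplied by pytest: a fresh temporary directory per test.
    path = make_attr_file(tmp_path)
    with h5py.File(path, "r") as f:
        values = f.attrs["vlen_str_array"]
        # Normalise to str so the check does not depend on whether the
        # strings were stored as fixed- or variable-length.
        got = [v.decode() if isinstance(v, bytes) else str(v) for v in values]
    assert got == ["Hello", "World!"]
```

Because the file is written by the installed h5py/libhdf5, the test always exercises the layout those versions produce, so older layouts such as the fixed-length one above would need their own explicit test case.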
