-
Notifications
You must be signed in to change notification settings - Fork 29
Description
Is your feature request related to a problem?
statis train val and test splits for testing DL models
Describe the solution you'd like
Is your feature request related to a problem?
Hi everyone,
I’d like to open the discussion about adding functionality to create static spatiotemporal train, validation, and test splits from a single large xarray dataset (e.g., global-scale data).
The goal is to generate these splits based on:
The spatial and temporal size of each sample
The stride between samples
A list of user-defined validation and test regions (as static spatiotemporal holdouts)
Potentially a land mask, used to ignore samples which only contain ocean, or land data.
The expected output would be, for each split, a list of dictionaries containing the coordinates (start and end) of each slice. This structure would make it straightforward to iterate over samples in a dataloader.
Keeping the validation and test regions static is important to ensure consistent model evaluation and comparability across experiments.
I am already working on a prototype for this with my work, and wanted to open the discussion to gather feedback and seeing if and to which library this could be a contribution!
Describe alternatives you've considered
Making a custom, its not done anywhere
Additional context
No response