Description
This is related to #99 and #97.
Now that the "continue in the face of errors" feature is implemented (see #99), IO errors are logged and then forgotten when they happen. However, that means there is no easy way of knowing:
- Whether any failures have happened
- Which inputs were corrupt
Quote from issue #99
For the local load case, errors can be reported via an attribute, something like `xx.load_failures`, or via a thread-local global variable that reports failures for the last load that happened: `odc.stac.load_failures()`. With Dask, computation is delayed and distributed, so this certainly cannot be done as an attribute. In the case of Dask we could maintain a global dictionary mapping `load-task-id -> [Failure]`, so it would be something like `odc.stac.load_failures(xx)` that would look up failures that might have happened when processing `xx`. The complexity with this approach is mainly to do with cache life-cycle management: when and how to purge failures that happened in the past.
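To make the cache life-cycle question concrete, here is a minimal sketch of a `load-task-id -> [Failure]` mapping with age-based purging. Nothing here exists in `odc.stac`: `Failure`, `FailureRegistry`, and all method names are hypothetical, and purging by age is only one of several possible policies.

```python
import threading
import time
from dataclasses import dataclass, field


@dataclass
class Failure:
    """Hypothetical record of one failed read (not an odc.stac type)."""

    source: str  # e.g. URL of the input that could not be read
    error: str  # stringified exception
    timestamp: float = field(default_factory=time.time)


class FailureRegistry:
    """Hypothetical process-wide mapping: load-task-id -> [Failure]."""

    def __init__(self):
        self._lock = threading.Lock()
        self._failures = {}

    def record(self, task_id, failure):
        # Called from the code path that currently only logs the IO error.
        with self._lock:
            self._failures.setdefault(task_id, []).append(failure)

    def lookup(self, task_id):
        # Return a copy so callers cannot mutate the registry by accident.
        with self._lock:
            return list(self._failures.get(task_id, []))

    def purge_older_than(self, max_age_seconds, now=None):
        # One possible life-cycle policy: drop failures older than a cutoff.
        now = time.time() if now is None else now
        cutoff = now - max_age_seconds
        with self._lock:
            for task_id in list(self._failures):
                kept = [f for f in self._failures[task_id] if f.timestamp >= cutoff]
                if kept:
                    self._failures[task_id] = kept
                else:
                    del self._failures[task_id]
```

Note that a dictionary like this lives in a single process. With Dask's distributed scheduler the workers run in separate processes, so failures recorded on a worker would still need to be shipped back to the client somehow; the sketch does not address that.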
Basically, having an interface for error reporting that works "the same" for the Dask and local load cases can be tricky. Maybe we could instead have two different interfaces, one for local load and a separate one for Dask. We can start with local only, as this one will be used by the Dask implementation anyway.
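As a starting point for the local-only interface, a thread-local variant could look roughly like the sketch below. This is an illustration under assumptions, not the `odc.stac` implementation: `load`, `load_failures`, the `reader` callback, and the `(source, exception)` tuple format are all hypothetical stand-ins.

```python
import threading

# Each thread sees its own `failures` list, so concurrent loads do not mix.
_state = threading.local()


def _record_failure(source, error):
    # Hypothetical hook, called where the IO error is currently only logged.
    if not hasattr(_state, "failures"):
        _state.failures = []
    _state.failures.append((source, error))


def load_failures():
    """Failures recorded by the most recent load() call in this thread."""
    return list(getattr(_state, "failures", []))


def load(sources, reader):
    """Stand-in for a local load that continues in the face of IO errors."""
    _state.failures = []  # reset, so only the last load is reported
    results = []
    for src in sources:
        try:
            results.append(reader(src))
        except OSError as e:
            _record_failure(src, e)
    return results
```

After a call such as `data = load(urls, reader)`, calling `load_failures()` from the same thread would answer both questions raised above: an empty list means nothing failed, and otherwise each entry names the corrupt input.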