Description
Feature description
Currently, evaluation can be restricted to a subset of the dataset by specifying a range (e.g. data[:100]). However, every example in that range is always evaluated, with no way to stop partway through. We'd like to introduce a checkpoint-based evaluation flow, in which the process periodically inspects intermediate results and decides whether to continue.
For example, after a fixed number of batches (an initial idea, not fully thought through), the system could compute an aggregated metric and compare it against developer-defined criteria. If the criteria are not met, evaluation would stop early (e.g. after 10 or 50 examples) instead of spending time on the remaining 1000. If the criteria are met, evaluation proceeds to the next block.
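To make the idea concrete, here is a rough sketch of what such a loop might look like. Everything in it is hypothetical and not part of the existing API: the function name `evaluate_with_checkpoints`, the `score_fn` and `should_continue` callbacks, and the `checkpoint_every` parameter are placeholders. It also assumes checkpoints are counted in examples rather than batches and that the aggregated metric is a running mean; both choices are open.

```python
from typing import Callable, List, Sequence, Tuple


def evaluate_with_checkpoints(
    examples: Sequence[object],
    score_fn: Callable[[object], float],
    should_continue: Callable[[float, int], bool],
    checkpoint_every: int = 10,
) -> Tuple[List[float], bool]:
    """Score examples in order, checking a criterion every `checkpoint_every` items.

    Hypothetical sketch: `score_fn` scores one example, and `should_continue`
    receives (running_mean, examples_seen) and returns False to stop early.
    Returns the scores collected so far and whether the full range was evaluated.
    """
    if checkpoint_every <= 0:
        raise ValueError("checkpoint_every must be a positive integer")

    scores: List[float] = []
    total = len(examples)
    for seen, example in enumerate(examples, start=1):
        scores.append(score_fn(example))

        # Only checkpoint at block boundaries, and skip the check after the
        # final example because there is nothing left to save at that point.
        if seen % checkpoint_every == 0 and seen < total:
            running_mean = sum(scores) / len(scores)
            if not should_continue(running_mean, seen):
                return scores, False  # stopped early

    return scores, True  # evaluated the whole range


if __name__ == "__main__":
    # A scorer that always fails, with a criterion requiring a mean of 0.5.
    scores, completed = evaluate_with_checkpoints(
        examples=list(range(1000)),
        score_fn=lambda example: 0.0,
        should_continue=lambda mean, seen: mean >= 0.5,
        checkpoint_every=10,
    )
    print(len(scores), completed)
```

Passing the criterion as a callback keeps the stopping rule in the developer's hands, so thresholds that depend on how many examples have been seen (for instance, stricter limits later in the run) would not need any change to the loop itself.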
Motivation
A checkpoint-based evaluation system would cut wasted computation time by terminating early when results are clearly unsatisfactory, while still running the full evaluation when performance meets expectations.
Additional context
No response