Description
When training a new forest, we usually observe diminishing returns from adding trees: each additional tree reduces the error by less than the previous one. We can use this property as a stopping criterion to accelerate forest training. The idea is to choose a parametric model that describes how the error depends on the number of trees in the forest, fit this model to the errors observed so far during training, and identify the point where the fitted curve starts to "flatten." At that point we stop adding trees.
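A minimal sketch of what such a criterion could look like, assuming an inverse-power-law error model `err(n) ≈ a + b * n^(-c)`; the model form, the slope threshold, and the function names are illustrative assumptions rather than a final design:

```python
import numpy as np
from scipy.optimize import curve_fit


def error_model(n, a, b, c):
    # Parametric error curve: asymptote `a`, scale `b`, decay rate `c`.
    return a + b * np.power(n, -c)


def should_stop(tree_counts, errors, slope_threshold=1e-4):
    """Fit the model to the observed error curve and stop once the
    fitted slope at the latest tree count falls below the threshold."""
    if len(tree_counts) < 4:  # need a few points before fitting
        return False
    params, _ = curve_fit(error_model, tree_counts, errors,
                          p0=(errors[-1], errors[0] - errors[-1], 1.0),
                          maxfev=10_000)
    a, b, c = params
    n = tree_counts[-1]
    slope = abs(-b * c * np.power(n, -c - 1))  # d(err)/dn of the fitted model
    return slope < slope_threshold


# Toy usage: a synthetic error curve that flattens as trees are added.
counts = np.arange(1, 201)
errs = 0.12 + 0.3 * counts ** -0.7 + np.random.normal(0, 1e-3, counts.size)
for k in range(10, counts.size, 10):
    if should_stop(counts[:k], errs[:k]):
        print(f"stop after {counts[k - 1]} trees")
        break
```

The choice of model (inverse power law vs. exponential decay) and the flatness test (fitted slope vs. predicted marginal error reduction) are open design questions to be settled on the collected error curves.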
Subtasks
- Instrument the benchmark suite to collect a dataset of error curves from forest training runs, labeling each curve with the dataset it was collected from (see the sketch after this list)
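A hedged sketch of the instrumentation subtask: a hypothetical hook that appends the per-tree error curve to a CSV file, tagged with the source dataset. The file layout and the `record_error_curve` name are assumptions for illustration only.

```python
import csv


def record_error_curve(dataset_name, errors, path="error_curves.csv"):
    """Append one (dataset, n_trees, error) row per tree to `path`."""
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        for n_trees, err in enumerate(errors, start=1):
            writer.writerow([dataset_name, n_trees, err])


# Example: errors measured after each tree is added on the "iris" benchmark.
record_error_curve("iris", [0.31, 0.22, 0.18, 0.16, 0.15])
```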