Always use a provided input dataset for notebook 5 (in-class and homework) to prevent students from using a corrupt dataset