I noticed a deviation from standard practice: at test time, the accuracy metric is plain 'top1' accuracy, and it is not averaged across classes!
In short, two questions:
- Why only top1 at test time? (Also, what does "origin" mean? Is that also top1 test accuracy?)

- Why not use the metrics from PyTorch?
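
To make the distinction in the first question concrete, here is a minimal sketch in plain Python of overall top1 accuracy versus accuracy averaged across classes. It is not the repository's code: the function names and the toy labels are made up for illustration, and "averaged across classes" is assumed to mean the mean of per-class accuracies.

```python
from collections import defaultdict


def top1_accuracy(preds, labels):
    """Fraction of all samples whose predicted class equals the label."""
    correct = sum(p == y for p, y in zip(preds, labels))
    return correct / len(labels)


def mean_per_class_accuracy(preds, labels):
    """Mean of per-class accuracies, so every class counts equally."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for p, y in zip(preds, labels):
        totals[y] += 1
        hits[y] += int(p == y)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Hypothetical imbalanced test set: 8 samples of class 0, 2 of class 1.
labels = [0] * 8 + [1] * 2
# All class-0 samples are predicted correctly, only one class-1 sample is.
preds = [0] * 8 + [0, 1]

overall = top1_accuracy(preds, labels)             # 9 correct out of 10
balanced = mean_per_class_accuracy(preds, labels)  # mean of 8/8 and 1/2

print(f"top1: {overall:.2f}, class-averaged: {balanced:.2f}")
```

On this toy data the two numbers differ (0.90 versus 0.75), which is why it matters which one is reported on an imbalanced test set.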