Currently, for datasets with bounding boxes, we need to specify the maximum possible number of bounding boxes so that all output batches are the same size:
ViP/datasets/ImageNetVID.py
Line 27 in 74776f2
What we should do instead is pass a custom collate function to the DataLoader, as is done in the PyTorch detection tutorial:
https://github.com/pytorch/vision/blob/6c2cda6a0eda4c835f96f18bb2b3be5043d96ad2/references/detection/utils.py#L237
https://pytorch.org/tutorials/intermediate/torchvision_tutorial.html
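For reference, the collate function in torchvision's `references/detection/utils.py` simply zips the batch into per-sample tuples instead of stacking targets into one fixed-size tensor, which removes the need for a max-boxes limit. A minimal sketch (using plain-Python stand-ins for the image/target tensors; the function names here are illustrative, not from ViP):

```python
def detection_collate(batch):
    """Collate (image, target) pairs without padding.

    Rather than stacking all targets into a single tensor of shape
    [batch, max_boxes, 4], keep each sample's boxes at their natural
    length and return tuples, mirroring torchvision's
    references/detection collate_fn: `tuple(zip(*batch))`.
    """
    return tuple(zip(*batch))

# Stand-in samples with a variable number of boxes per image
# (real code would hold tensors here).
batch = [
    ("img0", {"boxes": [[0, 0, 10, 10]]}),                # 1 box
    ("img1", {"boxes": [[5, 5, 20, 20], [1, 2, 3, 4]]}),  # 2 boxes
]

images, targets = detection_collate(batch)
print(len(images))               # 2
print(len(targets[0]["boxes"]))  # 1
print(len(targets[1]["boxes"]))  # 2
```

It would then be wired in via something like `DataLoader(dataset, batch_size=..., collate_fn=detection_collate)`, and the training loop iterates over per-sample targets rather than a padded tensor.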