I salute you for your efforts on this repository; it was quite hard to wrap my head around it.
Here are some notes regarding my attempt at reproduction of your results:
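| origin_dataset | split | count |
|---|---|---|
| COCO2017_train | train | 76960 |
| COCO2017_train | validation | 11679 |
| COCO2017_val | train | 3683 |
| COCO2017_val | validation | 66 |
| LAION-400M | train | 58217 |
| LAION-400M | validation | 24198 |
| RAISE | train | 5140 |
| RAISE | validation | 57 |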
BENCHMARK_PIPELINE_CFG="training_configurations/benchmark_pipelines/base_benchmark_sliding_windows.yaml"
MODEL_CFG="training_configurations/RN50_clip/RN50_clip_tune_resize.yaml"
Dataset creation
- img2dataset for RAISE dataset download, as the current setup does not fetch all images. I ended up just downloading the originals and modifying the code locally to load TIFs instead of webdataset's PNGs.
- RAISE dataset root dir path, which I've provided.
Training
- openai_clip factory methods. Fixed in "Small fixes for easier reproducibility #1".
- training_and_evaluation/experiments_logs/wandb_logs, in order to avoid the /tmp fallback, which is not an option in some cluster environments.
Evaluation
General notes
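Regarding the wandb log directory mentioned under Training, here is a minimal sketch of a workaround that can be run before launching training. It creates the log directory up front and points wandb at it through the `WANDB_DIR` environment variable, so wandb has a writable target and does not need its temporary-directory fallback. The helper name `ensure_wandb_dir` and the `repo_root` argument are hypothetical and not part of this repository. I am also assuming the training code does not override the location with an explicit `dir=` argument to `wandb.init`.

```python
import os
from pathlib import Path

# Relative path taken from the note above; everything else here is illustrative.
WANDB_LOGS_REL = "training_and_evaluation/experiments_logs/wandb_logs"


def ensure_wandb_dir(repo_root: str, rel_dir: str = WANDB_LOGS_REL) -> Path:
    """Create the wandb log directory and export it as WANDB_DIR.

    Hypothetical helper: `repo_root` is assumed to be the checkout root.
    """
    log_dir = Path(repo_root) / rel_dir
    # Create the directory (and any missing parents) so it exists before wandb starts.
    log_dir.mkdir(parents=True, exist_ok=True)
    # WANDB_DIR is the environment variable wandb reads for its local run files.
    os.environ["WANDB_DIR"] = str(log_dir.resolve())
    return log_dir
```

Calling `ensure_wandb_dir(".")` from the repository root before training starts should be enough on clusters where `/tmp` is not usable. Creating the directory by hand with `mkdir -p` has the same effect.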