Hi, thanks for releasing M2Diffuser and the official checkpoints!
I attempted to reproduce the results reported in the paper using the provided inference scripts and pretrained checkpoints, running in PyBullet mode (headless). However, I am observing noticeably different success rates across all three tasks: Pick, Place, and Goal-Reach.
Command I used to run inference:

```shell
bash ./scripts/model-m2diffuser/<task>/inference.sh checkpoints/checkpoints/MK-M2Diffuser-<task>/2024-07-14-09-38-10/
```
My system (Docker):
OS: Ubuntu 20.04
Python: 3.8
PyTorch: 1.13.1+cu116
PyBullet: 3.2.6
sim_gui: true
My questions:
- Are there any known sources of variation between the inference scripts and the final paper results?
- Could the discrepancy be due to using PyBullet instead of Isaac Sim for evaluation? Were the paper's reported success rates obtained with Isaac Sim or PyBullet inference?
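For what it's worth, to rule out run-to-run stochasticity on my side, I fixed all RNG seeds before evaluation with a generic helper like the one below (this is my own sketch, not code from the M2Diffuser repo; the repo's config may already handle seeding):

```python
import random

import numpy as np
import torch


def seed_everything(seed: int = 42) -> None:
    """Seed Python, NumPy, and PyTorch RNGs for reproducible evaluation.

    Note: full determinism on GPU may also require disabling cuDNN
    autotuning, which can slow things down slightly.
    """
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)          # seeds CPU (and all CUDA devices too)
    torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


# Sanity check: the same seed should yield identical random tensors.
seed_everything(0)
a = torch.rand(3)
seed_everything(0)
b = torch.rand(3)
```

Even with seeding, my success rates remain noticeably off from the paper's numbers, which is why I suspect a simulator or script-level difference rather than noise.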