abstract: Recent progress in AI across domains such as language and computer vision has been driven largely by training models on large-scale datasets. Achieving a comparable impact in Embodied AI remains a challenge, primarily because of difficulties with scaling. In this talk, I will discuss approaches that facilitate scaling in Embodied AI, including the use of simulation environments and internet video. I will also discuss scaling not only generic training data but also dataset creation, with a focus on developing a large-scale benchmark for embodied planning and reasoning in dynamic environments. I will present a comprehensive study showing how state-of-the-art planning models struggle with these tasks, and discuss methods for distilling knowledge from models trained on large-scale data into smaller models suitable for deployment on robots.