Imagine in Space: fine-tuning and benchmarking a visual-spatial reasoning model #3041
jameslian87v5
started this conversation in
project
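Humans exhibit dual-channel visual behavior in visual and spatial reasoning, whereas current multimodal models (VLMs) are still weak at both. This stems from a gap between current multimodal training pipelines and the way humans learn. To emulate human visual reasoning, we have built a training method for visual imagination and reasoning, and we plan to use the InternVL 2.5 model to strengthen these capabilities.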