This repository maintains a curated, up-to-date collection of research papers and repositories on Reinforcement Learning (RL)-inspired methods, such as Group Relative Policy Optimization (GRPO), applied to vision and multimodal reasoning tasks.
- R1-V: Reinforcing Super Generalization Ability in Vision Language Models with Less Than $3
- Multimodal Open R1
- Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models
- Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
- 100 Days After DeepSeek-R1: A Survey on Replication Studies and More Directions for Reasoning Language Models
- Why Reasoning Matters? A Survey of Advancements in Multimodal Reasoning (v1)
- A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond | GitHub
- Aligning Multimodal LLM with Human Preference: A Survey | GitHub