GUI task multi-image training #342

@songzijian1999

Hello, and thank you for providing such clear code.

I ran into a problem during multi-image training (run_grpo_gui.sh) and would like to ask about it.
Specifically, I get the following error:
exception: no description
File "/home/szj/project/solid_geo_dataset/vlmr1/src/open-r1-multimodal/src/open_r1/trainer/grpo_trainer.py", line 568, in _generate_and_score_completions
assert len(additional_output) == len(inputs)
File "/home/szj/project/solid_geo_dataset/vlmr1/src/open-r1-multimodal/src/open_r1/trainer/grpo_trainer.py", line 726, in compute_loss
inputs = self._generate_and_score_completions(inputs, model)
File "/home/szj/miniconda3/envs/rl/lib/python3.10/site-packages/transformers/trainer.py", line 3698, in training_step
loss = self.compute_loss(model, inputs, num_items_in_batch=num_items_in_batch)
File "/home/szj/miniconda3/envs/rl/lib/python3.10/site-packages/transformers/trainer.py", line 2548, in _inner_training_loop
tr_loss_step = self.training_step(model, inputs, num_items_in_batch)
File "/home/szj/miniconda3/envs/rl/lib/python3.10/site-packages/transformers/trainer.py", line 2241, in train
return inner_training_loop(
File "/home/szj/project/solid_geo_dataset/vlmr1/src/open-r1-multimodal/src/open_r1/grpo_jsonl.py", line 1059, in main
trainer.train()
File "/home/szj/project/solid_geo_dataset/vlmr1/src/open-r1-multimodal/src/open_r1/grpo_jsonl.py", line 1073, in <module>
main(script_args, training_args, model_args)
File "/home/szj/miniconda3/envs/rl/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home/szj/miniconda3/envs/rl/lib/python3.10/runpy.py", line 196, in _run_module_as_main (Current frame)
return _run_code(code, main_globals, None,
AssertionError:

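As I understand it, this means the following: if I pass in batch_size = 8 text prompts, each with 2 images, the batch contains 16 images. However, prompt_inputs, additional_output = self.vlm_module.prepare_model_inputs may not pair the images with the texts correctly. It seems to still follow the single_image approach and match each image to one text, so in the end the text batch size != the image batch size.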

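My questions are:
1. Is my understanding correct, i.e. does the current preprocessing not account for multiple images per sample?
2. If a fix is needed, should I change the inputs to the Qwen2_VL_processor so that images and texts are paired according to Qwen2's rules?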
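
To make the suspected mismatch concrete, here is a minimal sketch in plain Python. It does not use the repository's code: the batch layout, the helper names `flatten_images` and `regroup_by_sample`, and the idea that `additional_output` holds one entry per image are all assumptions for illustration only. It shows how flattening 8 samples with 2 images each gives 16 per-image entries, which would trip a check like `len(additional_output) == len(inputs)`, and how regrouping by each sample's image count restores one entry per sample.

```python
from typing import Any, Dict, List


def flatten_images(batch: List[Dict[str, Any]]) -> List[str]:
    """Flatten the per-sample image lists into one list, in sample order."""
    return [img for sample in batch for img in sample["images"]]


def regroup_by_sample(flat_outputs: List[Any], batch: List[Dict[str, Any]]) -> List[List[Any]]:
    """Split per-image outputs back into one list per sample,
    using each sample's own image count."""
    grouped, start = [], 0
    for sample in batch:
        n = len(sample["images"])
        grouped.append(flat_outputs[start:start + n])
        start += n
    if start != len(flat_outputs):
        raise ValueError("number of per-image outputs does not match the batch")
    return grouped


# Hypothetical batch: 8 prompts, 2 images each.
batch = [
    {"prompt": f"task {i}", "images": [f"img_{i}_a.png", f"img_{i}_b.png"]}
    for i in range(8)
]

flat = flatten_images(batch)
# Hypothetical per-image metadata, standing in for whatever is returned per image.
per_image_output = [{"source": name} for name in flat]

# One entry per image, so the lengths differ and a length assertion would fail.
print(len(batch), len(per_image_output))

# Regrouping by image count gives one entry per sample again.
grouped = regroup_by_sample(per_image_output, batch)
print(len(grouped), grouped[0])
```

If the per-image assumption holds, a regrouping step of this kind (or building the extra outputs per sample in the first place) would keep the lengths aligned. Whether that matches what `prepare_model_inputs` actually returns needs to be checked against the repository.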
