Can you please check this part?

https://github.com/AIGeeksGroup/3D-R1/blob/ed95e2487f4bc1a00e1b81896b2b982463166bd4/models/rl/grpo_trainer.py#L68C1-L69C49