ValueError: too many values to unpack (expected 2) when testing Mvbench dataset with demo_mistral_hd.ipynb #263

@shaneyale2005

I am testing the MVBench dataset with the Mistral-based demo_mistral_hd.ipynb. I copied the test code from mvbench.ipynb and configured the dataset paths as follows, but running it raises a ValueError. Why does this happen?

`data_list = {
"Action Sequence": ("action_sequence.json", "/home/user/zhengxinye/datasets/mvbench/video/star/Charades_v1_480/", "video", True), # has start & end
"Action Prediction": ("action_prediction.json", "/home/user/zhengxinye/datasets/mvbench/video/star/Charades_v1_480/", "video", True), # has start & end
"Action Antonym": ("action_antonym.json", "/home/user/zhengxinye/datasets/mvbench/video/ssv2_video/", "video", False),
"Fine-grained Action": ("fine_grained_action.json", "/home/user/zhengxinye/datasets/mvbench/video/Moments_in_Time_Raw/videos/", "video", False),
"Unexpected Action": ("unexpected_action.json", "/home/user/zhengxinye/datasets/mvbench/video/FunQA_test/test/", "video", False),
"Object Existence": ("object_existence.json", "/home/user/zhengxinye/datasets/mvbench/video/clevrer/video_validation/", "video", False),
"Object Interaction": ("object_interaction.json", "/home/user/zhengxinye/datasets/mvbench/video/star/Charades_v1_480/", "video", True), # has start & end
"Object Shuffle": ("object_shuffle.json", "/home/user/zhengxinye/datasets/mvbench/video/perception/videos/", "video", False),
"Moving Direction": ("moving_direction.json", "/home/user/zhengxinye/datasets/mvbench/video/clevrer/video_validation/", "video", False),
"Action Localization": ("action_localization.json", "/home/user/zhengxinye/datasets/mvbench/video/sta/sta_video/", "video", True), # has start & end
"Scene Transition": ("scene_transition.json", "/home/user/zhengxinye/datasets/mvbench/video/scene_qa/video/", "video", False),
"Action Count": ("action_count.json", "/home/user/zhengxinye/datasets/mvbench/video/perception/videos/", "video", False),
"Moving Count": ("moving_count.json", "/home/user/zhengxinye/datasets/mvbench/video/clevrer/video_validation/", "video", False),
"Moving Attribute": ("moving_attribute.json", "/home/user/zhengxinye/datasets/mvbench/video/clevrer/video_validation/", "video", False),
"State Change": ("state_change.json", "/home/user/zhengxinye/datasets/mvbench/video/perception/videos/", "video", False),
"Fine-grained Pose": ("fine_grained_pose.json", "/home/user/zhengxinye/datasets/mvbench/video/nturgbd/", "video", False),
"Character Order": ("character_order.json", "/home/user/zhengxinye/datasets/mvbench/video/perception/videos/", "video", False),
"Egocentric Navigation": ("egocentric_navigation.json", "/home/user/zhengxinye/datasets/mvbench/video/vlnqa/", "video", False),
"Episodic Reasoning": ("episodic_reasoning.json", "/home/user/zhengxinye/datasets/mvbench/video/tvqa/frames_fps3_hq/", "frame", True), # has start & end, read frame
"Counterfactual Inference": ("counterfactual_inference.json", "/home/user/zhengxinye/datasets/mvbench/video/clevrer/video_validation/", "video", False),
}

data_dir = "/home/user/zhengxinye/datasets/mvbench/json"`

The full error message is as follows:
`---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[67], line 14
12 acc_dict[task_type][1] += 1
13 total += 1
---> 14 pred = infer_mvbench(
15 example,
16 system="Carefully watch the video and pay attention to the cause and sequence of events, the detail and movement of objects, and the action and pose of persons. Based on your observations, select the best option that accurately addresses the question.\n",
17 question_prompt="\nOnly give the best option.",
18 answer_prompt="Best option:(",
19 return_prompt='(',
20 system_q=False,
21 print_res=True,
22 system_llm=True
23 )
24 gt = example['answer']
25 res_list.append({
26 'pred': pred,
27 'gt': gt
28 })

Cell In[65], line 19, in infer_mvbench(data_sample, system, question_prompt, answer_prompt, return_prompt, system_q, print_res, system_llm)
17 video_emb, _ = model.encode_img(video, system + data_sample['question'])
18 else:
---> 19 video_emb, _ = model.encode_img(video, system)
20 video_list.append(video_emb)
21 # video_list.append(torch.zeros_like(video_emb))

ValueError: too many values to unpack (expected 2)`

Where does the problem come from, and how can I fix it? Thank you!
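
For reference, the traceback points at the two-name unpacking `video_emb, _ = model.encode_img(video, system)`. Python raises this error whenever the right-hand side yields more values than there are names on the left. The sketch below reproduces that with a hypothetical stand-in, `encode_img_stub`, which is assumed to return three values. The real return signature of `encode_img` in demo_mistral_hd.ipynb is not shown in this issue, so the count of three is only an assumption for illustration.

```python
def encode_img_stub(video, instruction):
    """Hypothetical stand-in for model.encode_img (NOT the real API).

    Assumed to return three values so that a two-name unpack fails.
    """
    return "emb", "aux1", "aux2"


# 1) Reproduce the failure: two names on the left, three values on the right.
try:
    video_emb, _ = encode_img_stub(None, "system prompt")
    message = ""
except ValueError as err:
    message = str(err)

# 2) Inspect how many values the call really returns before unpacking.
outputs = encode_img_stub(None, "system prompt")
n_outputs = len(outputs)

# 3) Unpack tolerantly: keep the first value, collect the rest in a list.
video_emb, *rest = encode_img_stub(None, "system prompt")

print(message)
print(n_outputs, video_emb, rest)
```

Running step 2 against the actual `model.encode_img` in the notebook would show how many values it returns, which in turn tells whether the code copied from mvbench.ipynb matches the model class loaded by demo_mistral_hd.ipynb.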
