
Model fails to run on GeForce RTX 3050 (6GB VRAM) and stops automatically. #519

@roossop24

Description:
I am attempting to run the model on my GeForce RTX 3050 GPU with 6GB of dedicated VRAM and 16GB of shared RAM. The T5, VAE, and CLIP checkpoints load, but the process then exits on its own while loading the checkpoint shards for the main model. I suspect insufficient VRAM, since the model has 14 billion parameters. The process does not complete, and there is no error message indicating why it fails.

Steps to Reproduce:

Run the following command:

python generate.py --task i2v-14B --size 832*480 --ckpt_dir ./Wan2.1-I2V-14B --image examples/i2v_input.JPG --prompt "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."

Observe that the process stops without an error message at around 14% of checkpoint shard loading (1 of 7 shards).

Expected Behavior:
The model should load and process the input image and prompt successfully.

Actual Behavior:
The model halts after loading some checkpoint shards without a clear error message.

Possible Cause:
The model is large (14 billion parameters) and may be exceeding the memory capacity of the GPU (6GB VRAM). The current GPU might not have enough memory to load and process the large model efficiently.
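As a rough check on this hypothesis, here is a minimal sketch of the memory needed just to hold the main model's weights. It assumes the nominal count of 14e9 parameters implied by the task name and 2 bytes per parameter (bfloat16, matching the `param_dtype` shown in the log below). It ignores the T5, CLIP, and VAE weights as well as activations, so the real requirement would be higher.

```python
# Back-of-the-envelope estimate of weight memory for the 14B model.
# Assumption: 14e9 is the nominal parameter count, not an exact figure.
PARAM_COUNT = 14e9
BYTES_PER_PARAM = 2  # bfloat16 stores each parameter in 2 bytes
GIB = 1024 ** 3


def weights_gib(param_count: float, bytes_per_param: int) -> float:
    """Return the memory needed to store the weights, in GiB."""
    return param_count * bytes_per_param / GIB


needed_gib = weights_gib(PARAM_COUNT, BYTES_PER_PARAM)
vram_gib = 6.0   # dedicated VRAM on the RTX 3050 described in this issue
ram_gib = 16.0   # system RAM described in this issue

print(f"Weights alone: ~{needed_gib:.1f} GiB")
print(f"Fits in VRAM: {needed_gib <= vram_gib}")
print(f"Fits in system RAM: {needed_gib <= ram_gib}")
```

If this estimate is in the right range, the weights would exceed both the 6GB of VRAM and the 16GB of system RAM, which may be why offloading to the CPU does not help.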

Suggestions:

I have tried using --offload_model=True, but the issue persists.

Could there be a memory bottleneck that needs to be addressed? Do I need a higher-end GPU (e.g., A100, RTX 3090)?

Any additional steps to reduce memory usage or optimize performance would be greatly appreciated.

System Specifications:

GPU: GeForce RTX 3050, 6GB VRAM

RAM: 16GB shared RAM

OS: Windows 10

Python Version: 3.9+

CUDA Version: 11.3 (Verified by torch.cuda.is_available())
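
For anyone trying to confirm the same details on their own machine, here is a minimal sketch that reports the CUDA version PyTorch was built with and the total VRAM of the first GPU. The helper name `describe_gpu` is made up for illustration. Note that `torch.cuda.is_available()` only reports whether a usable CUDA device was found, while `torch.version.cuda` holds the version string.

```python
def describe_gpu() -> dict:
    """Collect basic CUDA/GPU facts; degrades gracefully without PyTorch."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False}

    info = {
        "torch_installed": True,
        # CUDA version PyTorch was compiled against (None for CPU-only builds).
        "cuda_build": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
    if info["cuda_available"]:
        props = torch.cuda.get_device_properties(0)
        info["gpu_name"] = props.name
        info["vram_gib"] = round(props.total_memory / 1024 ** 3, 2)
    return info


print(describe_gpu())
```
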
(venv) S:\DOWNLOADS\Wan2.1-main\Wan2.1-main>python generate.py --task i2v-14B --size 832*480 --ckpt_dir ./Wan2.1-I2V-14B --image examples/i2v_input.JPG --prompt "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."
[2025-09-09 12:17:43,375] INFO: offload_model is not specified, set to True.
[2025-09-09 12:17:43,375] INFO: Generation job args: Namespace(task='i2v-14B', size='832*480', frame_num=81, ckpt_dir='./Wan2.1-I2V-14B', offload_model=True, ulysses_size=1, ring_size=1, t5_fsdp=False, t5_cpu=False, dit_fsdp=False, save_file=None, src_video=None, src_mask=None, src_ref_images=None, prompt="Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside.", use_prompt_extend=False, prompt_extend_method='local_qwen', prompt_extend_model=None, prompt_extend_target_lang='zh', base_seed=1494602532373852847, image='examples/i2v_input.JPG', first_frame=None, last_frame=None, sample_solver='unipc', sample_steps=40, sample_shift=3.0, sample_guide_scale=5.0)
[2025-09-09 12:17:43,375] INFO: Generation model config: {'name': 'Config: Wan I2V 14B', 't5_model': 'umt5_xxl', 't5_dtype': torch.bfloat16, 'text_len': 512, 'param_dtype': torch.bfloat16, 'num_train_timesteps': 1000, 'sample_fps': 16, 'sample_neg_prompt': '镜头晃动,色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走', 't5_checkpoint': 'models_t5_umt5-xxl-enc-bf16.pth', 't5_tokenizer': 'google/umt5-xxl', 'clip_model': 'clip_xlm_roberta_vit_h_14', 'clip_dtype': torch.float16, 'clip_checkpoint': 'models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth', 'clip_tokenizer': 'xlm-roberta-large', 'vae_checkpoint': 'Wan2.1_VAE.pth', 'vae_stride': (4, 8, 8), 'patch_size': (1, 2, 2), 'dim': 5120, 'ffn_dim': 13824, 'freq_dim': 256, 'num_heads': 40, 'num_layers': 40, 'window_size': (-1, -1), 'qk_norm': True, 'cross_attn_norm': True, 'eps': 1e-06}
[2025-09-09 12:17:43,375] INFO: Input prompt: Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside.
[2025-09-09 12:17:43,375] INFO: Input image: examples/i2v_input.JPG
[2025-09-09 12:17:43,463] INFO: Creating WanI2V pipeline.
[2025-09-09 12:18:14,208] INFO: loading ./Wan2.1-I2V-14B\models_t5_umt5-xxl-enc-bf16.pth
[2025-09-09 12:20:39,286] INFO: loading ./Wan2.1-I2V-14B\Wan2.1_VAE.pth
[2025-09-09 12:20:42,767] INFO: loading ./Wan2.1-I2V-14B\models_clip_open-clip-xlm-roberta-large-vit-huge-14.pth
[2025-09-09 12:21:18,801] INFO: Creating WanModel from ./Wan2.1-I2V-14B
Loading checkpoint shards: 14%|████████████████████▌ | 1/7 [00:00<00:04, 1.31it/s]

If my GPU's memory is the problem, what GPU specification would be sufficient to run this model?
