The CogVideoX1.5-5B-I2V of diffusers #611
liuxiaoyu1104
started this conversation in
General
System Info
diffusers 0.32.0.dev0
torch 2.4.1+cu121
python 3.10.14
Information
Reproduction
I ran both CogVideoX1.5-5B-I2V and CogVideoX-5B-I2V through the CogVideoXImageToVideoPipeline in diffusers.
For CogVideoX-5B-I2V: width = 720, height = 480, num_frames = 49, num_inference_steps = 50.
For CogVideoX1.5-5B-I2V: width = 1360, height = 768, num_frames = 77, num_inference_steps = 50.
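For anyone trying to reproduce this, the two runs above could be scripted roughly as follows. This is a minimal sketch, not the exact script used for the post: the Hub model IDs, bfloat16 precision, CPU offload, VAE tiling, seed, and fps are all assumptions, and `generate` is a hypothetical helper name. Only the width, height, num_frames, and num_inference_steps values come from the settings above.

```python
# Settings taken from the post; the Hub model IDs are assumed.
SETTINGS = {
    "THUDM/CogVideoX-5b-I2V": dict(
        width=720, height=480, num_frames=49, num_inference_steps=50
    ),
    "THUDM/CogVideoX1.5-5B-I2V": dict(
        width=1360, height=768, num_frames=77, num_inference_steps=50
    ),
}


def generate(model_id, image_path, prompt, output_path, seed=0, fps=8):
    """Hypothetical helper: run one image-to-video generation and save it."""
    # Imported lazily so SETTINGS can be inspected without a GPU environment.
    import torch
    from diffusers import CogVideoXImageToVideoPipeline
    from diffusers.utils import export_to_video, load_image

    # bfloat16, CPU offload, and VAE tiling are assumptions, not from the post.
    pipe = CogVideoXImageToVideoPipeline.from_pretrained(
        model_id, torch_dtype=torch.bfloat16
    )
    pipe.enable_model_cpu_offload()
    pipe.vae.enable_tiling()

    frames = pipe(
        image=load_image(image_path),
        prompt=prompt,
        # Fixed seed so both models can be compared on a repeatable run.
        generator=torch.Generator(device="cpu").manual_seed(seed),
        **SETTINGS[model_id],
    ).frames[0]

    # fps is an assumption; the post does not state the export frame rate.
    export_to_video(frames, output_path, fps=fps)
    return frames
```

A call might look like `generate("THUDM/CogVideoX1.5-5B-I2V", "input.jpg", "A man walking in the road.", "out.mp4")`, where the image path and output path are placeholders and the prompt is inferred from the attached file names.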
The videos generated by CogVideoX-5B-I2V look good.
In the videos generated by CogVideoX1.5-5B-I2V, the brightness of the first few frames does not match the input image, and the later part of the video is blurry and temporally inconsistent.
The image:

The result of CogVideoX-5B-I2V:
A.man.walking.in.the.road._480_720_49_1.0_output.mp4
The result of CogVideoX1.5-5B-I2V:
A.man.walking.in.the.road._768_1360_77_output.mp4
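The brightness mismatch described above could be put into numbers by comparing the mean luma of each generated frame with that of the input image. The sketch below is one possible check, not something run for this post: `mean_luma` and `brightness_drift` are hypothetical helpers, and the Rec. 601 luma weights are just one common choice of brightness measure.

```python
import numpy as np

# Rec. 601 luma weights for R, G, B (one common brightness measure).
LUMA_WEIGHTS = np.array([0.299, 0.587, 0.114])


def mean_luma(frame):
    """Mean luma of an RGB frame (PIL image or H x W x 3 array)."""
    rgb = np.asarray(frame, dtype=np.float64)[..., :3]
    return float((rgb @ LUMA_WEIGHTS).mean())


def brightness_drift(reference, frames):
    """Per-frame difference in mean luma relative to the reference image."""
    ref = mean_luma(reference)
    return [mean_luma(frame) - ref for frame in frames]
```

Passing the input image as `reference` and the first few generated frames as `frames` would give values near zero if the brightness matches, and clearly positive or negative values if the early frames are brighter or darker than the image. The resolutions do not need to match, since only per-image means are compared.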
Expected behavior
The brightness of videos generated by CogVideoX1.5-5B-I2V should be consistent with the input image.