
Has this method been attempted on CogVideoX-2B? #5


Open
songh11 opened this issue Dec 18, 2024 · 4 comments

Comments

@songh11

songh11 commented Dec 18, 2024

I calculated the L1 error according to the formula provided in the paper, but the errors come out around 0.18. I would like to know whether this method supports CogVideoX-2B. Could you please provide any insights or guidance on this? Thank you.
[image: output_1]
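For reference, here is a minimal sketch of the kind of L1 relative-error computation discussed above, comparing one layer's outputs at consecutive denoising timesteps. The helper name and the toy tensors are illustrative assumptions, not code from the paper:

```python
import numpy as np

def l1_relative_error(prev_out, curr_out, eps=1e-8):
    """Mean absolute difference between a layer's outputs at consecutive
    denoising timesteps, normalized by the mean magnitude of the previous
    output. (Hypothetical helper for illustration only.)"""
    return np.abs(curr_out - prev_out).mean() / (np.abs(prev_out).mean() + eps)

# Toy example: a layer output and a slightly perturbed next-step output.
rng = np.random.default_rng(0)
prev = rng.standard_normal((16, 64))
curr = prev + 0.1 * rng.standard_normal((16, 64))
print(f"L1 relative error: {l1_relative_error(prev, curr):.3f}")
```

An error near 0.18, as reported above, would simply mean the chosen alpha threshold has to be tuned for this architecture.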

@kelpabc123
Contributor

Hi, sorry for the late reply. Different architectures require different alpha values, so you should experiment to find a value that works for your use case. In theory this method should work on CogVideoX-2B. We will also release an official implementation to compute these error curves and the corresponding caching schedule soon. Stay tuned!
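To make the alpha-tuning suggestion concrete, here is a greedy sketch of turning a per-step error curve into a caching schedule. This is not the official implementation; the `caching_schedule` helper and its accumulation rule are assumptions for illustration only:

```python
def caching_schedule(errors, alpha):
    """Greedy sketch: reuse a cached layer output until the error
    accumulated since the last full computation reaches alpha, then
    recompute and reset. True = recompute, False = reuse cache.
    (Illustrative only, not the official SmoothCache schedule.)"""
    schedule = [True]  # first step is always computed (nothing cached yet)
    accumulated = 0.0
    for err in errors[1:]:
        accumulated += err
        if accumulated >= alpha:
            schedule.append(True)   # recompute and reset the error budget
            accumulated = 0.0
        else:
            schedule.append(False)  # reuse the cached output
    return schedule

# Toy per-step L1 error curve; a larger alpha skips more steps.
errors = [0.0, 0.05, 0.04, 0.06, 0.20, 0.03]
print(caching_schedule(errors, alpha=0.1))  # → [True, False, False, True, True, False]
```

With this kind of rule, raising alpha trades output quality for more cache hits, which is why each architecture needs its own value.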

@songh11
Author

songh11 commented Jan 23, 2025

> Hi, sorry for the late reply. You should try different alpha values, as different architectures will require different alpha values. In theory this should work, but as this is an adjustable parameter, you should experiment with it to find a value that works for your use case! We will also have an official implementation to compute these curves and the corresponding schedule soon. Stay tuned!

Thank you for your response. I can confirm this method works on CogVideoX-2B. I also believe it can be combined with TeaCache, although there may be overlapping acceleration when the two are used together. Do you have any suggestions on using these two methods concurrently? ❤️

@kelpabc123
Contributor

Oops, sorry for the late reply. Thanks for pointing out TeaCache. The two are very similar in implementation, although TeaCache only looks at the final model output rather than the outputs of individual components. If you want to stack them, I recommend applying SmoothCache first and then TeaCache. Let me know how it goes!
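One naive way to picture the stacking question is to combine the two methods' per-step decisions so that a step is fully recomputed only when neither cache would be reused. The `stacked_schedule` helper below is purely hypothetical; neither project exposes such an API, and this is just a sketch of where the "overlapping acceleration" concern comes from:

```python
def stacked_schedule(smoothcache_steps, teacache_steps):
    """Combine two per-step recompute schedules (True = recompute,
    False = reuse a cache). A step is recomputed only when BOTH
    methods ask for a recompute. Stacking this aggressively can
    compound approximation error — the overlap concern raised above.
    (Hypothetical combination, not from either project.)"""
    return [s and t for s, t in zip(smoothcache_steps, teacache_steps)]

# Toy 6-step schedules from two (hypothetical) caching methods.
sc = [True, False, True, True, False, True]
tc = [True, True, False, True, False, True]
print(stacked_schedule(sc, tc))  # → [True, False, False, True, False, True]
```

In practice you would want to re-measure the error curves after enabling the first method, rather than unioning the skips blindly.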

@ziyu-guo
Contributor

Following up here: we recently released v0.1, which adds the tooling to generate caching schedules yourself, along with an example showing how to do this for the Diffusers DiT pipeline. Let us know if this is helpful in your experiments.
