[WIP] Add Deepcache #705
Conversation
I'm very interested in this PR; I wish I had time to test DeepCache in ComfyUI and compare the results with your PR.
@FSSRepo Thanks for your interest! Here's a comparison with ComfyUI, using the same model and parameters as above.
The results are so much better in ComfyUI compared to what I'm getting. My implementation doesn't seem to work without CFG, which is really odd since DeepCache is supposed to be CFG-agnostic. I'm definitely doing something wrong. I'd love to continue working on this, but I've run out of ideas. It would be great if you could take a look and share any feedback!
My guess is that it's sharing the same cache between the unconditional and conditional passes, and it's probably not supposed to.
I tried to create a separate cache for the conditional and unconditional passes, but it broke things even more. In any case, I think we should fix things with CFG first before addressing the CFG-free issue; I don't think those are related.
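For reference, a minimal sketch of what a per-pass cache could look like (the struct and all names here are hypothetical, not taken from this PR):

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Keep one cached feature buffer per CFG pass, so the unconditional and
// conditional forward passes never overwrite each other's deep features.
enum CfgPass { PASS_UNCOND = 0, PASS_COND = 1 };

struct DeepCacheState {
    std::array<std::vector<float>, 2> cached;   // deep features, one slot per pass
    std::array<bool, 2> valid{{false, false}};

    void store(CfgPass pass, const float* data, std::size_t n) {
        cached[pass].assign(data, data + n);
        valid[pass] = true;
    }

    // Returns nullptr until the first full (non-skipped) step has filled
    // the cache for that pass.
    const std::vector<float>* load(CfgPass pass) const {
        return valid[pass] ? &cached[pass] : nullptr;
    }
};
```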
What is CFG? According to my understanding, it's when we pass the negative prompt? Or is it a DeepCache configuration?
CFG means Classifier-Free Guidance. It's basically a way to change how much effect the prompt has on conditional generation, by linearly extrapolating from the conditioned prediction away from the prediction without text conditioning (or with a negative prompt). So it needs 2 forward passes at each step: one with the positive prompt, and one with an empty/negative prompt.
It's just the `--cfg-scale` parameter.

I did try separating the cache between the conditional and unconditional passes, but that didn't help; in fact, it broke the case where we run with CFG > 1. From my understanding, DeepCache operates at a higher level and shouldn't be affected by the conditional/unconditional distinction. Something is seriously wrong here, but I can't quite put my finger on it.

EDIT: I may have wrongly assumed that you're familiar with the concept of CFG, but @stduhpf already explained it well. Basically, during inference, you're doing:

```
final_prediction = prediction_unconditional + w * (prediction_conditional - prediction_unconditional)
```

When w = 1, you're effectively running only the conditional pass. That's useful because it means you can double your inference speed, and distilled models support this approach. However, you do trade off some prompt fidelity when doing so.

I recently read a paper that concluded CFG might actually be useless: it only appears to work because we end up using twice the compute.
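To make the formula concrete, here is that extrapolation written out as a standalone C++ sketch (not the actual sd.cpp code):

```cpp
#include <cstddef>

// Classifier-free guidance: linearly extrapolate from the unconditional
// prediction toward (and past, for w > 1) the conditional prediction.
// With w = 1 this reduces to the conditional prediction alone.
void apply_cfg(const float* pred_uncond, const float* pred_cond,
               float* out, std::size_t n, float w) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = pred_uncond[i] + w * (pred_cond[i] - pred_uncond[i]);
    }
}
```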
This PR is currently in progress and far from complete. It adds DeepCache, a method for U-Net architectures that skips the deeper blocks on some steps and reuses their cached output from an earlier step, in order to save compute time.
I have been inspired by this ComfyUI implementation.
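Roughly, the idea looks like this. Below is a simplified sketch with identity placeholders standing in for the real block groups (all names are illustrative, not the actual graph code in this PR):

```cpp
#include <cstddef>
#include <vector>

// Simplified DeepCache-style U-Net forward pass. The "blocks" here are
// identity placeholders for what would be ggml graph segments.
using Tensor = std::vector<float>;

static Tensor shallow_down(const Tensor& x) { return x; }           // cheap, runs every step
static Tensor deep_blocks(const Tensor& h)  { return h; }           // expensive, cached
static Tensor shallow_up(const Tensor& skip, const Tensor& deep) {  // cheap, runs every step
    Tensor out = deep;
    for (std::size_t i = 0; i < out.size() && i < skip.size(); ++i)
        out[i] += skip[i];                                          // stand-in for skip connections
    return out;
}

Tensor unet_forward(const Tensor& x, bool reuse_cache, Tensor& cache) {
    Tensor h = shallow_down(x);    // shallow features change a lot per step
    if (!reuse_cache) {
        cache = deep_blocks(h);    // full step: refresh the deep features
    }
    return shallow_up(h, cache);   // skip step: splice in cached deep features
}
```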
It adds a `--deepcache interval,depth,start,stop` argument. Currently, it's not working well and I can't figure out why or how to achieve better results. I have been debugging the cache step and counter logic for a week, but the issue seems to be more subtle than that.
Command example:
./build/bin/sd -m ../models/realisticVisionV60B1_v51HyperVAE.safetensors -v -p "cute cat" --cfg-scale 2.5 --steps 8 --deepcache 2,3,0,8
Results with `--deepcache 2,3,0,8`:

Results with `--deepcache 3,3,0,8`:
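For context, here is roughly the step gating those four parameters suggest, under my reading of the description above (a sketch, not the actual code in this PR):

```cpp
// Gating for --deepcache interval,depth,start,stop: run the full U-Net
// and refresh the cache on the first step of each interval inside
// [start, stop); reuse the cached depth-level features on the others.
bool should_reuse_cache(int step, int interval, int start, int stop) {
    if (step < start || step >= stop) {
        return false;  // outside the caching window: always run the full model
    }
    return (step - start) % interval != 0;
}
```

Under this reading, `--deepcache 2,3,0,8` would run the full model on steps 0, 2, 4 and 6, and reuse the cached depth-3 features on steps 1, 3, 5 and 7.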
If someone could help by taking a look or continuing the work, I would be grateful. Otherwise, I don't think I'll spend more time on it.