Awesome-Diffusion-Quantization

A curated list of papers, documentation, and code about diffusion model quantization. This repo collects quantization methods for diffusion models. PRs adding works (papers or repositories) missing from this list are welcome.

Contents

Papers

2026

  • [AAAI] TR-DQ: Time-Rotation Diffusion Quantization
  • [ICLR] QuantSparse: Comprehensively Compressing Video Diffusion Transformer with Model Quantization and Attention Sparsification [code]
  • [ICLR] QVGen: Pushing the Limit of Quantized Video Generative Models [code]
  • [ICLR] Q&C: When Quantization Meets Cache in Efficient Image Generation
  • [ICLR] DVD-Quant: Data-free Video Diffusion Transformers Quantization [code]

2025

  • [ICLR] ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation [code]
  • [ICLR] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models [code]
  • [ICLR] BinaryDM: Accurate Weight Binarization for Efficient Diffusion Models [code]
  • [ICLR] SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration [code]
  • [CVPR] Q-DiT: Accurate Post-Training Quantization for Diffusion Transformers [code]
  • [CVPR] CacheQuant: Comprehensively Accelerated Diffusion Models [code]
  • [CVPR] PassionSR: Post-Training Quantization with Adaptive Scale in One-Step Diffusion based Image Super-Resolution [code]
  • [ICML] Q-VDiT: Towards Accurate Quantization and Distillation of Video-Generation Diffusion Transformers [code]
  • [ICML] SageAttention2: Efficient Attention with Thorough Outlier Smoothing and Per-thread INT4 Quantization [code]
  • [ICCV] Text Embedding Knows How to Quantize Text-Guided Diffusion Models
  • [ICCV] QuEST: Low-bit Diffusion Model Quantization via Efficient Selective Finetuning [code]
  • [ICCV] DMQ: Dissecting Outliers of Diffusion Models for Post-Training Quantization [code]
  • [ICCV] QuantCache: Adaptive Importance-Guided Quantization with Hierarchical Latent and Layer Caching for Video Generation [code]
  • [NeurIPS] PAROAttention: Pattern-Aware ReOrdering for Efficient Sparse and Quantized Attention in Visual Generation Models
  • [NeurIPS] S2Q-VDiT: Accurate Quantized Video Diffusion Transformer with Salient Data and Sparse Token Distillation [code]
  • [NeurIPS] AccuQuant: Simulating Multiple Denoising Steps for Quantizing Diffusion Models
  • [WACV] DiTAS: Quantizing Diffusion Transformers via Enhanced Activation Smoothing [code]
  • [ISCAS] CDM-QTA: Quantized Training Acceleration for Efficient LoRA Fine-Tuning of Diffusion Model
  • [Arxiv] Post-Training Quantization for Diffusion Transformer via Hierarchical Timestep Grouping
  • [Arxiv] TQ-DiT: Efficient Time-Aware Quantization for Diffusion Transformers
  • [Arxiv] FP4DiT: Towards Effective Floating Point Quantization for Diffusion Transformers [code]
  • [Arxiv] Quantizing Diffusion Models from a Sampling-Aware Perspective
  • [Arxiv] QArtSR: Quantization via Reverse-Module and Timestep-Retraining in One-Step Diffusion based Image Super-Resolution [code]
  • [Arxiv] Pioneering 4-Bit FP Quantization for Diffusion Models: Mixup-Sign Quantization and Timestep-Aware Fine-Tuning
  • [Arxiv] MPQ-DMv2: Flexible Residual Mixed Precision Quantization for Low-Bit Diffusion Models with Temporal Distillation
  • [Arxiv] RobuQ: Pushing DiTs to W1.58A2 via Robust Activation Quantization [code]
  • [Arxiv] CLQ: Cross-Layer Guided Orthogonal-based Quantization for Diffusion Transformers [code]
  • [Arxiv] TreeQ: Pushing the Quantization Boundary of Diffusion Transformer via Tree-Structured Mixed-Precision Search [code]
  • [Arxiv] ConvRot: Rotation-Based Plug-and-Play 4-bit Quantization for Diffusion Transformers

2024

  • [ICLR] EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Diffusion Models [code]
  • [CVPR] TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models [code]
  • [CVPR] Towards Accurate Post-training Quantization for Diffusion Models [code]
  • [ECCV] MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization [code]
  • [ECCV] Timestep-Aware Correction for Quantized Diffusion Models
  • [ECCV] Post-training Quantization for Text-to-Image Diffusion Models with Progressive Calibration and Activation Relaxing [code]
  • [ECCV] Memory-Efficient Fine-Tuning for Quantized Diffusion Model [code]
  • [NeurIPS] PTQ4DiT: Post-training Quantization for Diffusion Transformers [code]
  • [NeurIPS] BitsFusion: 1.99 bits Weight Quantization of Diffusion Model [code]
  • [NeurIPS] TerDiT: Ternary Diffusion Models with Transformers [code]
  • [NeurIPS] Binarized Diffusion Model for Image Super-Resolution [code]
  • [NeurIPS] BiDM: Pushing the Limit of Quantization for Diffusion Models [code]
  • [NeurIPS] StepbaQ: Stepping backward as Correction for Quantized Diffusion Models
  • [AAAI] MPQ-DM: Mixed Precision Quantization for Extremely Low Bit Diffusion Models [code]
  • [AAAI] Qua2SeDiMo: Quantifiable Quantization Sensitivity of Diffusion Models [code]
  • [AAAI] TCAQ-DM: Timestep-Channel Adaptive Quantization for Diffusion Models
  • [AAAI] Optimizing Quantized Diffusion Models via Distillation with Cross-Timestep Error Correction
  • [Arxiv] HQ-DiT: Efficient Diffusion Transformer with FP4 Hybrid Quantization
  • [Arxiv] VQ4DiT: Efficient Post-Training Vector Quantization for Diffusion Transformers
  • [Arxiv] TaQ-DiT: Time-aware Quantization for Diffusion Transformers [code]

2023

  • [ICCV] Q-Diffusion: Quantizing Diffusion Models [code]
  • [CVPR] Post-training Quantization on Diffusion Models [code]
  • [NeurIPS] PTQD: Accurate Post-Training Quantization for Diffusion Models [code]
  • [NeurIPS] Q-DM: An Efficient Low-bit Quantized Diffusion Model
  • [NeurIPS] Temporal Dynamic Quantization for Diffusion Models
