Commit cdfd845

Merge branch 'main' into combine-optims
2 parents 80e8f3f + 62cbde8 commit cdfd845

File tree

2 files changed: +4 −1 lines changed

.github/workflows/pr_style_bot.yml

Lines changed: 1 addition & 1 deletion

```diff
@@ -14,4 +14,4 @@ jobs:
     with:
       python_quality_dependencies: "[quality]"
     secrets:
-      bot_token: ${{ secrets.GITHUB_TOKEN }}
+      bot_token: ${{ secrets.HF_STYLE_BOT_ACTION }}
```

docs/source/en/quantization/torchao.md

Lines changed: 3 additions & 0 deletions
```diff
@@ -65,6 +65,9 @@ transformer = torch.compile(transformer, mode="max-autotune", fullgraph=True)
 
 For speed and memory benchmarks on Flux and CogVideoX, please refer to the table [here](https://github.com/huggingface/diffusers/pull/10009#issue-2688781450). You can also find some torchao [benchmarks](https://github.com/pytorch/ao/tree/main/torchao/quantization#benchmarks) numbers for various hardware.
 
+> [!TIP]
+> The FP8 post-training quantization schemes in torchao are effective for GPUs with compute capability of at least 8.9 (RTX-4090, Hopper, etc.). FP8 often provides the best speed, memory, and quality trade-off when generating images and videos. We recommend combining FP8 and torch.compile if your GPU is compatible.
+
 torchao also supports an automatic quantization API through [autoquant](https://github.com/pytorch/ao/blob/main/torchao/quantization/README.md#autoquantization). Autoquantization determines the best quantization strategy applicable to a model by comparing the performance of each technique on chosen input types and shapes. Currently, this can be used directly on the underlying modeling components. Diffusers will also expose an autoquant configuration option in the future.
 
 The `TorchAoConfig` class accepts three parameters:
```
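As a rough sketch of the FP8 + torch.compile combination the doc tip recommends, the snippet below quantizes a Flux transformer with a torchao FP8 weight-only scheme and then compiles it. The model ID, the `"float8wo_e4m3"` quant type string, and the prompt are illustrative assumptions, not part of this diff; running it requires a GPU with compute capability ≥ 8.9 and the diffusers/torchao packages installed.

```python
# Illustrative sketch (not from the diff): FP8 weight-only quantization
# combined with torch.compile, per the tip in docs/source/en/quantization/torchao.md.
# Assumes a compute-capability >= 8.9 GPU (e.g. RTX 4090, Hopper).
import torch
from diffusers import FluxPipeline, FluxTransformer2DModel, TorchAoConfig

model_id = "black-forest-labs/FLUX.1-dev"  # assumed example model

# "float8wo_e4m3" selects torchao's FP8 (e4m3) weight-only scheme.
quantization_config = TorchAoConfig("float8wo_e4m3")
transformer = FluxTransformer2DModel.from_pretrained(
    model_id,
    subfolder="transformer",
    quantization_config=quantization_config,
    torch_dtype=torch.bfloat16,
)

pipe = FluxPipeline.from_pretrained(
    model_id, transformer=transformer, torch_dtype=torch.bfloat16
).to("cuda")

# Compile the quantized transformer, matching the doc's earlier example.
pipe.transformer = torch.compile(
    pipe.transformer, mode="max-autotune", fullgraph=True
)

image = pipe(
    "a photo of an astronaut riding a horse", num_inference_steps=28
).images[0]
```

Quantizing only the transformer keeps the text encoders and VAE in bf16, which is the trade-off the surrounding docs describe: most of the memory lives in the transformer, so that is where FP8 pays off.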
