Commit f922a9d ("update")
1 parent d839cd0

File tree: 1 file changed (+5 −1)


docs/source/en/quantization/overview.md

Lines changed: 5 additions & 1 deletion
```diff
@@ -29,7 +29,11 @@ There are two ways to use [`~quantizers.PipelineQuantizationConfig`] depending o
 Initialize [`~quantizers.PipelineQuantizationConfig`] with the following parameters.
 
 - `quant_backend` specifies which quantization backend to use. Currently supported backends include: `bitsandbytes_4bit`, `bitsandbytes_8bit`, `gguf`, `quanto`, and `torchao`.
-- `quant_kwargs` specifies the quantization arguments to use. These arguments are different for each backend. Refer to the [Quantization API](../api/quantization) docs to view the arguments for each backend.
+- `quant_kwargs` specifies the quantization arguments to use.
+
+> [!TIP]
+> These `quant_kwargs` arguments are different for each backend. Refer to the [Quantization API](../api/quantization) docs to view the arguments for each backend.
+
 - `components_to_quantize` specifies which components of the pipeline to quantize. Typically, you should quantize the most compute intensive components like the transformer. The text encoder is another component to consider quantizing if a pipeline has more than one such as [`FluxPipeline`]. The example below quantizes the T5 text encoder in [`FluxPipeline`] while keeping the CLIP model intact.
 
 The example below loads the bitsandbytes backend with the following arguments from [`~quantizers.quantization_config.BitsAndBytesConfig`], `load_in_4bit`, `bnb_4bit_quant_type`, and `bnb_4bit_compute_dtype`.
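As a companion to the docs text above, here is a minimal sketch of how the three parameters fit together. It assumes a recent diffusers release where `PipelineQuantizationConfig` is importable from `diffusers.quantizers`; the model ID and the component names (`transformer`, `text_encoder_2`) are illustrative and depend on the pipeline you load.

```python
import torch
from diffusers import DiffusionPipeline
from diffusers.quantizers import PipelineQuantizationConfig

# Quantize the compute-heavy components (transformer + T5 text encoder)
# with the bitsandbytes 4-bit backend; the CLIP text encoder stays intact.
pipeline_quant_config = PipelineQuantizationConfig(
    quant_backend="bitsandbytes_4bit",
    quant_kwargs={
        # These kwargs are backend-specific (here: BitsAndBytesConfig arguments).
        "load_in_4bit": True,
        "bnb_4bit_quant_type": "nf4",
        "bnb_4bit_compute_dtype": torch.bfloat16,
    },
    components_to_quantize=["transformer", "text_encoder_2"],
)

# Illustrative model ID; any pipeline that accepts quantization_config works.
pipe = DiffusionPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    quantization_config=pipeline_quant_config,
    torch_dtype=torch.bfloat16,
)
```

Note that `quant_kwargs` is a plain dict forwarded to the backend's config class, which is why its accepted keys differ per backend, as the tip in the diff points out.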
