
Commit 2432cf4

Author: Sara Adkins
Commit message: address PR comments
Parent: 579d201

File tree: 3 files changed, +18 -144 lines


src/sparseml/modifiers/quantization_vllm/base.py

Lines changed: 2 additions & 2 deletions
@@ -35,9 +35,9 @@ class vLLMQuantizationModifier(Modifier):
     modifier will be enabled until training is completed.
 
     :param config_groups: dictionary specifying quantization schemes to apply to target
-        modules. Modules not matching a scheme target will NOT be quantized.
+        modules. Modules not matching a scheme target will NOT be quantized.
     :param ignore: optional list of module class names or submodule names to not
-        quantize even if they match a target in config_groups. Defaults to empty list.
+        quantize even if they match a target in config_groups. Defaults to empty list.
     :param disable_quantization_observer_epoch: Epoch to disable updates to the module
         quantization observers. At this point, quantized weights and zero points will
         not be updated. Leave None to not disable observers during QAT. Default is None
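
The docstring above is the modifier's public interface: config_groups selects what gets quantized, ignore carves out exceptions, and disable_quantization_observer_epoch controls when observers freeze during QAT. As a rough illustration only, a minimal construction sketch follows; the group name, targets, and weight settings inside config_groups are hypothetical placeholders rather than values from this commit, and whether plain dicts are accepted for the scheme values depends on the underlying schema.

    # Hypothetical sketch: wiring the documented fields together.
    # "group_0" and its scheme keys are illustrative assumptions.
    from sparseml.modifiers.quantization_vllm.base import vLLMQuantizationModifier

    modifier = vLLMQuantizationModifier(
        config_groups={
            "group_0": {
                "targets": ["Linear"],       # quantize matching Linear submodules
                "weights": {"num_bits": 8},  # assumed per-group weight settings
            },
        },
        ignore=["lm_head"],                      # skip even if it matches a target
        disable_quantization_observer_epoch=2.0, # freeze observers after epoch 2 (QAT)
    )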

src/sparseml/modifiers/quantization_vllm/pytorch.py

Lines changed: 16 additions & 11 deletions
@@ -32,17 +32,22 @@
 
 class vLLMQuantizationModifierPyTorch(vLLMQuantizationModifier):
     """
-    Pytorch-specific implementation of quantization modifier
-
-    :param scheme: Default QuantizationScheme to use when enabling quantization
-        in a module. May also be a dictionary to be loaded into the QuantizationScheme
-        class. A string alias may also be used, supported aliases:
-        ['default', 'deepsparse', 'tensorrt'].
-        If None, the default scheme (`QuantizationScheme()`) will be used.
-        Default is None
-    :param scheme_overrides: optional mapping of module type names or submodule type
-        names to quantization schemes to override them with. If a scheme is mapped to
-        'default', then it will use the scheme set in the modifier scheme property
+    PyTorch specific implementation of vLLMQuantizationModifier
+
+    Enables post training quantization (PTQ) and quantization aware training (QAT) for a
+    given module or its submodules. After calibration (PTQ) or the start epoch (QAT),
+    the specified module(s) forward pass will emulate quantized execution and the
+    modifier will be enabled until training is completed.
+
+    :param config_groups: dictionary specifying quantization schemes to apply to target
+        modules. Modules not matching a scheme target will NOT be quantized.
+    :param ignore: optional list of module class names or submodule names to not
+        quantize even if they match a target in config_groups. Defaults to empty list.
+    :param disable_quantization_observer_epoch: Epoch to disable updates to the module
+        quantization observers. At this point, quantized weights and zero points will
+        not be updated. Leave None to not disable observers during QAT. Default is None
+    :param num_calibration_steps: Number of steps to run post training calibration for.
+        When None, the entire calibration_dataloader is used
     """
 
     calibration_dataloader_: Any = None
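
The newly documented num_calibration_steps field caps how much of the calibration data is consumed during post training calibration. The sketch below is not the modifier's implementation; it only illustrates the documented semantics under the assumption that calibration is a sequence of no-grad forward passes feeding the observers, with None meaning the entire dataloader is used.

    # Illustrative only: the helper name and loop structure are assumptions,
    # not code from this commit. Batches are assumed to be dicts of tensors.
    import itertools

    import torch


    def run_calibration(model, calibration_dataloader, num_calibration_steps=None):
        # None -> consume the whole dataloader, mirroring the docstring above
        batches = (
            calibration_dataloader
            if num_calibration_steps is None
            else itertools.islice(calibration_dataloader, num_calibration_steps)
        )
        model.eval()
        with torch.no_grad():
            for batch in batches:
                model(**batch)  # each forward pass updates the quantization observers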

tests/sparseml/transformers/compression/test_compress_tensor_utils.py

Lines changed: 0 additions & 131 deletions
This file was deleted.
