Although FakeQuantizer implements the forward method for all three kinds of granularities, only per-group can be used for the linear layer weight in QAT. This is because when FakeQuantizedLinear's __init__ is called, it first tries to get the group size from the weight config:
ao/torchao/quantization/qat/linear.py, lines 84 to 94 at 2c901b3:
# initialize weight fake quantizer
if weight_config is not None:
    group_size = weight_config.group_size
    if group_size is not None and in_features % group_size != 0:
        raise ValueError(
            "in_features (%s) %% group_size (%s) must be == 0"
            % (in_features, group_size)
        )
    self.weight_fake_quantizer = FakeQuantizer(weight_config)
else:
    self.weight_fake_quantizer = None
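For example, constructing a FakeQuantizedLinear with a per-axis (per-channel) weight config fails at init time. A minimal repro sketch; the FakeQuantizeConfig signature and the PerAxis import path are assumptions based on torchao at 2c901b3 and may differ across versions:

import torch
from torchao.quantization.granularity import PerAxis
from torchao.quantization.qat.api import FakeQuantizeConfig
from torchao.quantization.qat.linear import FakeQuantizedLinear

# per-axis (per-output-channel) weight fake quantization config
weight_config = FakeQuantizeConfig(torch.int8, granularity=PerAxis(axis=0))

# raises ValueError: `group_size` is undefined for PerAxis(axis=0) granularity,
# even though FakeQuantizer's forward could handle per-axis
layer = FakeQuantizedLinear(256, 128, weight_config=weight_config)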
And if the weight config uses any granularity other than per-group, an exception is raised:
ao/torchao/quantization/qat/api.py, lines 216 to 226 at 2c901b3:
def group_size(self) -> int:
    """
    If this is per group granularity, return the group size.
    Otherwise, throw an error.
    """
    if isinstance(self.granularity, PerGroup):
        return self.granularity.group_size
    else:
        raise ValueError(
            "`group_size` is undefined for %s granularity" % self.granularity
        )
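The same error can be triggered directly on the config, under the same assumptions as the sketch above:

weight_config.group_size
# ValueError: `group_size` is undefined for PerAxis(axis=0) granularity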
An easy fix would be to check the granularity type before getting the group size:
# initialize weight fake quantizer
if weight_config is not None:
    if isinstance(weight_config.granularity, PerGroup):
        group_size = weight_config.group_size
        if group_size is not None and in_features % group_size != 0:
            raise ValueError(
                "in_features (%s) %% group_size (%s) must be == 0"
                % (in_features, group_size)
            )
    self.weight_fake_quantizer = FakeQuantizer(weight_config)
else:
    self.weight_fake_quantizer = None
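With this guard, the per-axis config from the sketch above would construct, and the forward pass would go through FakeQuantizer's existing per-axis path (note that PerGroup would also need to be imported in linear.py). Again a sketch under the same assumptions, not a tested patch:

layer = FakeQuantizedLinear(256, 128, weight_config=weight_config)
y = layer(torch.randn(4, 256))  # weight fake-quantized per output channel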