- Looks like AWQ does not honor the `layer_modules_strict=False` flag when certain modules are not present on every layer (a sketch of the missing guard follows the trace below).
Stacktrace:
```
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/gptqmodel/looper/module_looper.py", line 1156, in loop
    return self._loop_impl(fail_safe=fail_safe, **kwargs)
           ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/gptqmodel/looper/module_looper.py", line 1301, in _loop_impl
    processor.layer_quantize(module, cur_layer_device, named_childs)
    ~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/gptqmodel/looper/awq_processor.py", line 346, in layer_quantize
    module_config: List[Dict] = self.gptq_model.awq_get_modules_for_scaling(
                                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        module, input_feat, self.module_kwargs
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/gptqmodel/models/base.py", line 1400, in awq_get_modules_for_scaling
    inp = input_feat[block[0]]
          ~~~~~~~~~~^^^^^^^^^^
KeyError: 'mlp.shared_experts.gate_proj'
```
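
The failing frame is the unconditional dict lookup `input_feat[block[0]]` in `awq_get_modules_for_scaling`. Below is a minimal sketch, not GPTQModel's actual code, of the kind of guard that would honor `layer_modules_strict=False`: skip a scaling group whose lead module captured no activations on this layer instead of indexing `input_feat` directly. The function name, the `blocks` parameter, and the return shape are hypothetical, for illustration only.

```python
from typing import Any, Dict, List


def get_modules_for_scaling_sketch(
    blocks: List[List[str]],
    input_feat: Dict[str, Any],
    strict: bool = False,
) -> List[Dict[str, Any]]:
    """Sketch only: `blocks` holds the module names of each AWQ scaling group."""
    configs: List[Dict[str, Any]] = []
    for block in blocks:
        if block[0] not in input_feat:
            if strict:
                # layer_modules_strict=True: a missing module is a real error
                raise KeyError(block[0])
            # layer_modules_strict=False: e.g. `mlp.shared_experts.gate_proj`
            # exists only on some layers of an MoE model -- skip the group
            continue
        configs.append({"modules": block, "inp": input_feat[block[0]]})
    return configs
```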
- It also does not handle `dynamic` exclusions well (when certain modules are excluded from quantization). For example, I have excluded `self_attn` (a repro sketch follows the trace below):
File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/gptqmodel/models/base.py", line 1021, in quantize
result = module_looper.loop(
backend=backend,
fail_safe=self.quantize_config.fail_safe,
)
File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/gptqmodel/looper/module_looper.py", line 1156, in loop
return self._loop_impl(fail_safe=fail_safe, **kwargs)
~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
return func(*args, **kwargs)
File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/gptqmodel/looper/module_looper.py", line 1301, in _loop_impl
processor.layer_quantize(module, cur_layer_device, named_childs)
~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/gptqmodel/looper/awq_processor.py", line 346, in layer_quantize
module_config: List[Dict] = self.gptq_model.awq_get_modules_for_scaling(
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
module, input_feat, self.module_kwargs
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/gptqmodel/models/base.py", line 1400, in awq_get_modules_for_scaling
inp = input_feat[block[0]]
~~~~~~~~~~^^^^^^^^^^
KeyError: 'self_attn.q_proj'
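
A minimal repro sketch of a config that should trigger this second trace, assuming GPTQModel's documented `dynamic` override syntax (a `-:`-prefixed regex excludes matching modules from quantization). The model id and calibration data are placeholders, not taken from the issue, and whatever selects the AWQ quantization path is omitted here:

```python
from gptqmodel import GPTQModel, QuantizeConfig

quant_config = QuantizeConfig(
    bits=4,
    group_size=128,
    # `-:` excludes matching modules from quantization; with this set, the
    # AWQ scaling pass should skip self_attn instead of looking up its
    # (never captured) activations in input_feat
    dynamic={r"-:.*self_attn.*": {}},
)

model = GPTQModel.load("org/some-moe-model", quant_config)  # placeholder model id
calibration = ["placeholder calibration text"]  # placeholder data
model.quantize(calibration)  # currently raises KeyError: 'self_attn.q_proj'
```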