[BUG] AWQ quantization fails for GLM-4.5-Air #2114

Description

@avtc
  1. It looks like AWQ does not honor `layer_modules_strict=False` when certain modules are not present on every layer (a guard sketch follows the stacktrace).

Stacktrace:

```
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/gptqmodel/looper/module_looper.py", line 1156, in loop
    return self._loop_impl(fail_safe=fail_safe, **kwargs)
           ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/gptqmodel/looper/module_looper.py", line 1301, in _loop_impl
    processor.layer_quantize(module, cur_layer_device, named_childs)
    ~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/gptqmodel/looper/awq_processor.py", line 346, in layer_quantize
    module_config: List[Dict] = self.gptq_model.awq_get_modules_for_scaling(
                                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        module, input_feat, self.module_kwargs
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/gptqmodel/models/base.py", line 1400, in awq_get_modules_for_scaling
    inp = input_feat[block[0]]
          ~~~~~~~~~~^^^^^^^^^^
KeyError: 'mlp.shared_experts.gate_proj'
```
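The crash comes from the unconditional `input_feat[block[0]]` lookup in `awq_get_modules_for_scaling`. Below is a minimal sketch of the kind of guard that would avoid it, written as a standalone helper; the names `scaling_blocks` and `get_modules_for_scaling` are illustrative, not GPTQModel's actual API:

```python
from typing import Dict, List

import torch


def get_modules_for_scaling(
    scaling_blocks: List[List[str]],
    input_feat: Dict[str, torch.Tensor],
) -> List[Dict]:
    # Tolerant variant of the failing lookup: skip any scaling block whose
    # leading module has no captured activations, instead of indexing
    # input_feat unconditionally (the line that raises KeyError above).
    module_config = []
    for block in scaling_blocks:
        name = block[0]
        if name not in input_feat:
            # Module is absent on this layer (layer_modules_strict=False)
            # or was excluded from quantization -- nothing was hooked.
            continue
        module_config.append({"names": block, "inp": input_feat[name]})
    return module_config
```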
  2. It also does not handle dynamic exclusions well (when certain modules are excluded from quantization); for example, I excluded `self_attn` (a hypothetical reproduction sketch follows the stacktrace).

Stacktrace:

```
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/gptqmodel/models/base.py", line 1021, in quantize
    result = module_looper.loop(
        backend=backend,
        fail_safe=self.quantize_config.fail_safe,
    )
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/gptqmodel/looper/module_looper.py", line 1156, in loop
    return self._loop_impl(fail_safe=fail_safe, **kwargs)
           ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/gptqmodel/looper/module_looper.py", line 1301, in _loop_impl
    processor.layer_quantize(module, cur_layer_device, named_childs)
    ~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/gptqmodel/looper/awq_processor.py", line 346, in layer_quantize
    module_config: List[Dict] = self.gptq_model.awq_get_modules_for_scaling(
                                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        module, input_feat, self.module_kwargs
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/home/ubuntu/venvs/gptqmodelt/lib/python3.13t/site-packages/gptqmodel/models/base.py", line 1400, in awq_get_modules_for_scaling
    inp = input_feat[block[0]]
          ~~~~~~~~~~^^^^^^^^^^
KeyError: 'self_attn.q_proj'
```
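For reference, a dynamic exclusion along these lines triggers the second KeyError (same unconditional lookup as above). The model id, the `-:`-prefixed exclusion pattern, and the AWQ method selector below are assumptions for illustration, not the reporter's exact config:

```python
from gptqmodel import GPTQModel, QuantizeConfig

# Assumed reproduction: a negative ("-:") dynamic rule excluding the
# self-attention projections from quantization.
quant_config = QuantizeConfig(
    bits=4,
    group_size=128,
    quant_method="awq",                    # assumed AWQ selector
    dynamic={r"-:.*\.self_attn\..*": {}},  # hypothetical exclusion pattern
)

model = GPTQModel.load("zai-org/GLM-4.5-Air", quant_config)
model.quantize(["example calibration text"])  # fails with KeyError: 'self_attn.q_proj'
```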
