Properly handle incomplete calibration for static quantization#2460

Open
wpietka wants to merge 6 commits into wpietkax/add-gemma-quantization-test from wpietkax/fix-incomplete-static-quantization
Conversation


@wpietka wpietka commented Apr 30, 2026

Type of Change

Bug fix

Description

Currently, when the calibration function for static quantization receives an incomplete input sample that does not activate all model layers during calibration, certain layers are effectively broken because their scales are set to 0.

This change detects whether quantization failed for a given layer and restores the original call() methods and weights, so the layer behaves exactly as if it had not been quantized at all.
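The recovery idea can be sketched as follows. This is an illustrative stand-in, not the actual neural_compressor implementation: the class, attribute names, and the zero-scale check are all hypothetical, but they show the pattern of keeping a handle to the original float path and restoring it when calibration never reached a layer.

```python
# Hypothetical sketch: if calibration never activates a layer, its observed
# scale stays 0.0, and quantizing with that scale would zero the outputs.
# Detect that case and restore the original call() and weights instead.

class QuantizedDense:
    """Minimal stand-in for a statically quantized layer (illustrative only)."""

    def __init__(self, weights):
        self.weights = weights        # original float weights
        self._orig_call = self.call   # keep a handle to the float call path
        self.scale = 0.0              # populated by the calibration observer

    def call(self, x):
        # Original (float) forward pass.
        return [w * x for w in self.weights]

    def calibrate(self, x):
        # A real observer would track max(|activation|); 0.0 means
        # "this layer was never activated during calibration".
        self.scale = max(self.scale, abs(x))

    def quantize(self):
        if self.scale == 0.0:
            # Calibration never reached this layer: recover the original
            # call() and keep the float weights untouched.
            self.call = self._orig_call
            return False
        # Otherwise replace weights and call() with the quantized versions.
        self.weights = [round(w / self.scale) for w in self.weights]
        self.call = lambda x: [w * self.scale * x for w in self.weights]
        return True
```

A layer whose `quantize()` returns `False` then behaves exactly as before quantization, which is the expected behavior this PR describes.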

Expected Behavior & Potential Risk

The output of the quantized model will be correct even if quantization fails for certain layers.

How has this PR been tested?

pytest test/jax/test_gemma3_model.py::test_static_quantization_with_incomplete_calibration

Dependency Change?

No dependencies changed

@wpietka wpietka force-pushed the wpietkax/add-gemma-quantization-test branch 2 times, most recently from 3385be8 to b07f4ff Compare May 4, 2026 13:40
@wpietka wpietka force-pushed the wpietkax/fix-incomplete-static-quantization branch 2 times, most recently from 36d5100 to 942cdf3 Compare May 5, 2026 08:20
@wpietka wpietka marked this pull request as ready for review May 5, 2026 08:28

@anko-intel anko-intel left a comment


I would like to avoid adding so many load_own_variables_preprocess calls.
Let's discuss it offline. We can probably postpone the post_quantization_cleanup actions into load_own_variables.

Comment thread test/jax/test_gemma3_model.py Outdated
Comment thread neural_compressor/jax/quantization/layers_static.py
@wpietka wpietka force-pushed the wpietkax/fix-incomplete-static-quantization branch 2 times, most recently from e1b1040 to 709bc31 Compare May 7, 2026 09:14
Signed-off-by: Wojciech Piętka <wojciechx.pietka@intel.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

@anko-intel anko-intel left a comment


It looks very good, thanks.

Comment thread neural_compressor/jax/quantization/layers_dynamic.py Outdated
wpietka and others added 5 commits May 8, 2026 04:09
Signed-off-by: Wojciech Piętka <wojciechx.pietka@intel.com>
Signed-off-by: Wojciech Piętka <wojciechx.pietka@intel.com>
Signed-off-by: Wojciech Piętka <wojciechx.pietka@intel.com>
Signed-off-by: Wojciech Piętka <wojciechx.pietka@intel.com>