[bc-breaking] Generalize FakeQuantizeConfig beyond intx #2628
Conversation
**Summary:** The existing `FakeQuantizeConfig` performs only intx quantization, but we plan to extend QAT to other dtypes such as fp8 and nvfp4 in the near future. This is the necessary refactor before that. Specifically:

```
# New abstract class FakeQuantizeConfigBase
# Rename FakeQuantizeConfig -> IntxFakeQuantizeConfig
```

In the future, we will have other types of `FakeQuantizeConfigBase` for float dtypes that users can pass in instead of the existing intx one.

**BC-breaking notes:** For BC, we keep the old names around as references to the new ones. However, this commit is still BC-breaking in the sense that a few APIs now accept the abstract `FakeQuantizeConfigBase` instead. For the most part, this abstract class will be hidden from the user.

Before:

```
activation_config = FakeQuantizeConfig(torch.int8, "per_token", is_symmetric=False)
weight_config = FakeQuantizeConfig(torch.int4, group_size=32)
```

After:

```
activation_config = IntxFakeQuantizeConfig(torch.int8, "per_token", is_symmetric=False)
weight_config = IntxFakeQuantizeConfig(torch.int4, group_size=32)
```

**Test Plan:**

```
python test/quantization/test_qat.py
```
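The abstract-base refactor described above can be sketched roughly as follows. This is only an illustration of the pattern: the field names and class bodies are hypothetical, and strings stand in for `torch.dtype` to keep the sketch dependency-free.

```python
from abc import ABC
from dataclasses import dataclass
from typing import Optional


class FakeQuantizeConfigBase(ABC):
    """Abstract base for all fake-quantize configs (intx, fp8, nvfp4, ...)."""


@dataclass
class IntxFakeQuantizeConfig(FakeQuantizeConfigBase):
    """Integer fake-quantize config; fields here are illustrative only."""
    dtype: str  # stands in for torch.dtype in this sketch
    granularity: str = "per_channel"
    group_size: Optional[int] = None
    is_symmetric: bool = True


# Old name kept as an alias for backward compatibility
FakeQuantizeConfig = IntxFakeQuantizeConfig


def quantize_with(config: FakeQuantizeConfigBase) -> str:
    # APIs accept the abstract base, so future float configs also fit
    return f"fake-quantizing with {type(config).__name__}"


weight_config = FakeQuantizeConfig("int4", group_size=32)
print(quantize_with(weight_config))
```

Because the alias points at the subclass, old call sites keep working while new APIs can be typed against `FakeQuantizeConfigBase`.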
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2628. Note: links to docs will display an error until the docs builds have completed. ❌ 2 new failures as of commit 8245cee with merge base 2f8fd69.

This comment was automatically generated by Dr. CI and updates every 15 minutes.
Just to confirm: we are renaming `FakeQuantizeConfig` to `IntxFakeQuantizeConfig`, but we are also keeping the `FakeQuantizeConfig` name around as an alias for `IntxFakeQuantizeConfig`? And then in two releases we will remove it?
Yes, keeping it around.
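One common way to keep a deprecated name around for a couple of releases is an alias class that constructs the new type but emits a `DeprecationWarning`. This is only a sketch of the pattern, with hypothetical class bodies, not necessarily how torchao wires it up.

```python
import warnings


class IntxFakeQuantizeConfig:
    """The new name (sketch); fields are illustrative only."""
    def __init__(self, dtype, granularity="per_channel"):
        self.dtype = dtype
        self.granularity = granularity


def _make_deprecated_alias(new_cls, old_name):
    """Subclass the new class so the old name still constructs it, with a warning."""
    class _Alias(new_cls):
        def __init__(self, *args, **kwargs):
            warnings.warn(
                f"'{old_name}' is deprecated, use '{new_cls.__name__}' instead",
                DeprecationWarning,
                stacklevel=2,
            )
            super().__init__(*args, **kwargs)
    _Alias.__name__ = old_name
    return _Alias


# Old name kept around, slated for removal in a future release
FakeQuantizeConfig = _make_deprecated_alias(IntxFakeQuantizeConfig, "FakeQuantizeConfig")

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    config = FakeQuantizeConfig("int8")
```

Since the alias subclasses the new class, `isinstance` checks against `IntxFakeQuantizeConfig` still pass for old call sites.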
```
@dataclass
```
I don't think you need this `dataclass` decorator here.
```
def __init__(
    self,
    dtype: Union[torch.dtype, TorchAODType],
```
I wonder if you can type this as `Literal[...]` so that it only allows int inputs.
We actually allow a lot of dtypes, like all of int2-8 and uint2-8; that will be too verbose I think.
Just define an `AllowedTypes` alias above and use it.
Hmm, I just tried it but didn't really like it. I think I prefer a simpler signature, like just `torch.dtype` (we can drop `TorchAODType` soon; it is only needed for PyTorch 2.5 and before), and do the validation in `__init__`.
```
self.eps = eps

# Validate dtype
all_dtypes = [torch.int8, torch.uint8]
```
similar to this
**Summary:** Similar to #2628, but for `FakeQuantizer`. It is cleaner to isolate the logic of each quantizer in separate classes, e.g. intx vs nvfp4 vs fp8. Naming change:

```
FakeQuantizer -> IntxFakeQuantizer
```

**BC-breaking notes:** This is technically not BC-breaking yet, since we are just deprecating the old APIs while keeping them around. It will be when we remove the old APIs in the future according to #2630.

Before:

```
config = IntxFakeQuantizeConfig(torch.int8, "per_channel")
FakeQuantizer(config)
```

After:

```
config = IntxFakeQuantizeConfig(torch.int8, "per_channel")
IntxFakeQuantizer(config)
# or FakeQuantizerBase.from_config(config)
```

**Test Plan:**

```
python test/quantization/test_qat.py
```
ghstack-source-id: 3867fab
Pull Request resolved: #2714
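The `FakeQuantizerBase.from_config(...)` construction mentioned above suggests a factory that dispatches on the config type, so each quantizer's logic stays in its own class. A minimal sketch of that pattern, with hypothetical class bodies:

```python
class FakeQuantizerBase:
    """Abstract base for fake quantizers (sketch)."""

    @staticmethod
    def from_config(config):
        # Dispatch on the config type; a real implementation might
        # use a registry instead of isinstance chains
        if isinstance(config, IntxFakeQuantizeConfig):
            return IntxFakeQuantizer(config)
        raise ValueError(f"Unknown config type: {type(config).__name__}")


class IntxFakeQuantizeConfig:
    """Config sketch; strings stand in for torch.dtype."""
    def __init__(self, dtype, granularity="per_channel"):
        self.dtype = dtype
        self.granularity = granularity


class IntxFakeQuantizer(FakeQuantizerBase):
    """Quantizer sketch; the intx-specific logic would live here."""
    def __init__(self, config):
        self.config = config


config = IntxFakeQuantizeConfig("int8", "per_channel")
quantizer = FakeQuantizerBase.from_config(config)
```

Adding an fp8 or nvfp4 quantizer then only requires a new config/quantizer pair and one more dispatch branch, without touching the intx path.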