geng-meta
Contributor

Summary:
X-link: facebookresearch/FBGEMM#1930

While reading `quantize.py`, I noticed that the docstring of `int4_row_quantize()` appears to be inaccurate.

The `int4_row_quantize()` function returns `wq` with shape `[N, K]`, not `[N, K // 2]`:

```
# Line 101: Concatenate chunks back to original K dimension
out = torch.cat(out, dim=-1)

# Line 104: Convert to int8 dtype
out = out.to(dtype=torch.int8)
```

So `wq` has shape `[N, K]` and is stored as `int8`, with each `int8` element holding a single int4 value. The actual packing can then be done afterwards via:

https://www.internalfb.com/code/fbsource/[c12bfdd174f7897f4615f98b630f2ac612c471fa]/fbcode/deeplearning/fbgemm/fbgemm_gpu/experimental/gen_ai/gen_ai/quantize.py?lines=18
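
To make the shape difference concrete, here is a minimal pure-Python sketch of what packing two int4 values into one byte looks like. It is not the FBGEMM implementation: the names `pack_int4_rows` and `unpack_int4_rows` are hypothetical, nested lists stand in for tensors, and the nibble order (even column in the low 4 bits, odd column in the high 4 bits) is an assumption that may differ from the linked helper.

```python
def pack_int4_rows(wq):
    """Pack an [N, K] matrix holding one int4 value per element into [N, K // 2] bytes.

    Hypothetical helper for illustration only.
    """
    packed = []
    for row in wq:
        if len(row) % 2 != 0:
            raise ValueError("K must be even to pack two int4 values per byte")
        if any(not -8 <= v <= 7 for v in row):
            raise ValueError("values must fit in signed int4, i.e. [-8, 7]")
        # Assumed layout: even column -> low nibble, odd column -> high nibble.
        packed.append(
            [(lo & 0xF) | ((hi & 0xF) << 4) for lo, hi in zip(row[0::2], row[1::2])]
        )
    return packed


def unpack_int4_rows(packed):
    """Inverse of pack_int4_rows: recover the [N, K] signed int4 values."""

    def sign_extend(nibble):
        # Nibbles 8..15 encode the negative values -8..-1 in two's complement.
        return nibble - 16 if nibble >= 8 else nibble

    out = []
    for row in packed:
        unpacked = []
        for byte in row:
            unpacked.append(sign_extend(byte & 0xF))
            unpacked.append(sign_extend(byte >> 4))
        out.append(unpacked)
    return out


# Unpacked form: N = 2 rows, K = 4 columns, one int4 value per element.
wq = [[-8, 7, 0, 1], [3, -1, -2, 5]]
packed = pack_int4_rows(wq)
restored = unpack_int4_rows(packed)

print(len(wq[0]), "->", len(packed[0]))  # K -> K // 2
```

Under these assumptions, the unpacked `[N, K]` layout corresponds to what `int4_row_quantize()` returns, and `[N, K // 2]` only appears once a separate packing step has run.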

Differential Revision: D82919480

Privacy Context Container: 151967047006994

netlify bot commented Sep 21, 2025

Deploy Preview for pytorch-fbgemm-docs ready!

| Name | Link |
|---|---|
| 🔨 Latest commit | d5c9a11 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/pytorch-fbgemm-docs/deploys/68d0421a2731bb00087d3ede |
| 😎 Deploy Preview | https://deploy-preview-4904--pytorch-fbgemm-docs.netlify.app |

meta-cla bot added the `cla signed` label Sep 21, 2025
@facebook-github-bot
Contributor

@geng-meta has exported this pull request. If you are a Meta employee, you can view the originating diff in D82919480.

@facebook-github-bot
Contributor

This pull request has been merged in 8ec3635.
