FP4 grouped refactor by cthi · Pull Request #4934 · pytorch/FBGEMM

cthi · 2025-09-24T17:54:55Z

Summary:
X-link: https://github.com/facebookresearch/FBGEMM/pull/1957

Split some clean-up/refactors from the core FP4 Torch API support to make the next diff more focused.

- Removed `zero_start_index_M`, which was unused
- Stopped passing `G` into the kernel directly, since it can be inferred
- Renamed `ElementComputeEpilogue` to `ElementScale`
- Added `namespace fbgemm_gpu` in `f4f4bf16_grouped_common.cuh`
- Removed `num_x_scale_per_group` and `num_w_scale_per_group`, both of which were unused
- Removed unnecessary CUTLASS headers in `f4f4bf16_grouped.cu`
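
The description does not say how `G` is inferred, so the following is only a rough sketch of the idea, not the code in this PR. It assumes the grouped GEMM receives one activation and one weight per group, so the group count is just the length of those lists. The helper `infer_group_count` and the shape tuples are hypothetical names introduced for illustration.

```python
from typing import Sequence, Tuple

Shape = Tuple[int, int]  # (rows, K) of one 2-D operand


def infer_group_count(x_shapes: Sequence[Shape], w_shapes: Sequence[Shape]) -> int:
    """Hypothetical helper: derive G from the per-group operand lists."""
    if len(x_shapes) != len(w_shapes):
        raise ValueError("expected one weight per activation group")
    for g, ((_, k_x), (_, k_w)) in enumerate(zip(x_shapes, w_shapes)):
        # Each group's activation and weight must share the reduction dim K.
        if k_x != k_w:
            raise ValueError(f"group {g}: K mismatch ({k_x} vs {k_w})")
    # G is the number of groups supplied, so callers need not pass it.
    return len(x_shapes)


# Two groups (assumed layout): X_g is (M_g, K), W_g is (N_g, K).
x_shapes = [(4, 64), (8, 64)]
w_shapes = [(16, 64), (32, 64)]
G = infer_group_count(x_shapes, w_shapes)
```

Dropping a redundant argument like this removes one way for the caller and the kernel to disagree about the problem size.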

Differential Revision: D83166227

netlify · 2025-09-24T17:55:00Z

✅ Deploy Preview for pytorch-fbgemm-docs ready!

Name	Link
🔨 Latest commit	`553b40e`
🔍 Latest deploy log	https://app.netlify.com/projects/pytorch-fbgemm-docs/deploys/68d4307293d0e60008ef1fa7
😎 Deploy Preview	https://deploy-preview-4934--pytorch-fbgemm-docs.netlify.app

facebook-github-bot · 2025-09-24T17:55:08Z

@cthi has exported this pull request. If you are a Meta employee, you can view the originating diff in D83166227.

facebook-github-bot · 2025-09-25T14:27:05Z

This pull request has been merged in d9b6e01.

cthi added 3 commits September 24, 2025 10:54

Fix typo in FP4 quantize (pytorch#4933)

4cdb116

Summary: X-link: facebookresearch/FBGEMM#1956 as title Differential Revision: D83083612

Split grouped gemm metadata kernel into grouped_common.cuh (pytorch#4932)

cf2dc81

Summary: X-link: facebookresearch/FBGEMM#1955 We would reuse this kernel as a base to add support for NV/MX FP4. As a first step, shuffle it into its own file. Differential Revision: D83151150

meta-cla bot added the cla signed label Sep 24, 2025

facebook-github-bot added fb-exported meta-exported labels Sep 24, 2025

facebook-github-bot closed this in d9b6e01 Sep 25, 2025

facebook-github-bot added the Merged label Sep 25, 2025

cthi wants to merge 3 commits into pytorch:main from cthi:export-D83166227
