Enable CUTLASS grouped GEMM for pretraining wgrad on GB200 and H100 #4886
@jiawenliu64 has exported this pull request. If you are a Meta employee, you can view the originating diff in D82325651.
0fe2c3a to 7d30e87
…ytorch#4886)
Summary:
Pull Request resolved: pytorch#4886
X-link: facebookresearch/FBGEMM#1910
- Enable CUTLASS grouped GEMM for llama4x pretraining wgrad on GB200 and H100
- Optimize performance of pretraining moe shapes on H100
- Support total_K in quantize_bench for wgrad
Differential Revision: D82325651
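To make the wgrad and total_K items in the summary concrete, here is a minimal pure-Python reference sketch of a grouped GEMM for weight gradients. It assumes that total_K is the reduction (token) dimension shared by all groups and that each group owns a contiguous run of rows. The function name `grouped_wgrad_reference` and its arguments are hypothetical. They are not FBGEMM or CUTLASS APIs, and the sketch says nothing about how the actual kernels are implemented.

```python
def grouped_wgrad_reference(grad_out, x, k_sizes):
    """Reference grouped GEMM for weight gradients (illustrative only).

    grad_out: list of total_K rows, each of length N (output gradients).
    x:        list of total_K rows, each of length M (layer inputs).
    k_sizes:  rows owned by each group; must sum to total_K (assumed layout).

    Returns one N x M matrix per group: dW[g] = grad_out[rows_g]^T @ x[rows_g].
    """
    total_k = len(grad_out)
    if total_k == 0 or len(x) != total_k or sum(k_sizes) != total_k:
        raise ValueError("k_sizes must partition a non-empty total_K dimension")

    n = len(grad_out[0])
    m = len(x[0])
    results = []
    start = 0
    for k in k_sizes:
        # Accumulate outer products over this group's slice of the K dimension.
        dw = [[0.0] * m for _ in range(n)]
        for row in range(start, start + k):
            for i in range(n):
                for j in range(m):
                    dw[i][j] += grad_out[row][i] * x[row][j]
        results.append(dw)
        start += k
    return results


# Two groups sharing total_K = 3 rows: the first owns 2 rows, the second owns 1.
grad_out = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
dw = grouped_wgrad_reference(grad_out, x, [2, 1])
```

Under this layout, a benchmark only needs the total row count and the per-group split to describe a wgrad problem, which is one plausible reason for a total_K option.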
7d30e87 to cc5b529
…ytorch#4886)
Summary:
Pull Request resolved: pytorch#4886
X-link: facebookresearch/FBGEMM#1910
- Enable CUTLASS grouped GEMM for llama4x pretraining wgrad on GB200 and H100
- Optimize performance of pretraining moe shapes on H100
- Support total_K in quantize_bench for wgrad
Reviewed By: q10
Differential Revision: D82325651
cc5b529 to b5c9a1a
This pull request has been merged in 57c2293.
This pull request has been reverted by 53f9e51.
Summary:
X-link: https://github.com/facebookresearch/FBGEMM/pull/1910
Differential Revision: D82325651