Optimize wgrad CUTLASS grouped gemm #4891
@jiawenliu64 has exported this pull request. If you are a Meta employee, you can view the originating diff in D82700455. |
Summary:
Pull Request resolved: #4891
X-link: facebookresearch/FBGEMM#1916
- Make wgrad CUTLASS grouped gemm return float32 output when wgrad is provided, respecting e2e
- Optimize general heuristic
- Make tests cover wgrad accum with float32 output
Reviewed By: q10
Differential Revision: D82700455
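For readers unfamiliar with the first item in the summary, here is a minimal pure-Python sketch of why accumulating wgrad into a float32 buffer matters. This is not FBGEMM's kernel or its API: `grouped_wgrad`, `round_bf16`, `round_f32`, and the argument layout are hypothetical names made up for illustration, and the assumption that the no-buffer path returns bfloat16 is mine rather than something stated in this PR.

```python
import struct


def round_f32(v):
    """Round a Python float to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", v))[0]


def round_bf16(v):
    """Emulate bfloat16 rounding (round-to-nearest-even) for finite values."""
    bits = struct.unpack("<I", struct.pack("<f", v))[0]
    bits = (bits + 0x7FFF + ((bits >> 16) & 1)) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def grouped_wgrad(dy, x, m_sizes, wgrad=None):
    """Reference grouped wgrad: for each group g, compute dy_g^T @ x_g.

    dy is (total_M x N) and x is (total_M x K), both as lists of rows;
    m_sizes gives the number of rows owned by each group.
    Hypothetical behaviour for this sketch: with no buffer the result is
    rounded to bfloat16; with a provided `wgrad` buffer the result is
    accumulated into it and kept in float32.
    """
    n, k = len(dy[0]), len(x[0])
    out, start = [], 0
    for g, m in enumerate(m_sizes):
        rows = range(start, start + m)
        block = [[sum(dy[r][i] * x[r][j] for r in rows) for j in range(k)]
                 for i in range(n)]
        if wgrad is None:
            block = [[round_bf16(v) for v in row] for row in block]
        else:
            block = [[round_f32(wgrad[g][i][j] + block[i][j]) for j in range(k)]
                     for i in range(n)]
        out.append(block)
        start += m
    return out


# Two groups with 2 rows and 1 row respectively (N = K = 1 for brevity).
dy = [[1.0], [1.0], [2.0]]
x = [[0.001], [0.001], [3.0]]
m_sizes = [2, 1]

fresh = grouped_wgrad(dy, x, m_sizes)                       # no buffer
acc = grouped_wgrad(dy, x, m_sizes, [[[256.0]], [[0.0]]])   # float32 buffer
```

In this example the first group contributes an update of about 0.002 to an accumulated value of 256.0. The float32 result stays above 256.0, whereas rounding the same sum to bfloat16 gives exactly 256.0, because adjacent bfloat16 values near 256 are 2 apart. That is the kind of lost update a float32 accumulation output avoids.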
…resubmit) Differential Revision: D83001505
Differential Revision: D82700396
This pull request has been merged in ddada9e. |
Summary:
X-link: https://github.com/facebookresearch/FBGEMM/pull/1916
Differential Revision: D82700455