Pull requests: pytorch/FBGEMM
Pull requests list
Add Static Dispatch Kernels (#4927)
cla signed
fb-exported
meta-exported
#4936
opened Sep 25, 2025 by
kqfu
Back out "Update to use Python 3.9 syntax"
cla signed
fb-exported
meta-exported
#4928
opened Sep 24, 2025 by
q10
forward performance tuning for MI350
cla signed
module: rocm
#4925
opened Sep 24, 2025 by
liligwu
Add Inference Feature to Skip Pinned Memory Creation (#1948)
cla signed
fb-exported
meta-exported
#4924
opened Sep 23, 2025 by
Jason-KChen
more hipify v2 fixes (#4854)
cla signed
fb-exported
meta-exported
module: rocm
#4921
opened Sep 23, 2025 by
q10
Support bf16 in blackwell cutlass decode attention kernel
cla signed
fb-exported
meta-exported
#4916
opened Sep 23, 2025 by
Aya-ZIbra
Resolve wgrad grouped gemm relocation issue in fbgemm
cla signed
fb-exported
meta-exported
#4915
opened Sep 23, 2025 by
jiawenliu64
avx512 based int8 -> bf16 dequantization
cla signed
fb-exported
meta-exported
#4912
opened Sep 22, 2025 by
seanx92
upgrade cutlass to 4.2.0 for fbcode
cla signed
fb-exported
meta-exported
#4893
opened Sep 18, 2025 by
henrylhtsang
- Clean torch.check
cla signed
fb-exported
meta-exported
#4871
opened Sep 12, 2025 by
flaviotruzzi
dequantize_fp8_cache_kernel: Move D=128 device-side-assertion check to host
cla signed
fb-exported
meta-exported
#4869
opened Sep 12, 2025 by
ColinPeppler
symmetric quantization to FBGEMM prefill token-wise FP8 (fixed)
cla signed
fb-exported
meta-exported
#4868
opened Sep 12, 2025 by
ColinPeppler
- Reland D75563906
ci-no-td
cla signed
fb-exported
meta-exported
#4865
opened Sep 11, 2025 by
flaviotruzzi
Migrate GenAI quantize kernels to FBGEMM_LAUNCH_KERNEL, pt 4
cla signed
fb-exported
#4863
opened Sep 11, 2025 by
q10
Add cutlass decode kernel to TritonBench
cla signed
fb-exported
meta-exported
#4853
opened Sep 10, 2025 by
Aya-ZIbra
remove std out in EEG estimator
cla signed
fb-exported
#4832
opened Sep 6, 2025 by
YanXiong-Meta
convert batch size to float before torch.std in params reporter
cla signed
fb-exported
#4828
opened Sep 5, 2025 by
YanXiong-Meta
Migrate backward warp kernel arguments to use PTA_B
cla signed
fb-exported
#4825
opened Sep 5, 2025 by
q10
Migrate TBE UVM cache kernels to FBGEMM_LAUNCH_KERNEL
cla signed
fb-exported
#4817
opened Sep 4, 2025 by
q10
remove cpu check and hardcoded row alignment to 8
cla signed
fb-exported
#4781
opened Aug 27, 2025 by
chenyuzhcy