
Enable CUTLASS grouped GEMM for pretraining wgrad on GB200 and H100 (resubmit) #15407

Triggered via pull request September 23, 2025 16:18
Status Cancelled
Total duration 1h 30m 40s
Artifacts 59

fbgemm_gpu_ci_cuda.yml

on: pull_request
generate-build-matrix / generate_ci_matrix (9s)
Matrix: build / build_artifact
generate-test-matrix / generate_ci_matrix
Matrix: test / test_and_publish_artifact
Waiting for pending jobs

Annotations

17 errors
Each of the jobs listed below reported the same error:
The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.

build / build_artifact (3.10, x86, linux.12xlarge.memory, genai, gcc, 12.8.1)
build / build_artifact (3.12, x86, linux.24xlarge, default, clang, 12.8.1)
build / build_artifact (3.10, x86, linux.24xlarge, default, clang, 12.9.1)
build / build_artifact (3.13, x86, linux.12xlarge.memory, genai, gcc, 12.8.1)
build / build_artifact (3.9, x86, linux.24xlarge, default, gcc, 12.9.1)
build / build_artifact (3.11, x86, linux.24xlarge, default, clang, 12.9.1)
build / build_artifact (3.9, x86, linux.12xlarge.memory, genai, gcc, 12.9.1)
build / build_artifact (3.13, x86, linux.12xlarge.memory, genai, gcc, 12.9.1)
build / build_artifact (3.10, x86, linux.12xlarge.memory, genai, clang, 12.9.1)
build / build_artifact (3.10, x86, linux.24xlarge, default, gcc, 12.8.1)
build / build_artifact (3.12, x86, linux.12xlarge.memory, genai, gcc, 12.9.1)
build / build_artifact (3.12, x86, linux.24xlarge, default, gcc, 12.9.1)

Artifacts

Produced during runtime
Name  Size  Digest
fbgemm_default_x86_clang_py3.10_cu12.6.3.whl  535 MB  sha256:c8dfcd8c57df0f130c259634913705a6c43b2f04ec21bd3ce9774f6d2efb0622
fbgemm_default_x86_clang_py3.10_cu12.8.1.whl  837 MB  sha256:fa051fea012629a7cfc5777979740d06263086905ae5e489aa6e5a6e06827273
fbgemm_default_x86_clang_py3.10_cu12.9.1.whl  892 MB  sha256:65bb6447515669f5a4fe3aebf8fd4ed52b320641929888d5327f3782d92061c9
fbgemm_default_x86_clang_py3.11_cu12.6.3.whl  535 MB  sha256:e1255bbe4d2a49d6d5b9734e80b3ff648875b93a86703beb964f6970b60fbb3c
fbgemm_default_x86_clang_py3.11_cu12.8.1.whl  837 MB  sha256:43bfe3f7f62bd83af9795ae6c0c09370e13cc153554dee0d1d5bded9badc1045
fbgemm_default_x86_clang_py3.11_cu12.9.1.whl  892 MB  sha256:2255a6e432dd3a5e6bc6dd8dcf96bcce0d5c6e26be966688079c98f9fa174145
fbgemm_default_x86_clang_py3.12_cu12.6.3.whl  535 MB  sha256:23d73cc1e5ce02c902c5f3083f63f9ab5d9b9e38f66b24813c310c81b4c29552
fbgemm_default_x86_clang_py3.12_cu12.8.1.whl  837 MB  sha256:98877ebc7fcfd015b8fde461323bf126ecb8c138b4af4930ee4231d088082c8f
fbgemm_default_x86_clang_py3.12_cu12.9.1.whl  892 MB  sha256:b32c247fe394469967f572f1ea2b37403b92b3a517bd125170a6a6c83a100cc0
fbgemm_default_x86_clang_py3.13_cu12.6.3.whl  535 MB  sha256:83c418815b4bbeb8a16a4bb0d3684915f67fe5ef1eef40981427af4e0ac159c3
fbgemm_default_x86_clang_py3.13_cu12.8.1.whl  837 MB  sha256:a50a7b7fe37228d91c69a65d8ba95d2e62ca0b5e037fd9ee03aa4280f0b126e0
fbgemm_default_x86_clang_py3.13_cu12.9.1.whl  892 MB  sha256:909e94a6a56847481f5c967c596f311c8aad487b993520ea31328c4fc5db5197
fbgemm_default_x86_clang_py3.9_cu12.6.3.whl  534 MB  sha256:c153dd44d29a93b641f9d4c5efac1f569d99748f8606c5ba5f902b1028955de0
fbgemm_default_x86_clang_py3.9_cu12.8.1.whl  835 MB  sha256:9f232644b44892e6361fbec6fc552d0f2f40f2092f9ae104a93bee2fa535de2e
fbgemm_default_x86_clang_py3.9_cu12.9.1.whl  892 MB  sha256:ee478f577c2ec5e12f83f586019d8431a78df8b37d74dd663ee989194e075582
fbgemm_default_x86_gcc_py3.10_cu12.6.3.whl  542 MB  sha256:cd2c3923a667f1ca1efc2788ffa64f9c4f0ad62d365c06be718e60fd81f7d600
fbgemm_default_x86_gcc_py3.10_cu12.8.1.whl  842 MB  sha256:8ef01f806c0c94e0ebdd17e2dab10fe02efecb8ebfbd17446922bdc7bb54d547
fbgemm_default_x86_gcc_py3.10_cu12.9.1.whl  899 MB  sha256:2930611e2f02edc244e1d3f42943789869b9b3042eae2cf373f05b040460d50d
fbgemm_default_x86_gcc_py3.11_cu12.6.3.whl  542 MB  sha256:2f1e740f0d79078a7f8fcdf832528d4fabdffc640b708e3cb352dc0f05d1bf9f
fbgemm_default_x86_gcc_py3.11_cu12.8.1.whl  842 MB  sha256:70b6420cc6f5ba8b3a2fee90d831e4189b3bde9d75637499da355e14f5038a57
fbgemm_default_x86_gcc_py3.11_cu12.9.1.whl  899 MB  sha256:56e36944270da255be5d84b5a763f1b4f5de91372d1d38089f3f69500397c591
fbgemm_default_x86_gcc_py3.12_cu12.6.3.whl  542 MB  sha256:9741e1784fc434e53589979eb7722bbd141a22596cbf4704386a7a1396e579b5
fbgemm_default_x86_gcc_py3.12_cu12.8.1.whl  842 MB  sha256:409537229f8c07bc5fb44738aaa50d8475b138413d11e423f3c063fa692b839c
fbgemm_default_x86_gcc_py3.12_cu12.9.1.whl  899 MB  sha256:1838a459a8a04d3cb386b5595bfff9f6fce1eb580fe02a134cd668ce2bafcd3b
fbgemm_default_x86_gcc_py3.13_cu12.6.3.whl  542 MB  sha256:07d4dc51ac8c2b7b997213f73791722acda1678727dbdfc4d91f446ee7d0885d
fbgemm_default_x86_gcc_py3.13_cu12.8.1.whl  842 MB  sha256:2088e929a1159ce23b763e1fc216bface56072da06c0e736f3987b22756e8f6d
fbgemm_default_x86_gcc_py3.13_cu12.9.1.whl  899 MB  sha256:d75bbac37d066c6d01534bf80bfc9ff8788210914db4d386ff01164eea4f3501
fbgemm_default_x86_gcc_py3.9_cu12.6.3.whl  542 MB  sha256:d60eb8cfc216e6904eec346fafde7caa93a5b3583379ae11530bee4ae928c90d
fbgemm_default_x86_gcc_py3.9_cu12.8.1.whl  842 MB  sha256:5472c9283d6a611c93bfd02e922dce6da214a588667dacddc62d729b356f4556
fbgemm_default_x86_gcc_py3.9_cu12.9.1.whl  899 MB  sha256:74d629282f1298616dfd2ea8c913f72ac0e3f2a5b091501c737c9d10d392fc05
fbgemm_genai_x86_clang_py3.10_cu12.6.3.whl  15.9 MB  sha256:b7180e637ee1b7d3bf194229b9edd9f6258fa3447795afc12fd1f94d9766e7e5
fbgemm_genai_x86_clang_py3.10_cu12.8.1.whl  46.5 MB  sha256:5f294d302fcf3a11511b9843bb64c39e09b26ee51c4f9e5d25e6544bf3ed3e5b
fbgemm_genai_x86_clang_py3.10_cu13.0.0.whl  44 MB  sha256:d58106f7f5fd2f5223ec8841ecf1e6df8710e029db718a469ab767e54ee2e9ef
fbgemm_genai_x86_clang_py3.11_cu12.6.3.whl  15.9 MB  sha256:65d1340ed15138bdd802948d977f555f8d29455df2a4b09f88d54d0ae301e88b
fbgemm_genai_x86_clang_py3.11_cu12.8.1.whl  46.5 MB  sha256:4f41e43c1f1c2fcc479d4ec78d40afe1154af4b204ce446b003066ad219ca3bc
fbgemm_genai_x86_clang_py3.11_cu13.0.0.whl  44 MB  sha256:ea154cdafd69ef7a066978ece86da4749184fd9e76e66a51d77757e0ad5423f6
fbgemm_genai_x86_clang_py3.12_cu12.6.3.whl  15.9 MB  sha256:ee8278be8a4aed182d9db76b31507a647734d2ae9a6a7ae06bbbe91d4e5bcff7
fbgemm_genai_x86_clang_py3.12_cu12.8.1.whl  46.5 MB  sha256:15fba7d22bc47dac1e68f7dda5e6279da46419c23da170c2cc93abacc7ff4787
fbgemm_genai_x86_clang_py3.12_cu13.0.0.whl  44 MB  sha256:ae6fcddf816902ea55303bc73c0c52407547b1773ac302719ee94b1c04c373f9
fbgemm_genai_x86_clang_py3.13_cu12.6.3.whl  15.9 MB  sha256:4e571afef609badb2a0e7ec46e43bb5e0d9f8d52b98d936d7dbe3f0b5f240b6c
fbgemm_genai_x86_clang_py3.13_cu12.8.1.whl  46.5 MB  sha256:a789987abefdb56593210c02af6a6b121d31fc8ab12d242c8efc9b7fb9f391a5
fbgemm_genai_x86_clang_py3.13_cu12.9.1.whl  48.1 MB  sha256:8f28e371b8f15b0de90a22bd6948520deaa59c5eb0b341ce1b2fd979c7fd1bd1
fbgemm_genai_x86_clang_py3.13_cu13.0.0.whl  44 MB  sha256:4fbf183948a40fd8c1178158cb7f4a72fb6fe901d4e92d06977f0e4a0afa63ec
fbgemm_genai_x86_clang_py3.9_cu12.6.3.whl  15.8 MB  sha256:0aa528eea5d6cb8afeb04a833efae31a7f5c46dd8d20507f828d00325f01d68f
fbgemm_genai_x86_clang_py3.9_cu12.8.1.whl  46.5 MB  sha256:bf57d4deca5e128d899ec01bf2ff3240a25bbb02fb4e80b890b0c10d642dabb7
fbgemm_genai_x86_clang_py3.9_cu13.0.0.whl  44 MB  sha256:ad3d938b74e79e1091751f413eedf93df2142e5b9ed91c23b6b6220a24e228e5
fbgemm_genai_x86_gcc_py3.10_cu12.6.3.whl  16.2 MB  sha256:fdbf0b6c4553a18535a269b6aac6340af6d9516c6b2d35fa4ac9dfeae6a72071
fbgemm_genai_x86_gcc_py3.10_cu12.9.1.whl  48.8 MB  sha256:73589f84c0ee2df19a9849f99c2d745d8484c1482ad7c25b6d80835b77a4c4ca
fbgemm_genai_x86_gcc_py3.10_cu13.0.0.whl  45.2 MB  sha256:602393857bb91bd393f93ed5994d046b4df63ec815cdbf8d9350e1d1ad05003e
fbgemm_genai_x86_gcc_py3.11_cu12.6.3.whl  16.2 MB  sha256:cdedd4fe62142f41ec0b94c2227fca3c7dc9fdaf583966dc9a06cf54928fa3a8
fbgemm_genai_x86_gcc_py3.11_cu12.8.1.whl  47.3 MB  sha256:7de3a082d2e15c991c4ea017a9b46474ea55a091d023d4ee257fc8e81eabe7df
fbgemm_genai_x86_gcc_py3.11_cu13.0.0.whl  45.1 MB  sha256:15502bfec54f6125f0baad5113152cf20416e5ffb727d42dc4a5e57c9f7a6466
fbgemm_genai_x86_gcc_py3.12_cu12.6.3.whl  16.2 MB  sha256:35ab4a7044d49650f9c7f1c533a801eb8d8033dff70bb5690e00e884cba0e90b
fbgemm_genai_x86_gcc_py3.12_cu12.8.1.whl  47.3 MB  sha256:ca40bc0319664891a53b96929917313fb9e94c57916895f8cdef93bb5c5e9ce8
fbgemm_genai_x86_gcc_py3.13_cu12.6.3.whl  16.2 MB  sha256:0b2149d3aeb440ff7b2f0d7645b27b2640ac802cd32db8c606627f41d5e8d58e
fbgemm_genai_x86_gcc_py3.13_cu13.0.0.whl  45.1 MB  sha256:82ac059b9c62eb9b0afe7be5a2fe65e9c337e8e655ba4bcc38a54b8e3de93e6a
fbgemm_genai_x86_gcc_py3.9_cu12.6.3.whl  16.2 MB  sha256:31da035fa2f504995e424a38eefa13b72a9f1262cc1832c3d9547d201199b0ee
fbgemm_genai_x86_gcc_py3.9_cu12.8.1.whl  47.2 MB  sha256:da3a3acb743c13952a6b495202bbd29887454de9a43833f73203b87cb32efba2
fbgemm_genai_x86_gcc_py3.9_cu13.0.0.whl  44.9 MB  sha256:2626ddb7c31da92cece9f83833bd0e4dd10b5e52a21cab9b4219f7a1e6251159