Optimize wgrad CUTLASS grouped gemm #15412

Triggered via pull request September 23, 2025 16:43
Status Cancelled
Total duration 56m 9s
Artifacts 9

fbgemm_gpu_ci_cuda.yml

on: pull_request
generate-build-matrix / generate_ci_matrix (10s)
Matrix: build / build_artifact
generate-test-matrix / generate_ci_matrix
Matrix: test / test_and_publish_artifact
Waiting for pending jobs

Annotations

62 errors
build / build_artifact (3.12, x86, linux.12xlarge.memory, genai, gcc, 12.6.3)
The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.
build / build_artifact (3.11, x86, linux.24xlarge, default, clang, 12.8.1)
The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.
build / build_artifact (3.13, x86, linux.12xlarge.memory, genai, clang, 12.9.1)
The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.
build / build_artifact (3.9, x86, linux.12xlarge.memory, genai, gcc, 13.0.0)
The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.
build / build_artifact (3.11, x86, linux.12xlarge.memory, genai, clang, 12.8.1)
The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.
build / build_artifact (3.13, x86, linux.12xlarge.memory, genai, clang, 12.6.3)
The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.
build / build_artifact (3.9, x86, linux.24xlarge, default, clang, 12.8.1)
The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.
build / build_artifact (3.13, x86, linux.12xlarge.memory, genai, clang, 12.8.1)
The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.
build / build_artifact (3.13, x86, linux.24xlarge, default, gcc, 12.6.3)
The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.
build / build_artifact (3.11, x86, linux.12xlarge.memory, genai, gcc, 12.9.1)
The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.
build / build_artifact (3.13, x86, linux.24xlarge, default, gcc, 12.8.1)
The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.
build / build_artifact (3.12, x86, linux.12xlarge.memory, genai, clang, 12.8.1)
The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.
build / build_artifact (3.9, x86, linux.24xlarge, default, gcc, 12.8.1)
The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.
build / build_artifact (3.9, x86, linux.12xlarge.memory, genai, gcc, 12.9.1)
The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.
build / build_artifact (3.13, x86, linux.24xlarge, default, clang, 12.6.3)
The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.
build / build_artifact (3.13, x86, linux.24xlarge, default, clang, 12.8.1)
The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.
build / build_artifact (3.12, x86, linux.24xlarge, default, clang, 12.9.1)
The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.
build / build_artifact (3.12, x86, linux.24xlarge, default, gcc, 12.9.1)
The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.
build / build_artifact (3.11, x86, linux.12xlarge.memory, genai, gcc, 12.8.1)
The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.
build / build_artifact (3.11, x86, linux.12xlarge.memory, genai, clang, 13.0.0)
The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.
build / build_artifact (3.10, x86, linux.24xlarge, default, gcc, 12.8.1)
The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.
build / build_artifact (3.13, x86, linux.12xlarge.memory, genai, gcc, 13.0.0)
The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.
build / build_artifact (3.13, x86, linux.24xlarge, default, clang, 12.9.1)
The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.
build / build_artifact (3.12, x86, linux.12xlarge.memory, genai, gcc, 12.9.1)
The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.
build / build_artifact (3.9, x86, linux.24xlarge, default, clang, 12.9.1)
The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.
build / build_artifact (3.13, x86, linux.12xlarge.memory, genai, gcc, 12.6.3)
The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.
build / build_artifact (3.9, x86, linux.24xlarge, default, gcc, 12.6.3)
The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.
build / build_artifact (3.9, x86, linux.12xlarge.memory, genai, clang, 13.0.0)
The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.
build / build_artifact (3.11, x86, linux.24xlarge, default, gcc, 12.8.1)
The self-hosted runner lost communication with the server. Verify the machine is running and has a healthy network connection. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error.

Artifacts

Produced during runtime
Name  Size  Digest
fbgemm_default_x86_clang_py3.10_cu12.8.1.whl  837 MB  sha256:a569daef8922e0876b400a799ec5be0c6c7091a3b09dd5076556bc1feaebbaa9
fbgemm_default_x86_clang_py3.11_cu12.6.3.whl  535 MB  sha256:5c78a61ce72f1e9d82a2b455a197bb2fdb9011e076904acf37bd769a46574def
fbgemm_default_x86_clang_py3.12_cu12.6.3.whl  535 MB  sha256:cde1572e55e2114bcd10f5dbb07d6beb5d62ab3a26a4fbc26d3176a29c5deeed
fbgemm_default_x86_gcc_py3.12_cu12.8.1.whl  842 MB  sha256:cdad785da0aa899263f4e161b1e17ea4a779a43a2f182a07dce9c1e312de48d4
fbgemm_genai_x86_clang_py3.11_cu12.6.3.whl  16.2 MB  sha256:6827b6fae2e9332b30123699f3b998f5ae228c269d570e7c726847629f42ab6b
fbgemm_genai_x86_clang_py3.12_cu12.6.3.whl  16.2 MB  sha256:bd1d60e57878bf18e11eb6692c3f6fa792b074b1bb7d7f6ded034f02e5adfac2
fbgemm_genai_x86_gcc_py3.10_cu12.6.3.whl  16.6 MB  sha256:8b028a1380c77b9c44333eb8f77fd5a7a90cd842d896344afb0333b7f3050521
fbgemm_genai_x86_gcc_py3.11_cu12.6.3.whl  16.6 MB  sha256:a887738217e73896f21058ad474d8b7aa0f9d4be3a6403ad438acf2a3abaea1d
fbgemm_genai_x86_gcc_py3.13_cu12.6.3.whl  16.6 MB  sha256:ae0c5e106fe6f4c816012bf5faa4bf0780026775d43c1128403fe98cbc80a277