resolve multi-CUDA_ARCHITECTURES compilation conflicts #241

GACLove · 2025-08-14T10:18:40Z

fix(build): resolve multi-CUDA_ARCHITECTURES compilation conflicts

Isolate build artifacts per CUDA architecture in build/sm_{arch}/ subdirectories
Add validation for CUDA_ARCHITECTURES format (comma-separated decimals)

Copilot

Pull Request Overview

This PR resolves CUDA compilation conflicts when building for multiple GPU architectures by isolating build artifacts and adding validation for CUDA_ARCHITECTURES environment variable.

Enables building for multiple CUDA architectures by reading from CUDA_ARCHITECTURES environment variable
Isolates build artifacts per architecture in separate subdirectories to prevent conflicts
Filters NVCC flags to only include relevant architecture-specific flags for each extension

_{Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.}

Copilot · 2025-08-14T17:23:55Z

setup.py

+if cuda_architectures is not None:
+    for arch in cuda_architectures.split(","):
+        arch = arch.strip()
+        if arch:


The code doesn't validate that each architecture value is a valid decimal format as mentioned in the PR description. Consider adding validation to ensure each arch value matches the expected pattern (e.g., regex check for decimal format).

Suggested change

if arch:

if arch:

if not re.match(r"^\d+\.\d+$", arch):

raise ValueError(f"Invalid architecture value '{arch}' in CUDA_ARCHITECTURES. Expected decimal format like '8.0', '8.6', etc.")

Copilot · 2025-08-14T17:23:56Z

setup.py

+                skip_next = True
+        elif flag not in ["-gencode"]:
+            filtered_flags.append(flag)
+


The flag filtering logic is duplicated between sm89 and sm90 extensions. Consider extracting this into a helper function to reduce code duplication and improve maintainability.

Suggested change

sm89_arch_list = ["sm_89", "compute_89", "sm_90a", "compute_90a", "sm_120", "compute_120"]

filtered_flags = filter_nvcc_flags(NVCC_FLAGS, sm89_arch_list)

Copilot · 2025-08-14T17:23:56Z

setup.py

+                    filtered_flags.append(arch_flag)
+                skip_next = True
+        elif flag not in ["-gencode"]:
+            filtered_flags.append(flag)


This is duplicate code from the sm89 extension filtering logic. The same flag filtering pattern should be extracted into a reusable function.

Suggested change

filtered_flags.append(flag)

def filter_nvcc_flags_for_arch(nvcc_flags, arch_substrings):

filtered_flags = []

skip_next = False

for i, flag in enumerate(nvcc_flags):

if skip_next:

skip_next = False

continue

if flag == "-gencode":

if i + 1 < len(nvcc_flags):

arch_flag = nvcc_flags[i + 1]

if any(sub in arch_flag for sub in arch_substrings):

filtered_flags.append(flag)

filtered_flags.append(arch_flag)

skip_next = True

elif flag not in ["-gencode"]:

filtered_flags.append(flag)

return filtered_flags

filtered_flags = filter_nvcc_flags_for_arch(NVCC_FLAGS, ["sm_90a", "compute_90a"])

Copilot · 2025-08-14T17:23:56Z

setup.py

@@ -208,24 +259,32 @@ def compile_new(*args, **kwargs):
                        **kwargs,
                        "output_dir": os.path.join(
                            kwargs["output_dir"],
-                            self.thread_ext_name_map[threading.current_thread().ident]),
+                            self.thread_ext_name_map.get(threading.current_thread().ident, "default")),


Using 'default' as a fallback directory name could lead to conflicts if multiple threads don't have mapped extension names. Consider using a more unique identifier like thread ID or timestamp.

Suggested change

self.thread_ext_name_map.get(threading.current_thread().ident, "default")),

self.thread_ext_name_map.get(

threading.current_thread().ident,

f"thread_{threading.current_thread().ident}"

)),

GACLove · 2025-08-14T17:24:04Z

testing

pip install -v --no-cache-dir .
CUDA_ARCHITECTURES="9.0,12.0" pip install -v --no-cache-dir .
CUDA_ARCHITECTURES="8.9,9.0" pip install -v --no-cache-dir .

…d architectures via environment variable. Refactor GPU capability checks and streamline NVCC flags for SM89 and SM90 extensions. Improve build process by creating separate output directories for extensions.

GACLove force-pushed the fix_build branch from a8daff3 to b3dc8b1 Compare August 14, 2025 17:19

GACLove marked this pull request as ready for review August 14, 2025 17:22

Copilot AI review requested due to automatic review settings August 14, 2025 17:22

Copilot AI reviewed Aug 14, 2025

View reviewed changes

Enhance CUDA architecture support in setup.py by allowing user-define…

5d2ef2f

…d architectures via environment variable. Refactor GPU capability checks and streamline NVCC flags for SM89 and SM90 extensions. Improve build process by creating separate output directories for extensions.

GACLove force-pushed the fix_build branch from b3dc8b1 to 5d2ef2f Compare August 15, 2025 02:59

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

resolve multi-CUDA_ARCHITECTURES compilation conflicts #241

resolve multi-CUDA_ARCHITECTURES compilation conflicts #241

Uh oh!

GACLove commented Aug 14, 2025

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Aug 14, 2025

Uh oh!

Copilot AI Aug 14, 2025

Uh oh!

Copilot AI Aug 14, 2025

Uh oh!

Copilot AI Aug 14, 2025

Uh oh!

GACLove commented Aug 14, 2025

Uh oh!

Uh oh!


	sm89_arch_list = ["sm_89", "compute_89", "sm_90a", "compute_90a", "sm_120", "compute_120"]
	filtered_flags = filter_nvcc_flags(NVCC_FLAGS, sm89_arch_list)

-            filtered_flags.append(flag)
+    def filter_nvcc_flags_for_arch(nvcc_flags, arch_substrings):
+        filtered_flags = []
+        skip_next = False
+        for i, flag in enumerate(nvcc_flags):
+            if skip_next:
+                skip_next = False
+                continue
+            if flag == "-gencode":
+                if i + 1 < len(nvcc_flags):
+                    arch_flag = nvcc_flags[i + 1]
+                    if any(sub in arch_flag for sub in arch_substrings):
+                        filtered_flags.append(flag)
+                        filtered_flags.append(arch_flag)
+                    skip_next = True
+            elif flag not in ["-gencode"]:
+                filtered_flags.append(flag)
+        return filtered_flags
+    filtered_flags = filter_nvcc_flags_for_arch(NVCC_FLAGS, ["sm_90a", "compute_90a"])

resolve multi-CUDA_ARCHITECTURES compilation conflicts #241

Are you sure you want to change the base?

resolve multi-CUDA_ARCHITECTURES compilation conflicts #241

Uh oh!

Conversation

GACLove commented Aug 14, 2025

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Uh oh!

Copilot AI Aug 14, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Aug 14, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Aug 14, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Aug 14, 2025

Choose a reason for hiding this comment

Uh oh!

GACLove commented Aug 14, 2025

Uh oh!

Uh oh!