
Update supported cuda slot input. #8107


Merged

Conversation

@heathen711 (Contributor) commented Jun 14, 2025

Summary

Bring in support for defining a CUDA GPU slot higher than 1.
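
As an illustration (a minimal sketch, not Invoke's actual validator), the change amounts to accepting "cuda:N" for any slot index rather than hard-coding the first one or two slots:

```python
import re

import torch

# Hypothetical validator, for illustration only: accept "cpu", "mps", or
# "cuda:N" for any non-negative slot N, rather than only cuda:0 / cuda:1.
_DEVICE_RE = re.compile(r"^(cpu|mps|cuda(:\d+)?)$")


def parse_device(value: str) -> torch.device:
    if not _DEVICE_RE.match(value):
        raise ValueError(f"Unsupported device string: {value!r}")
    return torch.device(value)


print(parse_device("cuda:3"))  # device(type='cuda', index=3)
```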

Related Issues / Discussions

Closes #8102

QA Instructions

Ran the device pytest.
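
For reference, the device tests can typically be run with (test path assumed from the repo layout, not stated in this PR):

```
pytest tests/backend/util/test_devices.py
```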

Merge Plan

Checklist

  • The PR has a short but descriptive title, suitable for a changelog
  • Tests added / updated (if applicable)
  • Documentation added / updated (if applicable)
  • Updated What's New copy (if doing a release after this PR)

@github-actions bot added the python, services, frontend, and python-tests labels on Jun 14, 2025
@psychedelicious (Collaborator) commented:
Great, thanks. And just to confirm, does it work and let Invoke run on the GPU you wanted it to?

@heathen711 (Contributor, Author) commented:

> Great, thanks. And just to confirm, does it work and let Invoke run on the GPU you wanted it to?

Sadly no, it's running on slot 1. I'm having a hard time tracing where the call to allocate is...
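
For anyone tracing this later, a minimal sketch of the underlying behaviour (plain PyTorch, not Invoke's model manager): memory lands on whichever device a tensor is created on or moved to, so a configured device string has no effect until it is actually passed through to torch.

```python
import torch

device = torch.device("cuda:3")  # the configured slot

# Allocation happens on cuda:3 only because the tensor is created there.
x = torch.zeros(1024, 1024, device=device)

print(torch.cuda.memory_allocated(device))    # non-zero bytes on cuda:3
print(torch.cuda.memory_allocated("cuda:0"))  # 0 if nothing leaked to slot 0
```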

@heathen711 marked this pull request as draft on June 14, 2025 05:28
@heathen711 marked this pull request as ready for review on June 14, 2025 05:51
@heathen711 (Contributor, Author) commented:

> Great, thanks. And just to confirm, does it work and let Invoke run on the GPU you wanted it to?

I was wrong (too many generations running at the same time from my LLM and ComfyUI confused me, so I killed them all to confirm).

With cuda:3 generating an SDXL image, I see:

```
========================================= ROCm System Management Interface =========================================
=================================================== Concise Info ===================================================
Device  Node  IDs              Temp    Power   Partitions          SCLK     MCLK   Fan  Perf  PwrCap  VRAM%  GPU%
              (DID,     GUID)  (Edge)  (Avg)   (Mem, Compute, ID)
====================================================================================================================
0       2     0x73a1,   28666  29.0°C  7.0W    N/A, N/A, 0         0Mhz     96Mhz  0%   auto  250.0W  34%    0%
1       3     0x73a1,   50690  27.0°C  7.0W    N/A, N/A, 0         0Mhz     96Mhz  0%   auto  250.0W  0%     0%
2       4     0x73a1,   34892  28.0°C  7.0W    N/A, N/A, 0         0Mhz     96Mhz  0%   auto  250.0W  0%     0%
3       5     0x73a1,   51870  44.0°C  127.0W  N/A, N/A, 0         2465Mhz  96Mhz  0%   auto  250.0W  26%    99%
====================================================================================================================
=============================================== End of ROCm SMI Log ================================================
```
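
Worth noting for readers following along: on ROCm builds of PyTorch, HIP devices are exposed through the torch.cuda namespace, so cuda:3 addresses the fourth card shown by rocm-smi. A quick sanity check (plain PyTorch, nothing Invoke-specific):

```python
import torch

# Enumerate visible devices to confirm which index maps to which card.
for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    print(i, props.name, f"{props.total_memory / 2**30:.1f} GiB")
```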

@psychedelicious (Collaborator) commented Jun 14, 2025

That looks good, but I'm sus on the VRAM usage in slot 0. Maybe there's something in Invoke that is not respecting the GPU selection. Could you please restart the system and try a couple of different models - maybe add some controlnets or IP adapters in - and confirm we only see VRAM on slot 3?

Edit: If it's not working, it'll be something we need to address separately from this PR - we can get this PR merged now.

@heathen711 (Contributor, Author) commented:
> That looks good, but I'm sus on the VRAM usage in slot 0. Maybe there's something in Invoke that is not respecting the GPU selection. Could you please restart the system and try a couple of different models - maybe add some controlnets or IP adapters in - and confirm we only see VRAM on slot 3?
>
> Edit: If it's not working, it'll be something we need to address separately from this PR - we can get this PR merged now.

Gallery with an SDXL + LoRA + refiner + VAE: saw no usage outside of cuda:3.

Canvas with a soft-edge control + SDXL generation: saw no usage outside of cuda:3.

@psychedelicious force-pushed the bugfix/heathen711/issue-8102 branch from d5f4722 to 37b98ee on June 16, 2025 09:23
@psychedelicious enabled auto-merge (rebase) on June 16, 2025 09:23
@psychedelicious merged commit 4bfa643 into invoke-ai:main on Jun 16, 2025 (12 checks passed)
@heathen711 deleted the bugfix/heathen711/issue-8102 branch on June 16, 2025 18:32