Skip to content

[UX][k8s] show-gpus for all allowed contexts #5362

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

kyuds
Copy link
Collaborator

@kyuds kyuds commented Apr 25, 2025

Fixes #5313

Screenshot 2025-04-25 at 11 31 03 AM

IMPORTANT! Feedback Please
I didn't know whether it would be desirable, but I think it would be a good idea to show a final table that would merge all available accelerators across all k8s clusters? The user might want to only know the total availability across clusters. I think that might be a helpful summary feature. For instance, for the above image, we would have:

Kubernetes GPUs (all allowed contexts):
GPU: T4, TOTAL_GPUS: 2, TOTAL_FREE_GPUS: 2

Some Behavioral Changes:

  • specifying an accelerator and the kubernetes context (aka region) at the same time ignored the kubernetes context. Fixed that to take into account the k8s context.
  • in kubernetes_catalog. _list_accelerators(), the function was adding accelerator names to the dict and setting its availability to 0. I don't know if this is a real requirement (feedback would be great), but this creates an assertion error when checking the keyset sizes for accelerator counts, capacity, and available dicts when used in conjunction with a quantity filter. Therefore, added in checks to NOT add accelerator names to the dict when capacity is 0.

Tested (run the relevant ones):

  • Code formatting: install pre-commit (auto-check on commit) or bash format.sh
  • Any manual or new tests for this PR (please specify below)
  • All smoke tests: /smoke-test (CI) or pytest tests/test_smoke.py (local)
  • Relevant individual tests: /smoke-test -k test_name (CI) or pytest tests/test_smoke.py::test_name (local)
  • Backward compatibility: /quicktest-core (CI) or pytest tests/smoke_tests/test_backward_compat.py (local)

@kyuds kyuds marked this pull request as draft April 25, 2025 02:45
@kyuds kyuds marked this pull request as ready for review April 25, 2025 04:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[UX] sky show-gpus only show GPUs in one k8s cluster instead of all specified in allowed_contexts
1 participant