[Subproject Application]: CNCF Batch & HPC System Initiative #1707

Open
asm582 opened this issue May 17, 2025 · 0 comments
asm582 commented May 17, 2025

Subproject Name

HPC with Kubernetes

Parent Group

TAG Workloads Foundation

Mission/Purpose

Innovate with and around Kubernetes to enable running batch AI, inference, and HPC workloads in the cloud.

Scope

This subproject will improve data-aware scheduling in Kubernetes, benchmark different workloads against existing and new scheduling mechanisms, and create clear, shared definitions that the HPC and Kubernetes communities can consume.

Goals and Objectives

We will pursue multiple initiatives to make batch workloads run more natively in the cloud. These include the following interest groups, which have already started: data-aware scheduling with an API spec, benchmarking AI systems in the cloud, user stories around batch workloads in the cloud, and clear definitions for this space.

Proposed Leads

Alex Scammon, Marlow Warnicke, and Abhishek Malvankar

Initial Participants/Contributors

G-Research, SchedMD, IBM Research ....

Benefits to CNCF

Batch workloads are becoming increasingly important, especially for training and multi-node inference. This subproject brings together a cohesive group of people working to make these workloads run more natively in the cloud.

Alignment with Parent Group

We handle high-performance workloads and all the projects therein.

Proposed Communication Channels

No response

Proposed Initial Deliverables (if applicable)

No response

Potential Future Work/Evolution

No response

Sponsoring CNCF Project or TOC Member (if applicable)

No response

Additional Information

No response
