Subproject Name
HPC with Kubernetes
Parent Group
TAG Workloads Foundation
Mission/Purpose
Innovate with and around Kubernetes to enable running batch AI, inference, and HPC workloads in the cloud.
Scope
This subproject will improve data-aware scheduling in Kubernetes, benchmark different workloads against existing and new scheduling mechanisms, and produce clear definitions that the HPC and Kubernetes communities can adopt.
Goals and Objectives
We will work on multiple initiatives to make batch-type workloads run more natively in the cloud. These include the following interest groups, which have already started: data-aware scheduling with an API spec, benchmarking AI systems in the cloud, user stories around batch workloads in the cloud, and clear definitions for this space.
Proposed Leads
Alex Scammon, Marlow Warnicke, and Abhishek Malvankar
Initial Participants/Contributors
G-Research, SchedMD, IBM Research ....
Benefits to CNCF
Batch workloads are becoming increasingly important, especially for training and multi-node inference. This is a cohesive group of people working to make these workloads run more natively in the cloud.
Alignment with Parent Group
The subproject covers high-performance workloads and all the projects associated with them.
Proposed Communication Channels
No response
Proposed Initial Deliverables (if applicable)
No response
Potential Future Work/Evolution
No response
Sponsoring CNCF Project or TOC Member (if applicable)
No response
Additional Information
No response