KubeAI: A K8s vLLM operator #7955
samos123
started this conversation in
Show and tell
Replies: 0 comments
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
-
KubeAI: a vLLM K8s operator
KubeAI is the easiest way to deploy vLLM at scale on K8s. Some highlights:
✅️ Drop-in replacement for OpenAI with API compatibility
🚀 Works on CPUs and GPUs
⚖️ Scale from zero, autoscale based on load
🛠️ Zero dependencies (no Istio, Knative, etc.)
🤖 Operates OSS model servers (vLLM and Ollama)
🔋 Includes a Chat UI out of the box: (OpenWebUI i.e. ChatGPT-like UI)
✉️ Plug-n-play with cloud messaging systems (Kafka, PubSub, etc.)
Would love your feedback!
Source: https://github.com/substratusai/kubeai
Beta Was this translation helpful? Give feedback.
All reactions