feat(infra): Adding new KEDA Helm templates #5289
api-server ScaledObject template:

@@ -0,0 +1,32 @@
{{- if and .Values.keda.enabled .Values.keda.apiServer .Values.keda.apiServer.enabled }}
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: {{ include "onyx-stack.fullname" . }}-api-server-scaledobject
  namespace: {{ .Release.Namespace }}
  labels:
    {{- include "onyx-stack.labels" . | nindent 4 }}
    app: api-server
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ include "onyx-stack.fullname" . }}
  pollingInterval: {{ .Values.keda.apiServer.pollingInterval | default 30 }}
  cooldownPeriod: {{ .Values.keda.apiServer.cooldownPeriod | default 300 }}
  minReplicaCount: {{ .Values.keda.apiServer.minReplicas | default 1 }}
  maxReplicaCount: {{ .Values.keda.apiServer.maxReplicas | default 10 }}
  # Use HPA mode to generate an HPA that works alongside existing HPA infrastructure
  hpaName: {{ include "onyx-stack.fullname" . }}-api-server-keda-hpa
  triggers:
    - type: cpu
      metadata:
        type: Utilization
        value: {{ .Values.keda.apiServer.cpuThreshold | default "70" | quote }}
    {{- if .Values.keda.apiServer.memoryThreshold }}
    - type: memory
      metadata:
        type: Utilization
        value: {{ .Values.keda.apiServer.memoryThreshold | quote }}
    {{- end }}
{{- end }}

Review comments:
- Using `default` here overrides an explicit value of 0 for `pollingInterval` by falling back to 30; prefer preserving user-provided zero values.
- Using `default` here overrides an explicit value of 0 for `cooldownPeriod`, changing semantics (e.g., immediate scale-down) by forcing 300 instead.
- Using `default` here prevents `minReplicaCount` from being set to 0 (scale-to-zero), because 0 is considered empty and falls back to 1. Prefer checking key presence to preserve 0.
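A minimal sketch of one way to address the zero-value review comments above, using Sprig's `hasKey` to distinguish "unset" from "explicitly 0" (field names mirror the template; the surrounding context is assumed unchanged):

```yaml
  {{- /* Preserve an explicit 0: fall back to 30 only when the key is absent */}}
  {{- if hasKey .Values.keda.apiServer "pollingInterval" }}
  pollingInterval: {{ .Values.keda.apiServer.pollingInterval }}
  {{- else }}
  pollingInterval: 30
  {{- end }}
```

The same pattern applies to `cooldownPeriod`, `minReplicaCount`, and `maxReplicaCount`; unlike `default`, it never treats a user-supplied 0 as empty.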
Generic celery-worker ScaledObject template:

@@ -0,0 +1,49 @@
{{- if and .Values.keda.enabled .Values.keda.celeryWorkers.enabled }}
{{- range $workerType, $workerConfig := .Values.keda.celeryWorkers }}
{{- if and (ne $workerType "enabled") $workerConfig.enabled (ne $workerType "docprocessing") (ne $workerType "docfetching") }}
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: {{ include "onyx-stack.fullname" $ }}-celery-worker-{{ $workerType }}-scaledobject
  namespace: {{ $.Release.Namespace }}
  labels:
    {{- include "onyx-stack.labels" $ | nindent 4 }}
    app: celery-worker-{{ $workerType }}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ include "onyx-stack.fullname" $ }}-celery-worker-{{ $workerType }}
  pollingInterval: {{ $workerConfig.pollingInterval | default 30 }}
  cooldownPeriod: {{ $workerConfig.cooldownPeriod | default 300 }}
  minReplicaCount: {{ $workerConfig.minReplicas | default 1 }}
  maxReplicaCount: {{ $workerConfig.maxReplicas | default 10 }}
  triggers:
    # Default Prometheus-based trigger for Redis queue depth
    # Scaling Logic:
    # - When queue depth > 5: Scale up by factor of 2 (moderate scaling)
    # - When queue depth <= 5: Scale down by factor of 0.5 (conservative scaling)
    # - Threshold of 1 ensures scaling triggers when metric value > 1
    - type: prometheus
      metadata:
        serverAddress: "http://prometheus-redis.monitoring.svc.cluster.local:9090"
        metricName: "redis_key_size_sum"
        metricType: "Value"
        threshold: "1"
        query: |
          # Simplified scaling logic for generic celery workers
          # Returns 2 when queue depth > 5, 0.5 when <= 5
          # This creates a clear scaling decision boundary
          (
            (sum(redis_key_size{key=~"connector_{{ $workerType }}.*"}) > 5)
            * 2
          )
          +
          (
            (sum(redis_key_size{key=~"connector_{{ $workerType }}.*"}) <= 5)
            * 0.5
          )
{{- end }}
{{- end }}
{{- end }}

Review comments:
- PromQL comparison lacks `bool`, so `> 5` returns the metric value instead of 1/0, making the query scale with the queue depth rather than by a constant factor.
- Add `bool` to the `<= 5` comparison as well, so it yields 1/0.
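As the reviewers note, the comparisons need PromQL's `bool` modifier to return 1/0 instead of the metric value. A sketch of the corrected query block (same template context assumed):

```yaml
        query: |
          # With `bool`, each comparison yields 1 or 0, so the whole
          # expression evaluates to exactly 2 (depth > 5) or 0.5 (depth <= 5)
          (
            (sum(redis_key_size{key=~"connector_{{ $workerType }}.*"}) > bool 5)
            * 2
          )
          +
          (
            (sum(redis_key_size{key=~"connector_{{ $workerType }}.*"}) <= bool 5)
            * 0.5
          )
```

One caveat worth checking: if no series match the selector, `sum()` returns an empty result and the trigger reports no metric at all, which may be the desired idle behavior or may need an `or vector(0)` fallback.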
docfetching ScaledObject template:

@@ -0,0 +1,44 @@
{{- if and .Values.keda.enabled .Values.keda.celeryWorkers.docfetching .Values.keda.celeryWorkers.docfetching.enabled }}
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: {{ include "onyx-stack.fullname" . }}-celery-worker-docfetching-scaledobject
  namespace: {{ .Release.Namespace }}
  labels:
    {{- include "onyx-stack.labels" . | nindent 4 }}
    app: celery-worker-docfetching
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ include "onyx-stack.fullname" . }}-celery-worker-docfetching
  pollingInterval: {{ .Values.keda.celeryWorkers.docfetching.pollingInterval | default 30 }}
  cooldownPeriod: {{ .Values.keda.celeryWorkers.docfetching.cooldownPeriod | default 300 }}
  minReplicaCount: {{ .Values.keda.celeryWorkers.docfetching.minReplicas | default 1 }}
  maxReplicaCount: {{ .Values.keda.celeryWorkers.docfetching.maxReplicas | default 10 }}
  triggers:
    # Default Prometheus-based trigger for Redis queue depth
    # Scaling Logic:
    # - When queue depth > 5: Scale up by factor of 2 (aggressive scaling)
    # - When queue depth <= 5: Scale down by factor of 0.5 (conservative scaling)
    # - Threshold of 1 ensures scaling triggers when metric value > 1
    - type: prometheus
      metadata:
        serverAddress: "http://prometheus-redis.monitoring.svc.cluster.local:9090"
        metricName: "redis_key_size_sum"
        metricType: "Value"
        threshold: "1"
        query: |
          # Simplified scaling logic for docfetching workers
          # Returns 2 when queue depth > 5, 0.5 when <= 5
          # This creates a clear scaling decision boundary
          (
            (sum(redis_key_size{key=~"connector_docfetching.*"}) > 5)
            * 2
          )
          +
          (
            (sum(redis_key_size{key=~"connector_docfetching.*"}) <= 5)
            * 0.5
          )
{{- end }}
docprocessing ScaledObject template:

@@ -0,0 +1,44 @@
{{- if and .Values.keda.enabled .Values.keda.celeryWorkers.docprocessing .Values.keda.celeryWorkers.docprocessing.enabled }}
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: {{ include "onyx-stack.fullname" . }}-celery-worker-docprocessing-scaledobject
  namespace: {{ .Release.Namespace }}
  labels:
    {{- include "onyx-stack.labels" . | nindent 4 }}
    app: celery-worker-docprocessing
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ include "onyx-stack.fullname" . }}-celery-worker-docprocessing
  pollingInterval: {{ .Values.keda.celeryWorkers.docprocessing.pollingInterval | default 30 }}
  cooldownPeriod: {{ .Values.keda.celeryWorkers.docprocessing.cooldownPeriod | default 300 }}
  minReplicaCount: {{ .Values.keda.celeryWorkers.docprocessing.minReplicas | default 1 }}
  maxReplicaCount: {{ .Values.keda.celeryWorkers.docprocessing.maxReplicas | default 50 }}
  triggers:
    # Default Prometheus-based trigger for Redis queue depth
    # Scaling Logic:
    # - When queue depth > 20: Scale up by factor of 4 (very aggressive scaling)
    # - When queue depth <= 20: Scale down by factor of 0.25 (very conservative scaling)
    # - Threshold of 1 ensures scaling triggers when metric value > 1
    - type: prometheus
      metadata:
        serverAddress: "http://prometheus-redis.monitoring.svc.cluster.local:9090"
        metricName: "redis_key_size_sum"
        metricType: "Value"
        threshold: "1"
        query: |
          # Simplified scaling logic for docprocessing workers
          # Returns 4 when queue depth > 20, 0.25 when <= 20
          # This creates a clear scaling decision boundary for high-volume processing
          (
            (sum(redis_key_size{key=~"connector_docprocessing.*"}) > 20)
            * 4
          )
          +
          (
            (sum(redis_key_size{key=~"connector_docprocessing.*"}) <= 20)
            * 0.25
          )
{{- end }}

Review comment:
- PromQL comparison lacks `bool`, returning the metric value instead of 1/0; this makes the query scale by 4 × queue depth rather than a constant factor.
Model-server ScaledObject template:

@@ -0,0 +1,29 @@
{{- if and .Values.keda.enabled .Values.keda.modelServers.enabled }}
{{- range $serverType, $serverConfig := .Values.keda.modelServers }}
{{- if and (ne $serverType "enabled") $serverConfig.enabled }}
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: {{ include "onyx-stack.fullname" $ }}-{{ $serverType }}-model-server-scaledobject
  namespace: {{ $.Release.Namespace }}
  labels:
    {{- include "onyx-stack.labels" $ | nindent 4 }}
    app: {{ $serverType }}-model-server
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ include "onyx-stack.fullname" $ }}-{{ $serverType }}-model
  pollingInterval: {{ $serverConfig.pollingInterval | default 30 }}
  cooldownPeriod: {{ $serverConfig.cooldownPeriod | default 300 }}
  minReplicaCount: {{ $serverConfig.minReplicas | default 1 }}
  maxReplicaCount: {{ $serverConfig.maxReplicas | default 5 }}
  triggers:
    - type: cpu
      metadata:
        type: Utilization
        value: {{ $serverConfig.cpuThreshold | default "70" | quote }}
{{- end }}
{{- end }}
{{- end }}
Slackbot ScaledObject template:

@@ -0,0 +1,34 @@
{{- if and .Values.keda.enabled .Values.keda.slackbot .Values.keda.slackbot.enabled }}
# Note: This KEDA ScaledObject works alongside existing HPA using KEDA's HPA mode
# KEDA generates an HPA that can coexist with traditional HPA infrastructure
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: {{ include "onyx-stack.fullname" . }}-slackbot-scaledobject
  namespace: {{ .Release.Namespace }}
  labels:
    {{- include "onyx-stack.labels" . | nindent 4 }}
    app: slackbot
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ include "onyx-stack.fullname" . }}-slackbot
  pollingInterval: {{ .Values.keda.slackbot.pollingInterval | default 30 }}
  cooldownPeriod: {{ .Values.keda.slackbot.cooldownPeriod | default 300 }}
  minReplicaCount: {{ .Values.keda.slackbot.minReplicas | default 1 }}
  maxReplicaCount: {{ .Values.keda.slackbot.maxReplicas | default 3 }}
  # Use HPA mode to generate an HPA that works alongside existing HPA infrastructure
  hpaName: {{ include "onyx-stack.fullname" . }}-slackbot-keda-hpa
  triggers:
    - type: cpu
      metadata:
        type: Utilization
        value: {{ .Values.keda.slackbot.cpuThreshold | default "70" | quote }}
    {{- if .Values.keda.slackbot.memoryThreshold }}
    - type: memory
      metadata:
        type: Utilization
        value: {{ .Values.keda.slackbot.memoryThreshold | quote }}
    {{- end }}
{{- end }}

Review comment:
- logic: Deployment name inconsistency: the Deployment this `scaleTargetRef` targets may not match the name of the slackbot Deployment the chart actually creates.
Web-server ScaledObject template:

@@ -0,0 +1,32 @@
{{- if and .Values.keda.enabled .Values.keda.webServer .Values.keda.webServer.enabled }}
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: {{ include "onyx-stack.fullname" . }}-web-server-scaledobject
  namespace: {{ .Release.Namespace }}
  labels:
    {{- include "onyx-stack.labels" . | nindent 4 }}
    app: web-server
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ include "onyx-stack.fullname" . }}
  pollingInterval: {{ .Values.keda.webServer.pollingInterval | default 30 }}
  cooldownPeriod: {{ .Values.keda.webServer.cooldownPeriod | default 300 }}
  minReplicaCount: {{ .Values.keda.webServer.minReplicas | default 1 }}
  maxReplicaCount: {{ .Values.keda.webServer.maxReplicas | default 5 }}
  # Use HPA mode to generate an HPA that works alongside existing HPA infrastructure
  hpaName: {{ include "onyx-stack.fullname" . }}-web-server-keda-hpa
  triggers:
    - type: cpu
      metadata:
        type: Utilization
        value: {{ .Values.keda.webServer.cpuThreshold | default "70" | quote }}
    {{- if .Values.keda.webServer.memoryThreshold }}
    - type: memory
      metadata:
        type: Utilization
        value: {{ .Values.keda.webServer.memoryThreshold | quote }}
    {{- end }}
{{- end }}

Review comments:
- logic: Risk of conflict: both the KEDA ScaledObject and the existing HPA (`webserver.autoscaling.enabled`) can target the same Deployment simultaneously, causing scaling conflicts. Consider adding mutual-exclusion logic.
- Guard the ScaledObject so it does not render when the webserver HPA is enabled, to prevent conflicting autoscalers on the same Deployment.
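One hedged way to implement the mutual-exclusion guard the reviewers suggest, assuming the existing HPA toggle lives at `webserver.autoscaling.enabled` as the comment indicates (nil-safe so the template still renders when that section is absent):

```yaml
{{- /* Render only when KEDA is enabled for the web server AND the legacy HPA is off */}}
{{- if and .Values.keda.enabled
           .Values.keda.webServer
           .Values.keda.webServer.enabled
           (not (and .Values.webserver .Values.webserver.autoscaling .Values.webserver.autoscaling.enabled)) }}
```

The nested `and` checks each level of `webserver.autoscaling` before reading `enabled`, so a chart installed without that values section does not fail with a nil-pointer evaluation error.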
Review comment:
- logic: The triple conditional check will fail if any of the nested values don't exist. Consider adding the corresponding KEDA configuration section to values.yaml to prevent template rendering failures.
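A sketch of the values.yaml section this comment asks for, so every nested lookup in the templates resolves; the key names mirror the paths used above, and the defaults chosen here (everything off) are an assumption:

```yaml
keda:
  enabled: false
  apiServer:
    enabled: false
  webServer:
    enabled: false
  slackbot:
    enabled: false
  modelServers:
    enabled: false
  celeryWorkers:
    enabled: false
    docfetching:
      enabled: false
    docprocessing:
      enabled: false
```

With these keys always present, checks like `and .Values.keda.enabled .Values.keda.apiServer .Values.keda.apiServer.enabled` evaluate to false instead of erroring on a nil intermediate map.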