@justin-tahara justin-tahara commented Sep 9, 2025

Description

Introduces KEDA for all services. Historically we configured HPAs for the pods, but that simple setup can only scale on memory and CPU usage.

KEDA lets us define custom triggers, so we can autoscale more accurately and be better prepared for production use cases.

How Has This Been Tested?

Ran helm template . locally to validate that the charts render properly.

We added default values and removed or moved the templates that were disabled and no longer used.

Backporting (check the box to trigger backport action)

Note: You have to check that the action passes, otherwise resolve the conflicts manually and tag the patches.

  • This PR should be backported (make sure to check that the backport attempt succeeds)
  • [Optional] Override Linear Check

@justin-tahara justin-tahara requested a review from a team as a code owner September 9, 2025 02:33
vercel bot commented Sep 9, 2025

Project: internal-search | Deployment: Ready | Updated (UTC): Sep 16, 2025 8:53pm

greptile-apps[bot]

This comment was marked as outdated.

@justin-tahara

@greptileai

greptile-apps[bot]

This comment was marked as outdated.

@cubic-dev-ai cubic-dev-ai bot left a comment

11 issues found across 12 files

@justin-tahara justin-tahara requested a review from Weves September 10, 2025 00:47
@justin-tahara justin-tahara force-pushed the jtahara/helm-keda-migration branch from 1845437 to fa4de10 Compare September 16, 2025 20:49
@justin-tahara

@greptileai

@greptile-apps greptile-apps bot left a comment

Greptile Summary

This PR migrates all services in the Onyx application from the Kubernetes HPA (Horizontal Pod Autoscaler) to KEDA (Kubernetes Event-driven Autoscaling). The migration replaces basic CPU/memory-based HPA configurations with KEDA ScaledObjects that support custom triggers and more advanced scaling behaviors.

The changes include:

  • Moving existing HPA templates (webserver-hpa.yaml and api-hpa.yaml) to the templates_disabled/ directory as backup references
  • Creating new KEDA ScaledObject templates for all services: webserver, API, and seven Celery worker services (primary, heavy, monitoring, docprocessing, light, user-files-indexing, docfetching)
  • Adding KEDA-specific configuration options to values.yaml including pollingInterval (30s), cooldownPeriod (300s), idleReplicaCount (1), failureThreshold (3), fallbackReplicas (1), and a customTriggers array
  • Incrementing the Helm chart version from 0.2.11 to 0.2.12
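
As a rough sketch, the options listed above might appear in values.yaml for a single worker as follows. The nesting under `celery_worker_heavy.autoscaling` follows the key path used by the reviewed template; the two utilization targets are illustrative values, not figures taken from the PR, and the per-field comments paraphrase the general meaning of the corresponding KEDA settings.

```yaml
# values.yaml (sketch; only autoscaling-related keys are shown)
celery_worker_heavy:
  autoscaling:
    # HPA-era targets, reused by the KEDA cpu/memory triggers
    targetCPUUtilizationPercentage: 80     # assumed value
    targetMemoryUtilizationPercentage: 80  # assumed value
    # KEDA-specific options, with the defaults named in the summary
    pollingInterval: 30    # seconds between trigger checks
    cooldownPeriod: 300    # seconds to wait before scaling back down once triggers go inactive
    idleReplicaCount: 1    # replicas kept while no trigger is active
    failureThreshold: 3    # consecutive scaler failures before fallback applies
    fallbackReplicas: 1    # replica count used while in fallback
    customTriggers: []     # extra KEDA triggers, appended verbatim
```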

KEDA provides significant advantages over HPA by supporting custom scaling triggers beyond CPU/memory metrics, such as queue depth, external metrics, and event-driven scaling. The new ScaledObjects include enhanced configuration options like idle replica management, fallback mechanisms, and customizable polling/cooldown periods. The migration maintains backward compatibility by preserving existing HPA configuration fields while adding KEDA-specific parameters. The customTriggers array enables future configuration of advanced scaling scenarios without requiring template changes.
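
To illustrate the kind of queue-depth trigger the customTriggers array could carry, here is a minimal sketch using KEDA's Redis list scaler. It assumes the Celery broker is Redis and that the queue is stored as a Redis list; the address and list name are hypothetical placeholders, not values from this chart.

```yaml
# values.yaml override (sketch)
celery_worker_heavy:
  autoscaling:
    customTriggers:
      - type: redis
        metadata:
          address: redis.example.svc.cluster.local:6379  # hypothetical host:port
          listName: celery                               # hypothetical queue (Redis list) name
          listLength: "50"                               # target average backlog per replica
```

Because the template passes this array through `toYaml`, each entry has to be a complete KEDA trigger definition on its own.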

Confidence score: 2/5

  • This PR introduces significant breaking changes with critical configuration issues that will prevent proper autoscaling functionality
  • Score lowered due to missing required KEDA fields, inconsistent default values, deprecated configurations, and potential empty triggers sections across multiple ScaledObject templates
  • Pay close attention to all KEDA ScaledObject template files, particularly scaleTargetRef configurations and trigger definitions

13 files reviewed, 1 comment

Comment on lines +21 to +36
  triggers:
    {{- if .Values.celery_worker_heavy.autoscaling.targetCPUUtilizationPercentage }}
    - type: cpu
      metricType: Utilization
      metadata:
        value: "{{ .Values.celery_worker_heavy.autoscaling.targetCPUUtilizationPercentage }}"
    {{- end }}
    {{- if .Values.celery_worker_heavy.autoscaling.targetMemoryUtilizationPercentage }}
    - type: memory
      metricType: Utilization
      metadata:
        value: "{{ .Values.celery_worker_heavy.autoscaling.targetMemoryUtilizationPercentage }}"
    {{- end }}
    {{- if .Values.celery_worker_heavy.autoscaling.customTriggers }}
    {{- toYaml .Values.celery_worker_heavy.autoscaling.customTriggers | nindent 4 }}
    {{- end }}

logic: If neither the CPU nor the memory target is set and customTriggers is empty, the triggers section renders empty and the KEDA ScaledObject fails validation. Ensure at least one trigger is always configured, or add validation.
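
One possible way to add that validation, sketched here as an assumption rather than what the PR ended up doing, is to fail chart rendering when none of the three trigger sources is set. The snippet uses Helm's `fail` function and would sit near the top of the ScaledObject template, inside whatever condition enables autoscaling for the worker.

```yaml
{{- /* Sketch: abort rendering when the triggers list would come out empty. */ -}}
{{- $as := .Values.celery_worker_heavy.autoscaling }}
{{- if not (or $as.targetCPUUtilizationPercentage $as.targetMemoryUtilizationPercentage $as.customTriggers) }}
{{- fail "celery_worker_heavy.autoscaling: set a CPU target, a memory target, or at least one customTriggers entry" }}
{{- end }}
```

With this guard, `helm template .` stops with the message above instead of emitting a ScaledObject that has no triggers.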

@justin-tahara justin-tahara merged commit 495d4ca into main Sep 16, 2025
21 of 23 checks passed
@justin-tahara justin-tahara deleted the jtahara/helm-keda-migration branch September 16, 2025 20:56