Description
Add configurable imagePullPolicy for Kubernetes Job Runner
Request Type
Feature Request
Work Environment
| Question | Answer |
|---|---|
| OS version (server) | Kubernetes-based deployment |
| Cortex version / git hash | Latest / main branch |
| Package Type | Docker, Kubernetes |
| Deployment Environment | Private/On-premise with Harbor registry |
Problem Description
When deploying Cortex analyzers in a Kubernetes environment with a private image registry (such as Harbor), the current implementation of K8sJobRunnerSrv.scala does not provide a way to configure the imagePullPolicy for the Kubernetes Jobs it creates.
This causes issues in the following scenarios:
- Private Registry Deployments: When new analyzer versions are pushed to a private registry under an existing tag (e.g., stable, or environment-specific tags like dev or staging), Kubernetes will not pull the updated image if the node already has an image with that tag cached.
- Development/Testing Environments: In rapidly iterating development environments, developers frequently push updated analyzer images with the same tag. Without the ability to set imagePullPolicy: Always, these updates are not picked up automatically.
- CI/CD Pipeline Integration: Modern CI/CD pipelines (e.g., GitLab CI with Harbor) often use consistent tagging strategies (like ${OWNER}-latest or build-${COMMIT_SHA}). The inability to force image pulls can lead to stale analyzer versions being executed.
Current Behavior
K8sJobRunnerSrv.scala creates Kubernetes Jobs without specifying an imagePullPolicy, so Kubernetes applies its own default. For any image tag other than latest (where Kubernetes defaults to Always), that default is:
- IfNotPresent: only pulls if the image doesn't exist locally
- This prevents automatic updates when new images are pushed to the registry with existing tags
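The defaulting rule behind this behavior can be sketched in a few lines. This is a minimal illustration of the rule as described in the Kubernetes image documentation, not code from Cortex or Kubernetes; `PullPolicyDefaults` and `defaultPullPolicy` are hypothetical names, and the image references in the usage note are made up.

```scala
// Sketch of the policy Kubernetes assigns when imagePullPolicy is omitted.
// PullPolicyDefaults is a hypothetical helper, not part of Cortex.
object PullPolicyDefaults {

  /** Returns the pull policy Kubernetes would default to for this image reference. */
  def defaultPullPolicy(image: String): String =
    if (image.contains("@")) {
      // A digest reference (name@sha256:...) pins the content, so the default is IfNotPresent.
      "IfNotPresent"
    } else {
      // Look for a tag only in the last path segment, so that a registry
      // port such as harbor.example.com:5000 is not mistaken for a tag.
      val lastSegment = image.substring(image.lastIndexOf('/') + 1)
      val colon       = lastSegment.lastIndexOf(':')
      val tag         = if (colon >= 0) Some(lastSegment.substring(colon + 1)) else None
      tag match {
        case None | Some("latest") => "Always"       // no tag or :latest
        case Some(_)               => "IfNotPresent" // any other explicit tag
      }
    }
}
```

Under this rule, a reference such as `harbor.example.com:5000/cortex/analyzer:dev` resolves to IfNotPresent, which is why a re-pushed dev tag is not picked up on nodes that already cache it.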
Desired Behavior
Add a configurable imagePullPolicy parameter that:
- Can be set via configuration file (application.conf or reference.conf)
- Defaults to IfNotPresent for backward compatibility
- Can be overridden to Always, IfNotPresent, or Never as needed
- Is applied to the Kubernetes Job container specification
- Can be configured via Helm chart values for Kubernetes deployments
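Because only three values are valid and Kubernetes treats them as case-sensitive, it may be worth validating the configured string before it reaches the Job spec. The following is a minimal sketch under that assumption; `ImagePullPolicy` and `resolve` are hypothetical names and are not part of the proposed change or of Cortex.

```scala
// Hypothetical validation helper: accepts only the three values the
// Kubernetes container spec allows and falls back to the documented default.
object ImagePullPolicy {
  val Allowed: Set[String] = Set("Always", "IfNotPresent", "Never")
  val Default: String      = "IfNotPresent"

  /** Right(policy) for a missing or valid value, Left(error message) otherwise. */
  def resolve(configured: Option[String]): Either[String, String] =
    configured.map(_.trim) match {
      case None                                   => Right(Default)
      case Some(value) if Allowed.contains(value) => Right(value)
      case Some(other) =>
        Left(s"Invalid imagePullPolicy '$other'; expected one of: ${Allowed.toSeq.sorted.mkString(", ")}")
    }
}
```

A caller could fail fast at startup on a `Left`, for example when the setting is misspelled as `always`, instead of letting the Kubernetes API reject every Job later.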
Proposed Solution
1. Modify K8sJobRunnerSrv.scala:
Add an imagePullPolicy parameter to the class constructor and apply it to the Kubernetes Job container spec:

    @Singleton
    class K8sJobRunnerSrv(
        client: DefaultKubernetesClient,
        jobBaseDirectory: Path,
        persistentVolumeClaimName: Option[String],
        imagePullPolicy: String, // Add this parameter
        implicit val system: ActorSystem
    ) {
      @Inject()
      def this(config: Configuration, system: ActorSystem) =
        this(
          new DefaultKubernetesClient(),
          Paths.get(config.get[String]("job.directory")),
          config.getOptional[String]("job.kubernetes.persistentVolumeClaimName"),
          config.getOptional[String]("job.kubernetes.imagePullPolicy").getOrElse("IfNotPresent"), // Add this line
          system: ActorSystem
        )

Apply the policy in the run method:

    .addNewContainer()
      .withName("neuron")
      .withImage(dockerImage)
      .withImagePullPolicy(imagePullPolicy) // Add this line
      .withArgs("/job")

2. Update conf/reference.conf:

    job {
      timeout = 30 minutes
      runners = [kubernetes, docker, process]
      directory = ${java.io.tmpdir}
      dockerDirectory = ${job.directory}
      keepJobFolder = false
      kubernetes {
        # Name of the PersistentVolumeClaim to use for job storage (required for k8s runner)
        # persistentVolumeClaimName = "cortex-jobs-pvc"
        # Image pull policy for Kubernetes jobs
        # Options: Always, IfNotPresent, Never
        # Default: IfNotPresent
        # Set to "Always" for private registries with frequently updated images
        imagePullPolicy = "IfNotPresent"
      }
    }

3. Helm Chart Integration (Optional):
For Kubernetes deployments, this can be exposed via Helm chart values.yaml:

    cortex:
      kubernetes:
        persistentVolumeClaimName: "cortex-jobs-pvc"
        imagePullPolicy: "Always" # or "IfNotPresent", "Never"

Benefits
- Private Registry Support: Enables proper operation with private registries (Harbor, ECR, ACR, GCR)
- Backward Compatible: Defaults to IfNotPresent, maintaining current behavior
- Flexible Deployment: Different policies can be used for dev/staging/production environments
- CI/CD Friendly: Supports modern continuous deployment workflows
- Industry Standard: Aligns with Kubernetes best practices and common patterns
Use Cases
- Development Environment: Set to Always to ensure the latest analyzer versions are always pulled
- Production Environment: Set to IfNotPresent to reduce registry load and improve startup time
- Air-Gapped/Offline: Set to Never to require pre-loaded images on all nodes
Implementation Status
This feature has been implemented in a fork and tested successfully with:
- Harbor private registry
- GitLab CI/CD pipeline
- LinCloud Kubernetes environment
Related Documentation
- Kubernetes imagePullPolicy: https://kubernetes.io/docs/concepts/containers/images/#image-pull-policy
- Harbor Integration: https://goharbor.io/docs/
- Fabric8 Kubernetes Client: https://github.com/fabric8io/kubernetes-client
Complementary Information
Configuration Example for Private Registry:

    job {
      kubernetes {
        persistentVolumeClaimName = "cortex-jobs-pvc"
        imagePullPolicy = "Always"
      }
    }

Environment Variable Override (assuming the configuration maps this variable to job.kubernetes.imagePullPolicy):

    JOB_KUBERNETES_IMAGEPULLPOLICY=Always

This enhancement would significantly improve Cortex's usability in enterprise and private cloud environments where private image registries are the norm.
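HOCON does not pick up an environment variable under an arbitrary name by itself. The override above therefore needs either a substitution in the configuration file (such as `${?JOB_KUBERNETES_IMAGEPULLPOLICY}`) or an explicit lookup. A minimal sketch of the explicit lookup follows, with the intended precedence of environment variable, then configuration value, then default. `PullPolicyLookup` and `lookup` are hypothetical names, not part of the proposal.

```scala
// Hypothetical lookup showing the intended precedence:
// environment variable, then configuration value, then the default.
object PullPolicyLookup {
  val EnvKey: String  = "JOB_KUBERNETES_IMAGEPULLPOLICY"
  val Default: String = "IfNotPresent"

  /** env is passed in (e.g. sys.env) so the function stays easy to test. */
  def lookup(env: Map[String, String], configValue: Option[String]): String =
    env
      .get(EnvKey)
      .filter(_.nonEmpty) // treat an empty variable as unset
      .orElse(configValue)
      .getOrElse(Default)
}
```

In the runner's constructor this could be called as `PullPolicyLookup.lookup(sys.env, config.getOptional[String]("job.kubernetes.imagePullPolicy"))`, assuming the Play `Configuration` instance shown in the proposed solution.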