feat(code_executors): Add GkeCodeExecutor for sandboxed code execution on GKE #1629
Summary
This PR introduces `GkeCodeExecutor`, a new code executor that provides a secure and scalable way to run LLM-generated code on GKE Sandbox, which uses gVisor for workload isolation. It serves as a robust alternative to local or standard containerized executors. For each code execution request, it dynamically creates an ephemeral Kubernetes Job with a hardened Pod configuration, so that each execution runs in a clean, isolated environment.
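To make the per-request Job more concrete, here is a minimal sketch of what such a manifest could look like, written as a plain Python dict (the Kubernetes Python client accepts dict bodies as well as model objects). It is not taken from the PR: the function name, container name, user ID, TTL default, and the specific `securityContext` settings are assumptions chosen for illustration. The toleration key reflects the `sandbox.gke.io/runtime=gvisor` taint that GKE Sandbox node pools carry. In a dict manifest the TTL field is spelled `ttlSecondsAfterFinished`; `ttl_seconds_after_finished` is the Python client's attribute name for the same field.

```python
import json


def build_job_manifest(job_name: str, configmap_name: str,
                       image: str = "python:3.11-slim",
                       ttl_seconds: int = 600) -> dict:
    """Return a batch/v1 Job manifest that runs /app/code.py under gVisor."""
    container = {
        "name": "code-runner",  # hypothetical container name
        "image": image,
        "command": ["python3", "/app/code.py"],
        # The code arrives through a ConfigMap volume mounted read-only.
        "volumeMounts": [{"name": "code", "mountPath": "/app", "readOnly": True}],
        # Assumed hardening settings; the PR's exact values are not shown here.
        "securityContext": {
            "runAsNonRoot": True,
            "runAsUser": 1001,
            "allowPrivilegeEscalation": False,
            "capabilities": {"drop": ["ALL"]},
        },
    }
    pod_spec = {
        "restartPolicy": "Never",
        "runtimeClassName": "gvisor",  # request the gVisor sandbox runtime
        # Allow scheduling onto nodes tainted for GKE Sandbox workloads.
        "tolerations": [{
            "key": "sandbox.gke.io/runtime",
            "operator": "Equal",
            "value": "gvisor",
            "effect": "NoSchedule",
        }],
        "containers": [container],
        "volumes": [{
            "name": "code",
            "configMap": {
                "name": configmap_name,
                "items": [{"key": "code.py", "path": "code.py"}],
            },
        }],
    }
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": job_name},
        "spec": {
            # Let the TTL controller garbage-collect the finished Job and its Pods.
            "ttlSecondsAfterFinished": ttl_seconds,
            "backoffLimit": 0,  # assumed: do not retry failed snippets
            "template": {"spec": pod_spec},
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_job_manifest("exec-demo", "code-demo"), indent=2))
```

Building the manifest in a pure function keeps it easy to inspect and unit-test without a cluster.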
Key Features of GkeCodeExecutor
- Uses the `batch/v1` API to create a new Job for each code snippet.
- Passes the code to the Job through a `ConfigMap`, which is mounted as a read-only file.
- Runs the Pod under the `gvisor` runtime for kernel-level isolation.
- Relies on the `ttl_seconds_after_finished` feature on Jobs for robust, automatic garbage collection of completed Pods and Jobs.
- Sets `tolerations` in its Pod specification. This allows the k8s scheduler to place the execution Pod onto a pre-configured gVisor-enabled node.
- `GkeCodeExecutor` is registered in `code_executors/__init__.py`, making it available for use by agents. The `ImportError` handling is configured to check for the required `kubernetes` SDK.

Execution Flow:
1. The LlmAgent invokes `GkeCodeExecutor` with the LLM-generated code.
2. `GkeCodeExecutor` runs `execute_code`, which creates a temporary `ConfigMap` holding the code and then creates a k8s `Job` to run it.
3. The Job runs the code in a `python:3.11-slim` container. The image is pulled once to the node and cached. The Job mounts the ConfigMap as `/app/code.py`.
4. The executor collects the `stdout/stderr` logs from the container, returns a `CodeExecutionResult` to the LlmAgent, and ensures all temporary resources are deleted.
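
The flow above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: `run_in_sandbox`, the resource name prefixes, the polling loop, and the `(succeeded, logs)` return value (standing in for `CodeExecutionResult`) are all hypothetical, and the Job spec is kept minimal, so the gVisor and hardening fields are left out. The `core_api` and `batch_api` arguments are expected to behave like `kubernetes.client.CoreV1Api` and `kubernetes.client.BatchV1Api`; passing them in keeps the sketch runnable against test doubles.

```python
import time
import uuid


def run_in_sandbox(core_api, batch_api, code: str, namespace: str = "default",
                   timeout_seconds: float = 300, poll_seconds: float = 1.0):
    """Run `code` in an ephemeral Job and return (succeeded, logs)."""
    suffix = uuid.uuid4().hex[:10]
    cm_name, job_name = f"code-{suffix}", f"exec-{suffix}"  # hypothetical names
    config_map = {
        "apiVersion": "v1", "kind": "ConfigMap",
        "metadata": {"name": cm_name},
        "data": {"code.py": code},
    }
    job = {
        "apiVersion": "batch/v1", "kind": "Job",
        "metadata": {"name": job_name},
        "spec": {"backoffLimit": 0, "template": {"spec": {
            "restartPolicy": "Never",
            "containers": [{
                "name": "code-runner",
                "image": "python:3.11-slim",
                "command": ["python3", "/app/code.py"],
                "volumeMounts": [{"name": "code", "mountPath": "/app",
                                  "readOnly": True}],
            }],
            "volumes": [{"name": "code", "configMap": {"name": cm_name}}],
        }}},
    }
    cm_created = job_created = False
    try:
        # Step 2: create the ConfigMap holding the code, then the Job.
        core_api.create_namespaced_config_map(namespace=namespace, body=config_map)
        cm_created = True
        batch_api.create_namespaced_job(namespace=namespace, body=job)
        job_created = True

        # Step 3: poll until the Job reports success or failure.
        deadline = time.monotonic() + timeout_seconds
        while True:
            status = batch_api.read_namespaced_job_status(
                name=job_name, namespace=namespace).status
            if status.succeeded or status.failed:
                break
            if time.monotonic() > deadline:
                raise TimeoutError(f"Job {job_name} did not finish in time")
            time.sleep(poll_seconds)

        # Step 4: read the container logs from the Job's Pod.
        pods = core_api.list_namespaced_pod(
            namespace=namespace, label_selector=f"job-name={job_name}")
        logs = core_api.read_namespaced_pod_log(
            name=pods.items[0].metadata.name, namespace=namespace)
        return bool(status.succeeded), logs
    finally:
        # Delete only what was actually created, even if an earlier step failed.
        if job_created:
            batch_api.delete_namespaced_job(
                name=job_name, namespace=namespace,
                propagation_policy="Background")
        if cm_created:
            core_api.delete_namespaced_config_map(name=cm_name, namespace=namespace)
```

Putting the cleanup in a `finally` block is one way to make sure the temporary resources are removed on both success and failure. A watch on the Job could replace the polling loop.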