feat(code_executors): Add GkeCodeExecutor for sandboxed code execution on GKE #1629

Open: 4 commits to merge into base branch main

syangx39

Summary

This PR introduces GkeCodeExecutor, a new code executor that runs LLM-generated code securely and at scale on GKE Sandbox, which uses gVisor for workload isolation. It is intended as a more robust alternative to local or standard containerized executors.

For each code execution request, it dynamically creates an ephemeral Kubernetes Job with a hardened Pod configuration, so that every execution runs in a clean, isolated environment.
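
To make the per-request Job more concrete, here is a minimal sketch of the two manifests such an executor might generate, written as plain dictionaries so it runs without a cluster. It is not the PR's implementation: the helper name `build_manifests`, the resource names, the UID, the limits, the TTL value, and the toleration key are all assumptions chosen for illustration.

```python
import uuid


def build_manifests(code: str, namespace: str = "default") -> tuple[dict, dict]:
    """Return (configmap, job) manifests for a single execution request."""
    # Hypothetical naming scheme: one random suffix per request.
    name = f"adk-exec-{uuid.uuid4().hex[:10]}"

    # The code travels in a ConfigMap and is mounted read-only into the Pod.
    configmap = {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name, "namespace": namespace},
        "data": {"code.py": code},
    }

    container = {
        "name": "code-runner",
        "image": "python:3.11-slim",
        "command": ["python3", "/app/code.py"],
        "volumeMounts": [{"name": "code", "mountPath": "/app", "readOnly": True}],
        # Hardened security context: non-root, no capabilities, read-only rootfs.
        "securityContext": {
            "runAsNonRoot": True,
            "runAsUser": 1001,  # assumed UID
            "allowPrivilegeEscalation": False,
            "readOnlyRootFilesystem": True,
            "capabilities": {"drop": ["ALL"]},
        },
        # Assumed limits; the PR describes them as configurable.
        "resources": {"limits": {"cpu": "500m", "memory": "512Mi"}},
    }

    job = {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "backoffLimit": 0,  # assumption: no Pod-level retries
            "ttlSecondsAfterFinished": 600,  # assumed TTL for garbage collection
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "runtimeClassName": "gvisor",
                    # Assumed taint key for gVisor-enabled node pools.
                    "tolerations": [{
                        "key": "sandbox.gke.io/runtime",
                        "operator": "Equal",
                        "value": "gvisor",
                        "effect": "NoSchedule",
                    }],
                    "containers": [container],
                    "volumes": [{"name": "code", "configMap": {"name": name}}],
                }
            },
        },
    }
    return configmap, job
```

Sharing one generated name between the ConfigMap and the Job keeps the two resources easy to correlate and clean up together; the real executor may name them differently.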

Key Features of GkeCodeExecutor

  • Dynamic Job Creation: Uses the Kubernetes batch/v1 API to create a new Job for each code snippet.
  • Secure Code Mounting: Injects code into the Pod via a temporary ConfigMap, which is mounted as a read-only file.
  • gVisor Sandboxing: Enforces execution within the gVisor runtime for kernel-level isolation.
  • Hardened Security Context: Pods run as non-root with all Linux capabilities dropped and a read-only root filesystem.
  • Resource Management: Applies configurable CPU and memory limits to prevent abuse.
  • Automatic Cleanup: Sets ttl_seconds_after_finished on each Job so that Kubernetes automatically garbage-collects completed Jobs and their Pods.
  • Node Scheduling: The executor adds Kubernetes tolerations to its Pod specification, which allows the Kubernetes scheduler to place the execution Pod on a pre-configured gVisor-enabled node.
  • Module Integration: GkeCodeExecutor is registered in code_executors/__init__.py, making it available for use by agents. The ImportError handling there covers the case where the required kubernetes SDK is not installed.
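
The module integration item describes a guarded import for an optional dependency. The PR's actual `__init__.py` is not shown here, so the following is only a generic sketch of that pattern; the helper `optional_import` and its parameters are hypothetical.

```python
import importlib
import logging

logger = logging.getLogger(__name__)


def optional_import(module_name: str, attr: str, hint: str):
    """Return module_name.attr, or None when the module cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Log quietly instead of failing, so the rest of the package still loads.
        logger.debug("%s is not available. %s", attr, hint)
        return None
    return getattr(module, attr)
```

A package could use the returned value to decide whether to export the executor, for example only adding the class name to `__all__` when the result is not `None`.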

Execution Flow:

  1. The agent invokes GkeCodeExecutor with the LLM-generated code.
  2. The GkeCodeExecutor's execute_code method creates a temporary ConfigMap holding the code, then creates a Kubernetes Job to run it.
  3. The Job runs a standard python:3.11-slim container and mounts the ConfigMap as /app/code.py. The image is pulled to the node once and cached.
  4. The GkeCodeExecutor monitors the Job until completion, fetches the container's stdout/stderr logs, returns a CodeExecutionResult to the LlmAgent, and ensures all temporary resources are deleted.
  5. The calling agent formats the result and provides a final response to the user. If the result contains an error, execution is retried up to error_retry_attempts times.
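
Steps 2 to 4 above can be sketched as a create, poll, collect, clean-up loop. To keep the example runnable without a cluster, it talks to an in-memory `FakeCluster` rather than the Kubernetes API; that class, the simplified `CodeExecutionResult`, the phase names, and the timeout handling are all assumptions, not the PR's code.

```python
import time
from dataclasses import dataclass, field


@dataclass
class CodeExecutionResult:
    """Simplified stand-in for the ADK result type (assumption)."""
    stdout: str = ""
    stderr: str = ""


@dataclass
class FakeCluster:
    """Hypothetical in-memory stand-in for the Kubernetes API."""
    polls_until_done: int = 2
    resources: set = field(default_factory=set)
    _polls: int = 0

    def create(self, kind: str, name: str) -> None:
        self.resources.add((kind, name))

    def delete(self, kind: str, name: str) -> None:
        self.resources.discard((kind, name))

    def job_phase(self, name: str) -> str:
        self._polls += 1
        return "Succeeded" if self._polls >= self.polls_until_done else "Running"

    def logs(self, name: str) -> str:
        return "hello\n"


def execute_code(cluster, name: str, timeout_s: float = 5.0,
                 poll_s: float = 0.01) -> CodeExecutionResult:
    """Create the ConfigMap and Job, wait for completion, return the logs."""
    cluster.create("ConfigMap", name)
    try:
        cluster.create("Job", name)
        deadline = time.monotonic() + timeout_s
        while time.monotonic() < deadline:
            phase = cluster.job_phase(name)
            if phase == "Succeeded":
                return CodeExecutionResult(stdout=cluster.logs(name))
            if phase == "Failed":
                return CodeExecutionResult(stderr=cluster.logs(name))
            time.sleep(poll_s)
        return CodeExecutionResult(stderr=f"Job {name} timed out after {timeout_s}s")
    finally:
        # Assumption: the ConfigMap is removed explicitly, while the Job is
        # left for TTL-based garbage collection.
        cluster.delete("ConfigMap", name)
```

Putting the ConfigMap deletion in a `finally` block means it is removed on success, failure, and timeout alike, which is one way to meet the "all temporary resources are deleted" requirement in step 4.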
