-
Notifications
You must be signed in to change notification settings - Fork 1.9k
feat: playwright test speed improvement #5388
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The latest updates on your projects. Learn more about Vercel for GitHub.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greptile Summary
This PR refactors the Playwright test CI workflow to significantly improve performance through a new architecture that prebuilds Docker images in separate jobs and reuses them via ECR. The key changes include:
Architecture Changes:
- Parallel Image Building: Three new jobs (
build-web-image
,build-backend-image
,build-model-server-image
) build Docker images in parallel and push them to ECR instead of building locally during test execution - Image Reuse Strategy: The main test job (
playwright-tests
) now pulls pre-built images from ECR in parallel using background processes, then retags them for docker-compose compatibility - Runner Optimization: Switches from generic GitHub runners to ARM64-specific
blacksmith-8vcpu-ubuntu-2404-arm
runners for all jobs
Removed Dependencies:
- Eliminates local Docker image building during test execution
- Removes Docker Hub authentication and S3 cache usage
- Removes Python dependency installation and setup steps
- Removes Docker Buildx setup from the test job
ECR Integration:
The workflow now uses AWS ECR as a central image registry, with each build job authenticating to ECR using AWS_ACCESS_KEY_ID
and AWS_SECRET_ACCESS_KEY
secrets, then pushing images with commit SHA tags for uniqueness.
This approach follows the CI optimization pattern of "build once, use many times" - expensive Docker image building happens in parallel dedicated jobs, while the test execution job focuses solely on running tests with pre-built artifacts. The parallel image pulling strategy should further reduce test startup time by maximizing network throughput.
Confidence score: 2/5
- This PR has significant risks due to missing error handling, hardcoded secrets, and potential race conditions in the new ECR-based architecture
- Score reflects complex changes to critical CI infrastructure with multiple failure points and insufficient error handling for ECR operations
- Pay close attention to ECR authentication, image tagging consistency, and the parallel image pulling implementation
1 file reviewed, 1 comment
- name: Pull Docker images | ||
run: | | ||
# Pull all images from ECR in parallel | ||
echo "Pulling Docker images in parallel..." | ||
(docker pull ${{ env.ECR_REGISTRY }}/integration-test-onyx-web-server:playwright-test-${{ github.run_id }}) & | ||
(docker pull ${{ env.ECR_REGISTRY }}/integration-test-onyx-backend:playwright-test-${{ github.run_id }}) & | ||
(docker pull ${{ env.ECR_REGISTRY }}/integration-test-onyx-model-server:playwright-test-${{ github.run_id }}) & | ||
# Wait for all background jobs to complete | ||
wait |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: Background processes in shell scripts can fail silently. Consider adding error checking after the wait
command to ensure all pulls completed successfully.
10 Jobs Failed: Run MIT Integration Tests v2 / integration-tests-mit (connector_job_tests/slack, connector-slack)Step "Start Docker containers" from job "integration-tests-mit (connector_job_tests/slack, connector-slack)" is failing. The last 20 log lines are:
Run MIT Integration Tests v2 / integration-tests-mit (tests/image_indexing, tests-image_indexing)Step "Start Mock Services" from job "integration-tests-mit (tests/image_indexing, tests-image_indexing)" is failing. The last 20 log lines are:
Run MIT Integration Tests v2 / integration-tests-mit (tests/index_attempt, tests-index_attempt)Step "Start Docker containers" from job "integration-tests-mit (tests/index_attempt, tests-index_attempt)" is failing. The last 20 log lines are:
Run MIT Integration Tests v2 / integration-tests-mit (tests/indexing, tests-indexing)Step "Start Docker containers" from job "integration-tests-mit (tests/indexing, tests-indexing)" is failing. The last 20 log lines are:
Run MIT Integration Tests v2 / integration-tests-mit (tests/kg, tests-kg)Step "Start Docker containers" from job "integration-tests-mit (tests/kg, tests-kg)" is failing. The last 20 log lines are:
Run MIT Integration Tests v2 / integration-tests-mit (tests/llm_provider, tests-llm_provider)Step "Start Docker containers" from job "integration-tests-mit (tests/llm_provider, tests-llm_provider)" is failing. The last 20 log lines are:
Run MIT Integration Tests v2 / integration-tests-mit (tests/migrations, tests-migrations)Step "Start Docker containers" from job "integration-tests-mit (tests/migrations, tests-migrations)" is failing. The last 20 log lines are:
Run MIT Integration Tests v2 / integration-tests-mit (tests/permissions, tests-permissions)Step "Start Docker containers" from job "integration-tests-mit (tests/permissions, tests-permissions)" is failing. The last 20 log lines are:
Run MIT Integration Tests v2 / integration-tests-mit (tests/personas, tests-personas)Step "Start Docker containers" from job "integration-tests-mit (tests/personas, tests-personas)" is failing. The last 20 log lines are:
1 job failed running on non-Blacksmith runners. Summary: 8 successful workflows, 2 failed workflows
Last updated: 2025-09-11 01:47:44 UTC |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No issues found across 1 file
Description
[Provide a brief description of the changes in this PR]
How Has This Been Tested?
[Describe the tests you ran to verify your changes]
Backporting (check the box to trigger backport action)
Note: You have to check that the action passes, otherwise resolve the conflicts manually and tag the patches.
Summary by cubic
Prebuilds Docker images for Playwright tests in separate jobs and reuses them via ECR to speed up CI. The test job now pulls images in parallel instead of building locally.