Skip to content

Conversation

Weves
Copy link
Contributor

@Weves Weves commented Jul 13, 2025

Description

[Provide a brief description of the changes in this PR]

How Has This Been Tested?

[Describe the tests you ran to verify your changes]

Backporting (check the box to trigger backport action)

Note: You have to check that the action passes, otherwise resolve the conflicts manually and tag the patches.

  • This PR should be backported (make sure to check that the backport attempt succeeds)
  • [Optional] Override Linear Check

@Weves Weves requested a review from a team as a code owner July 13, 2025 23:01
Copy link

vercel bot commented Jul 13, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
internal-search Ready Ready Preview Comment Sep 10, 2025 5:56pm

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Summary

This PR introduces parallelization to the Model Integration Tests (MIT) workflow by implementing a matrix strategy in GitHub Actions. The key changes include:

  1. Adding a new job discover-test-dirs that dynamically identifies test directories
  2. Implementing a matrix strategy to run tests in parallel
  3. Modifying the log artifact collection to be test-directory specific

These changes maintain the existing test functionality while potentially reducing the overall execution time significantly. The parallelization is implemented in a way that maintains isolation between test suites and improves debugging capabilities through directory-specific logging.

Confidence score: 4/5

  1. This PR is safe to merge with standard review attention
  2. The changes are well-structured, maintain existing functionality, and follow GitHub Actions best practices
  3. The .github/workflows/pr-mit-integration-tests.yml file needs careful review, particularly around the matrix strategy implementation and log collection modifications

1 file reviewed, 1 comment
Edit PR Review Bot Settings | Greptile

@Weves Weves force-pushed the parallelize-it-simple branch from cb29112 to e66e860 Compare August 28, 2025 06:25
@Weves Weves force-pushed the parallelize-it-simple branch from 666a7e9 to a05652e Compare August 28, 2025 22:25
@Weves Weves force-pushed the parallelize-it-simple branch from a05652e to 3c84914 Compare August 29, 2025 01:17
@Weves Weves force-pushed the parallelize-it-simple branch from 62baad4 to f317341 Compare September 1, 2025 02:31
@Weves Weves changed the title Test matrix feat: parallelized integration tests Sep 1, 2025
@justin-tahara
Copy link
Contributor

@greptileai

@justin-tahara justin-tahara self-requested a review September 9, 2025 23:36
Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Summary

This PR transforms the MIT integration test workflow from a sequential, monolithic approach to a parallelized matrix-based system. The key architectural changes include:

Dynamic Test Discovery: The workflow now automatically discovers test directories using find commands and builds a JSON matrix of test paths, eliminating the need to manually maintain test configurations.

Parallelized Build Process: The original single build step is split into separate parallel jobs (prepare-build, build-backend-image, build-model-server-image, build-integration-image) that can execute simultaneously, reducing overall build time.

Infrastructure Modernization: The workflow migrates from GitHub-hosted runners to Blacksmith runners and introduces AWS ECR for Docker image storage. Images are built once, pushed to ECR, then pulled by each matrix job for test execution.

Matrix-Based Test Execution: Tests now run in parallel across discovered test directories using GitHub Actions' matrix strategy, with each test directory executing on its own runner instance.

Enhanced Reliability: Retry logic is added using nick-fields/retry@v3 to handle test flakiness, and the workflow includes improved error handling and logging.

The custom Docker build action is also extended with new optional parameters (outputs, provenance, build-args) to support the more sophisticated build requirements of the parallelized system. This change integrates with the existing codebase by maintaining the same test execution patterns while dramatically improving performance through concurrent execution.

Confidence score: 4/5

  • This PR introduces significant architectural improvements but adds infrastructure complexity that requires careful validation
  • Score reflects well-structured parallelization approach but concerns about AWS dependencies and runner coordination
  • Pay close attention to the test discovery logic and matrix configuration in the workflow files

2 files reviewed, no comments

Edit Code Review Bot Settings | Greptile

Copy link
Contributor

@justin-tahara justin-tahara left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving some initial comments and questions. I think if it looks good we should merge and start testing the other tests as well so that we reduce CI time across the board. Thanks for doing this Chris!

# tag every docker image with "test" so that we can spin up the correct set
# of images during testing
build-backend-image:
runs-on: blacksmith-16vcpu-ubuntu-2404
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we saying we need a larger instance because of how much CPU cores are needed for building the image?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we used to use 32cpu-linux-x64, so this is smaller. Not sure if we actually need this large of an instance though 🤔


- name: Build Model Server Docker image
uses: ./.github/actions/custom-build-and-push
tags: ${{ env.ECR_REGISTRY }}/integration-test-onyx-backend:test-${{ github.run_id }}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this ECR Registry get updated with our backend images? Do we need to do this for all images and move them over to ECR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we also build them + push them in this action. For our public images, we can still use dockerhub.

It's nicer imo to just use one container repo across the board, but dockerhub seems to be throttling us for these high bandwidth use cases :(

Weves and others added 16 commits September 10, 2025 10:22
- Replace all GitHub runners with Blacksmith workers
- Update Docker build steps to use Blacksmith's native caching
- Replace docker/setup-buildx-action with useblacksmith/setup-docker-builder
- Replace custom-build-and-push with useblacksmith/build-push-action
- Remove explicit cache-from and cache-to directives (handled automatically by Blacksmith)

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
@Weves Weves force-pushed the parallelize-it-simple branch from 2d840be to e0aed1f Compare September 10, 2025 17:34
Copy link
Contributor

@justin-tahara justin-tahara left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing the comments!

@Weves Weves merged commit d1f7cee into main Sep 10, 2025
50 checks passed
@Weves Weves deleted the parallelize-it-simple branch September 10, 2025 19:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants