tilt ci --output-snapshot-on-exit can race and capture the snapshot before all state is reconciled #6553

Open
dnephin opened this issue May 15, 2025 · 1 comment
Labels
bug Something isn't working

Comments

@dnephin
Contributor

dnephin commented May 15, 2025

Expected Behavior

I expect the snapshot saved by --output-snapshot-on-exit to contain all the relevant logs and status for the failure.

Current Behavior

In roughly 1 in 20 failures, we've noticed that the command fails but the snapshot doesn't reflect that failure. Two concrete examples:

Example 1 - The tilt ci run fails with the error Error: Custom build "custom-build-cmd" failed: exit status 1. When I open the snapshot, none of the tiles on the left are marked as failed. If I look through every one, I eventually find the build failure. Its status is:

    "runtimeStatus": "pending",
    "updateStatus": "in_progress",

The logs do show the failure.
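
To find the affected resource without clicking through every tile, something like the following could scan the snapshot file. This is only a rough sketch: it assumes the snapshot is a JSON document and walks it recursively, because I haven't relied on any particular layout. The only field names and values it uses are the ones quoted above ("runtimeStatus", "updateStatus", "pending", "in_progress"); treating those values as "not settled" is my assumption, and the function names are hypothetical.

```python
import json
import sys
from typing import Any, Iterator, List, Tuple

# Status values quoted in this issue. Treating them as "not settled"
# is an assumption, not something taken from Tilt's documentation.
UNSETTLED_UPDATE = {"in_progress"}
UNSETTLED_RUNTIME = {"pending"}


def iter_status_objects(node: Any, path: str = "$") -> Iterator[Tuple[str, dict]]:
    """Yield (json_path, obj) for every dict carrying a status field."""
    if isinstance(node, dict):
        if "runtimeStatus" in node or "updateStatus" in node:
            yield path, node
        for key, value in node.items():
            yield from iter_status_objects(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from iter_status_objects(value, f"{path}[{index}]")


def find_unsettled(snapshot: Any) -> List[Tuple[str, Any, Any]]:
    """Return (path, runtimeStatus, updateStatus) for objects that look unsettled."""
    hits = []
    for path, obj in iter_status_objects(snapshot):
        runtime = obj.get("runtimeStatus")
        update = obj.get("updateStatus")
        if update in UNSETTLED_UPDATE or runtime in UNSETTLED_RUNTIME:
            hits.append((path, runtime, update))
    return hits


def main(snapshot_path: str) -> None:
    # Load the snapshot file and print one line per suspicious object.
    with open(snapshot_path, encoding="utf-8") as f:
        snapshot = json.load(f)
    for path, runtime, update in find_unsettled(snapshot):
        print(f"{path}: runtimeStatus={runtime} updateStatus={update}")


if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1])
```

Running it with the snapshot path as the only argument after a failed run prints the JSON path of each object that still reports an unsettled status, which is where the stale state shows up in Example 1.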

Example 2 - The tilt ci run fails with Error: exceeded grace period: Pod "some-test-gpv77" failed. This time the runtimeStatus is correctly "error", but the logs are incomplete. They aren't truncated by the log buffer: the final logs containing the error message are what's missing, not the earlier ones.

Steps to Reproduce

Other than running a very large number of tilt ci runs on a CI worker, I'm not sure how to reliably reproduce this. I assume it's a race condition where the shutdown happens too early, before all the necessary events have been reconciled.
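
To make the suspected race concrete, here is a toy model of it. This is not Tilt's code and says nothing about how Tilt is actually structured: the event queue, the reconcile worker, and the `run` function are all hypothetical, and the "error" status is just a placeholder value. It only shows how a snapshot taken at exit can miss a final status change and the last log lines if pending events aren't drained first.

```python
import queue
import threading


def run(drain_before_snapshot: bool) -> dict:
    """Toy model: a worker applies events to state; a snapshot is taken at exit."""
    events: queue.Queue = queue.Queue()
    state = {"updateStatus": "in_progress", "logs": []}
    # Holds the worker back so the "snapshot too early" case is deterministic.
    release = threading.Event()

    def reconcile() -> None:
        while True:
            event = events.get()
            if event is None:  # sentinel: no more events
                events.task_done()
                return
            release.wait()
            state.update(event.get("status", {}))
            state["logs"].extend(event.get("logs", []))
            events.task_done()

    worker = threading.Thread(target=reconcile, daemon=True)
    worker.start()

    # The failure arrives as an event that has not been reconciled yet.
    events.put({
        "status": {"updateStatus": "error"},  # placeholder value
        "logs": ["build failed: exit status 1"],
    })
    events.put(None)

    if drain_before_snapshot:
        release.set()
        events.join()  # wait until every queued event has been applied

    snapshot = {"updateStatus": state["updateStatus"], "logs": list(state["logs"])}

    release.set()  # let the worker finish in the racy case too
    worker.join()
    return snapshot
```

In this model `run(False)` returns a snapshot that is still "in_progress" with no logs, while `run(True)` includes the failure, which matches the symptoms in both examples if something similar is happening at shutdown.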

Context

Observed on v0.33.21; I'm not sure when it started. I'll be upgrading to the latest version now, but I assume the behavior hasn't changed since.

About Your Use Case

We use tilt ci in CI to run an environment for end-to-end testing.

@dnephin dnephin added the bug Something isn't working label May 15, 2025
@dnephin dnephin changed the title tilt ci --output-snapshot-on-exit captures the snapshot before all state is reconciled tilt ci --output-snapshot-on-exit can race and capture the snapshot before all state is reconciled May 15, 2025
@dnephin
Contributor Author

dnephin commented May 15, 2025

I'd be happy to submit a patch for this if you can point me in the right direction (specific files or packages to look at). I'm also happy to run a pre-release build to see if we can reproduce the issue with a patch applied.
