Skip to content

Fix: Robot Loading Unit Test Fail [AARD-2003] #1222

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 8 commits into
base: dev
Choose a base branch
from

Conversation

AlexD717
Copy link
Member

@AlexD717 AlexD717 commented Jul 17, 2025

Task

Fix a Unit Tests that was failing for previously unknown reasons.

AARD-2003

Symptom

When running unit tests for the first time (robots not cached) this error occurs
Screenshot 2025-07-17 at 10 13 02 AM

Note: This error only occurs in the Firefox specific environment, and rerunning the tests causes it to pass.

Solution

The error was caused by a race condition where multiple test files where trying to fetch the same robot file. To fix this support for multiple requests was added to MirabufLoader. Also, if a robot/field fails to cache it still adds the robot/field file to the local memory, allowing the user to spawn and use the robot (however refreshing the browser would cause this robot to disappear).

Verification

  • Unit tests pass

Before merging, ensure the following criteria are met:

  • All acceptance criteria outlined in the ticket are met.
  • Necessary test cases have been added and updated.
  • A feature toggle or safe disable path has been added (if applicable).
  • User-facing polish:
    • Ask: "Is this ready-looking?"
  • Cross-linking between Jira and GitHub:
    • PR links to the relevant Jira issue.
    • Jira ticket has a comment referencing this PR.

@AlexD717 AlexD717 self-assigned this Jul 17, 2025
@AlexD717 AlexD717 requested review from a team as code owners July 17, 2025 17:27
@AlexD717 AlexD717 added the hotfix Pull requests that should be resolved before the next snapshot/release label Jul 17, 2025
@AlexD717 AlexD717 changed the title Hotfix: Robot Loading Unit Test Fail Hotfix: Robot Loading Unit Test Fail [AARD-2003] Jul 17, 2025
@AlexD717 AlexD717 marked this pull request as draft July 17, 2025 17:31
@AlexD717 AlexD717 marked this pull request as ready for review July 17, 2025 17:59
@AlexD717
Copy link
Member Author

Note: Testing Synthesis in Firefox shows that multiple console warning appear, but the program works overall.
Screenshot 2025-07-17 at 11 00 52 AM
(These errors appear both in the dev and production version)
We might want to consider looking further into if we want to support Firefox at all and other errors that can appear there.

@rutmanz
Copy link
Member

rutmanz commented Jul 17, 2025

Given that the tests are passing again on dev other branches, the issue isn't fully reproducible. How do you know that this is the cause of the issue?

@ryanzhangofficial
Copy link
Member

I was receiving the same errors when running unit tests for the Mirabuf Testing ticket. If you run the tests a second time, weirdly the errors do not show up anymore. I second the question on root cause and reproducibility.

@AlexD717 AlexD717 marked this pull request as draft July 17, 2025 18:27
@AlexD717
Copy link
Member Author

After implementing all of these changes, I have rerun the tests multiple times and they have passed every time. I don't really know any better way to verify that this works.

@AlexD717 AlexD717 marked this pull request as ready for review July 17, 2025 21:44
@BrandonPacewic
Copy link
Member

Given that the tests are passing again on dev other branches, the issue isn't fully reproducible. How do you know that this is the cause of the issue?

@rutmanz They only pass after re running the workflow. This is a bit of a weird issue. If the action passes after the first run on a fresh commit then this fix works.

@rutmanz
Copy link
Member

rutmanz commented Jul 18, 2025

@BrandonPacewic No state is shared between workflow attempts (or no more than is shared between first attempts of consecutive commits) as far as I'm aware

@AlexD717
Copy link
Member Author

No state is shared between workflow attempts (or no more than is shared between first attempts of consecutive commits) as far as I'm aware

I might be wrong, but I think that the robot/field elements might be getting cached and persist across test runs which results in the second rerun succeeding.

@Dhruv-0-Arora
Copy link
Collaborator

Is this PR stale after #1225

@PepperLola
Copy link
Member

Is this PR stale after #1225

I don't think so? It solves a race condition when multiple tests try to access the same asset, which would still be present regardless of where it's loading from. It's possible that the issue will be less common now that we're not downloading the assets from such a slow server (and thus importing and caching will take less time) but I imagine the issue would still be there.

@AlexD717 AlexD717 changed the title Hotfix: Robot Loading Unit Test Fail [AARD-2003] Robot Loading Unit Test Fail [AARD-2003] Jul 18, 2025
@AlexD717 AlexD717 changed the title Robot Loading Unit Test Fail [AARD-2003] Fix: Robot Loading Unit Test Fail [AARD-2003] Jul 18, 2025
@rutmanz
Copy link
Member

rutmanz commented Jul 21, 2025

I guess because of the reproducibility issues, I'm concerned that this isn't a/the problem. Are you able to induce the issue that this solved with your own script that loads them in parallel?

@AlexD717
Copy link
Member Author

I guess because of the reproducibility issues, I'm concerned that this isn't a/the problem. Are you able to induce the issue that this solved with your own script that loads them in parallel?

I did verify that spawning robots in parallel after my changes works, and I have rerun the tests multiple times with the tests passing each time.

@rutmanz
Copy link
Member

rutmanz commented Jul 21, 2025

I did verify that spawning robots in parallel after my changes works, and I have rerun the tests multiple times with the tests passing each time.

@AlexD717 Are you able to verify that spawning robots in parallel without your changes doesn't work? (i.e. it isn't something else that was causing the issue)

@AlexD717
Copy link
Member Author

I did verify that spawning robots in parallel after my changes works, and I have rerun the tests multiple times with the tests passing each time.

@AlexD717 Are you able to verify that spawning robots in parallel without your changes doesn't work? (i.e. it isn't something else that was causing the issue)

I don't think I can double check right now as the server to download robot files is currently down. However, unit tests were failing, I implemented these changes, unit tests started passing. This was before I merged in dev to use git lfs instead of downloading the files.

@rutmanz
Copy link
Member

rutmanz commented Jul 21, 2025

But unit tests started passing everywhere at around the same time and this doesn't seem like the obvious cause of/solution to that issue to me. You should still be able to induce the race condition (if that's what actually caused that test to fail) by mocking fetch and making requests take an extra 3s (or whatever is required), then loading two files simultaneously in a throwaway test.ts file

@AlexD717
Copy link
Member Author

But unit tests started passing everywhere at around the same time and this doesn't seem like the obvious cause of/solution to that issue to me. You should still be able to induce the race condition (if that's what actually caused that test to fail) by mocking fetch and making requests take an extra 3s (or whatever is required), then loading two files simultaneously in a throwaway test.ts file

The unit tests started passing because we started re-running them. Also after you updated it to use git lfs all unit tests started passing on the first attempt. I made sure that the unit tests passed on the first run and on all subsequent re-runs in this branch. I also only merged in dev with the git lfs change after I got the unit tests to properly run on this branch.

I can look into making a throwaway test.ts file like you suggested, but I am unsure how useful that would be because of the things I mentioned above and since that would require me to run the tests in the online GitHub environment since that was the only place they were happening.

@AlexD717 AlexD717 marked this pull request as draft July 21, 2025 23:11
@Dhruv-0-Arora
Copy link
Collaborator

Note

The bug this PR was supposed to fix was resolved by #1225 . Since we are not sure of the purpose/need of this PR, we will revisit it at the end of the summer or if the issue resurfaces.

Copy link
Member

@BrandonPacewic BrandonPacewic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This issue just resurfaced after the merge of #1245.
image

It was resolved by re-running the unit tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotfix Pull requests that should be resolved before the next snapshot/release no-merge testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants