fix(amazonq): avoid workspace index process failure #5595

Merged: 2 commits, Apr 17, 2025

Conversation

@leigaol (Contributor) commented Apr 17, 2025

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)

Description

An orphaned HTTP request that should have been rejected entered the HTTP request event loop while the FAISS index was not ready. This terminated the workspace LSP process, which caused the JetBrains IDE to re-initialize it, which in turn triggered an endless log storm that caused slowness (the log loop issue is fixed in #5581).

Here is the sequence of events:

  1. JB starts the workspace LSP, which then runs tree-sitter parsing to generate the repomap.
  2. While step 1 is in progress, the client (user) uses the @workspace feature, which sends a vector index query request. Step 1 is usually fast, but for a large repo such as https://github.com/elastic/elasticsearch (1.4GB) it takes 6 min.
  3. The Node.js event loop is busy, so the client request from step 2 times out. However, the request remains cached on the server and becomes an orphaned HTTP request.
  4. As soon as tree-sitter parsing finishes, the Node.js event loop sometimes (for reasons not yet understood) immediately handles the orphaned request from step 2.
  5. At that point the vector index is not undefined: it is partially initialized but contains no chunks. Querying with 0 chunks causes Faiss to crash, which terminates the LSP process.
  6. JB sees java.net.ConnectException: Connection refused and forces the LSP to restart, which restarts indexing and causes the performance issue.
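
To illustrate the failure in step 5, here is a minimal sketch of a guard that rejects a query unless the index is fully ready and non-empty. It is not the code from this PR: `WorkspaceIndexService`, `VectorIndex`, `markReady`, `attachPartialIndex`, and the result shape are all hypothetical names chosen for the example, and the real FAISS binding is replaced by a plain interface.

```typescript
// Hypothetical shapes for illustration; none of these names come from the actual change.
interface VectorIndex {
    chunkCount: number
    search(query: string, topK: number): string[]
}

type QueryResult =
    | { ok: true; hits: string[] }
    | { ok: false; reason: 'index-not-ready' | 'index-empty' }

class WorkspaceIndexService {
    private index: VectorIndex | undefined
    private ready = false

    // Models the partially initialized state from step 5:
    // the index object exists, but indexing has not finished.
    attachPartialIndex(index: VectorIndex): void {
        this.index = index
    }

    // Assumed lifecycle hook, called once indexing has fully completed.
    markReady(index: VectorIndex): void {
        this.index = index
        this.ready = true
    }

    query(text: string, topK: number): QueryResult {
        // Checking only for `undefined` is not enough: a partially
        // initialized index would pass that check.
        if (!this.ready || this.index === undefined) {
            return { ok: false, reason: 'index-not-ready' }
        }
        // Never hand a query to an index that holds zero chunks.
        if (this.index.chunkCount === 0) {
            return { ok: false, reason: 'index-empty' }
        }
        return { ok: true, hits: this.index.search(text, topK) }
    }
}
```

With a guard of this kind, a late (orphaned) request that arrives before indexing completes gets a rejection result instead of reaching the native index, so the process stays alive and the IDE has no reason to restart it.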

Checklist

  • My code follows the code style of this project
  • I have added tests to cover my changes
  • A short description of the change has been added to the CHANGELOG if the change is customer-facing in the IDE.
  • I have added metrics for my changes (if required)

License

I confirm that my contribution is made under the terms of the Apache 2.0 license.

@leigaol leigaol requested a review from a team as a code owner April 17, 2025 15:38
@rli rli merged commit 32c2ef6 into aws:main Apr 17, 2025
4 checks passed