[BUG] Eliminate False Positive Notifications in Manual Snapshot Policy #1371
Comments
@bowenlan-amzn Could you please review this and share your thoughts? I'm particularly interested in your perspective on the proposed changes to SMStateMachine.kt, as you were the original author of this file.
seqNo counts the indexing operations (index, update, delete) on a shard. If we were updating the metadata document from multiple threads, we could hit an out-of-order update problem, but I don't think we are using multiple threads.
Catch All Triage - 1 2 3
Closing, as the fix has been merged in #1413.
What is the bug?
When a manual snapshot policy runs, it creates and deletes snapshots according to the configured cron schedules. Each of these actions updates the policy's state in a system index (the .ism-config index). Due to a race condition, this state update can fail: if a snapshot deletion is in progress and a snapshot creation starts while holding a lock on the system index, the deletion cannot update the metadata in the system index when it completes.
index-management/src/main/kotlin/org/opensearch/indexmanagement/snapshotmanagement/SMRunner.kt
Lines 104 to 120 in eb6afa8
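The race described above can be illustrated with a small in-memory model. This is only a sketch under assumptions: it supposes that metadata writes are guarded by a sequence-number check (in the spirit of the seqNo discussion in the comments), and `MetadataStore`, `StoredDoc`, `VersionConflictException`, and `simulateRace` are hypothetical names, not the plugin's actual classes.

```kotlin
// Hypothetical in-memory model of a sequence-number-guarded metadata document.
// It is NOT the plugin's code; it only illustrates how a stale write is rejected.
class VersionConflictException(message: String) : Exception(message)

data class StoredDoc(val state: String, val seqNo: Long)

class MetadataStore(initialState: String) {
    var doc = StoredDoc(initialState, seqNo = 0)
        private set

    // The write succeeds only if the caller read the latest version of the document.
    fun update(newState: String, expectedSeqNo: Long): StoredDoc {
        if (expectedSeqNo != doc.seqNo) {
            throw VersionConflictException(
                "expected seqNo $expectedSeqNo but current is ${doc.seqNo}"
            )
        }
        doc = StoredDoc(newState, doc.seqNo + 1)
        return doc
    }
}

// Returns true if the deletion workflow's metadata update is rejected.
fun simulateRace(): Boolean {
    val store = MetadataStore("IDLE")
    val seenByDeletion = store.doc.seqNo      // deletion workflow reads the metadata
    val seenByCreation = store.doc.seqNo      // creation workflow reads the same version
    store.update("CREATING", seenByCreation)  // creation writes first and bumps seqNo
    return try {
        store.update("DELETED", seenByDeletion) // deletion now writes with a stale seqNo
        false
    } catch (e: VersionConflictException) {
        true
    }
}

fun main() {
    println("deletion update rejected: ${simulateRace()}")
}
```

In this model the snapshot deletion itself has succeeded; only the bookkeeping write is rejected, which is why the failure is internal rather than something the user can fix.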
Currently, we send a notification to users on metadata update failures. This is a false alarm: it is an internal error, not a user-facing issue that the user can act on and fix.
index-management/src/main/kotlin/org/opensearch/indexmanagement/snapshotmanagement/engine/SMStateMachine.kt
Lines 124 to 127 in eb6afa8
How can one reproduce the bug?
Set up a manual snapshot policy with both creation and deletion operations.
Configure a notification channel.
Run the policy and observe the notifications.
What is the expected behavior?
The system should not send false positive notifications to users for internal metadata update failures.
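One way to express this expectation is to classify failures before deciding whether to notify. The following is a minimal sketch, not the change made in SMStateMachine.kt; `FailureKind`, `ExecutionFailure`, and `shouldNotifyUser` are hypothetical names introduced for illustration only.

```kotlin
// Hypothetical failure categories; the plugin's real types may differ.
enum class FailureKind {
    SNAPSHOT_OPERATION, // creating or deleting the snapshot itself failed
    METADATA_UPDATE     // the internal state write to the system index failed
}

data class ExecutionFailure(val kind: FailureKind, val message: String)

// Returns true only for failures the user can act on.
// Internal metadata update failures are expected to be logged, not sent to the channel.
fun shouldNotifyUser(failure: ExecutionFailure): Boolean = when (failure.kind) {
    FailureKind.SNAPSHOT_OPERATION -> true
    FailureKind.METADATA_UPDATE -> false
}

fun main() {
    val internal = ExecutionFailure(FailureKind.METADATA_UPDATE, "version conflict")
    val userFacing = ExecutionFailure(FailureKind.SNAPSHOT_OPERATION, "repository missing")
    println(shouldNotifyUser(internal))   // false
    println(shouldNotifyUser(userFacing)) // true
}
```

Using an exhaustive `when` over an enum means that adding a new failure category forces an explicit decision about whether it should reach the user.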
Do you have any screenshots?
Do you have any additional context?