Skip to content

Fix Thread-safety race condition in AbstractRedisReactiveCommands.getScheduler() #3272

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
ori0o0p opened this issue Apr 24, 2025 · 1 comment · May be fixed by #3271
Open

Fix Thread-safety race condition in AbstractRedisReactiveCommands.getScheduler() #3272

ori0o0p opened this issue Apr 24, 2025 · 1 comment · May be fixed by #3271
Labels
status: waiting-for-feedback We need additional information before we can continue

Comments

@ori0o0p
Copy link
Contributor

ori0o0p commented Apr 24, 2025

Feature Request

(Self-check: A quick search for "getScheduler thread safety" or similar terms in Lettuce issues didn't reveal an identical report, but thorough checking by maintainers is advised.)

Is your feature request related to a problem? Please describe

Yes, the current implementation of the private EventExecutorGroup getScheduler() method within io.lettuce.core.AbstractRedisReactiveCommands, while using a volatile field, is still not fully thread-safe regarding its lazy initialization logic.

The scheduler field is correctly marked as volatile, ensuring visibility across threads. However, the method employs a "check-then-act" pattern (if (scheduler == null) { /* initialize and assign */ }) which is not atomic.

In a multi-threaded environment:

  1. Multiple threads can read the volatile scheduler field and see null.
  2. They can all pass the if (scheduler == null) check before any thread performs the assignment.
  3. Consequently, the initialization logic (determining schedulerToUse, potentially calling connection.getResources().eventExecutorGroup()) can execute multiple times across different threads.
  4. Multiple threads might attempt the final assignment (this.scheduler = schedulerToUse), leading to redundant work and potentially overwriting each other's results.

While volatile guarantees that writes become visible, it does not provide the mutual exclusion needed to make the entire check-initialize-assign sequence atomic. This can lead to inefficient resource usage or subtle concurrency issues.

Describe the solution you'd like

To ensure the initialization logic runs exactly once and the assignment is performed atomically with the check, the Double-Checked Locking (DCL) pattern should be fully implemented using a synchronized block. The volatile keyword (already present) is a prerequisite for DCL's correctness.

    // volatile is already present and correct
    private volatile EventExecutorGroup scheduler;

    private EventExecutorGroup getScheduler() {
        // First check (no locking) - leverages volatile read
        EventExecutorGroup result = scheduler;
        if (result == null) {
            // Synchronize to make the check-and-set atomic
            synchronized (this) {
                // Second check (within synchronized block)
                result = scheduler;
                if (result == null) {
                    // Determine the scheduler to use (existing logic)
                    if (connection.getOptions().isPublishOnScheduler()) {
                        result = connection.getResources().eventExecutorGroup();
                    } else {
                        result = ImmediateEventExecutor.INSTANCE;
                    }
                    // Assign to the volatile field within the synchronized block
                    scheduler = result;
                }
            }
        }
        return result;
    }

This approach ensures:

  1. Atomicity: The synchronized block guarantees that the critical section (the second check and the initialization/assignment) is executed by only one thread at a time, ensuring the initialization logic runs at most once.
  2. Visibility: The combination of the volatile field and the memory semantics of the synchronized block ensures correct publication and visibility of the initialized scheduler across threads.

Considered Drawbacks: Minimal performance impact compared to the current partially thread-safe version, significantly outweighed by the correctness guarantee.

Describe alternatives you've considered

  1. Using java.util.concurrent.atomic.AtomicReference<EventExecutorGroup>:
    • Pros: Can be simpler than manual DCL. Leverages hardware atomic operations.
    • Cons: The compareAndSet loop pattern might still execute the scheduler determination logic multiple times concurrently before one succeeds. If connection.getResources().eventExecutorGroup() creates a new resource each time, it could lead to temporary resource creation that needs cleanup. DCL avoids this by ensuring the creation logic runs only once within the lock.
  2. Synchronizing the entire method:
    • private synchronized EventExecutorGroup getScheduler() { ... }
    • Pros: Simple fix.
    • Cons: Less performant under contention, as it acquires the lock even for reads after the scheduler has been initialized. DCL avoids locking for reads once initialized.
  3. Initializing in the constructor:
    • Pros: Simplest and inherently thread-safe if possible.
    • Cons: May not be feasible depending on dependency availability during construction. Defeats the purpose of lazy initialization.

Given the need to ensure the initialization logic runs only once, DCL appears to be the most suitable pattern here, correctly utilizing the existing volatile field with the necessary synchronization.

Teachability, Documentation, Adoption, Migration Strategy

This is primarily an internal implementation fix to improve robustness and thread safety.

  • Teachability: For maintainers, the DCL pattern is a well-known concurrency pattern. Adding/updating a code comment explaining its use and the necessity of the synchronized block (despite volatile) would be beneficial.
  • Documentation: No external user-facing documentation changes are needed.
  • Adoption/Migration: Users do not need to change their code. They will benefit from improved correctness in future releases.
@ori0o0p ori0o0p linked a pull request Apr 24, 2025 that will close this issue
4 tasks
@tishun
Copy link
Collaborator

tishun commented Apr 25, 2025

Hey @ori0o0p ,

I am afraid I must disagree with this proposal.

In the worst case scenario, that you are describing, we would have two (or more) different threads attempt to assign the same instance to the same field. Although redundant this is not invalid and is virtually undetectable performance-wize.

In the same time a synchronization is an expensive operation even if the synchronization is happening only once during initialization.

Overall - unless there is some actual possibility of reaching an invalid state - I think we should not revise this logic.

@tishun tishun added the status: waiting-for-feedback We need additional information before we can continue label Apr 25, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: waiting-for-feedback We need additional information before we can continue
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants