Skip to content

Conversation

mxsm
Copy link
Owner

@mxsm mxsm commented Sep 11, 2025

Which Issue(s) This PR Fixes(Closes)

Fixes #4020

Brief Description

How Did You Test This Change?

Summary by CodeRabbit

  • Bug Fixes

    • More reliable startup readiness handling that honors configured wait time, reducing premature “not ready” errors.
    • Standardized error messages to “name server not ready” and clearer “No topic route info…” wording.
  • Refactor

    • Simplified topic route data encoding for consistent, predictable responses.

@Copilot Copilot AI review requested due to automatic review settings September 11, 2025 06:27
Copy link
Contributor

coderabbitai bot commented Sep 11, 2025

Walkthrough

The name server readiness check now uses explicit elapsed time against a configured wait_seconds and an additional need_wait_for_service flag. Log and error messages were updated for clarity. Topic route encoding always uses standard JSON serialization with direct response body setting.

Changes

Cohort / File(s) Summary
Readiness gating & messaging
rocketmq-namesrv/src/processor/client_request_processor.rs
- Replaced Duration-based check with elapsed_millis vs. config wait_seconds
- Added need_wait_for_service to gate readiness check and early return with SystemError "name server not ready"
- Updated log/remark phrases ("name server")
- Standardized topic-not-found message
- Simplified topic route encoding to always use topic_route_data.encode() and set body

Sequence Diagram(s)

sequenceDiagram
    autonumber
    actor Client
    participant Processor as ClientRequestProcessor
    participant Cfg as Config
    participant Clock as Time Source
    participant NS as Name Server State

    Client->>Processor: Request topic route
    Processor->>Cfg: Read need_wait_for_service, wait_seconds
    Processor->>Clock: Compute elapsed_millis since start
    alt need_wait_for_service && elapsed_millis < wait_seconds*1000
        Processor-->>Client: SystemError ("name server not ready")
    else Ready
        Processor->>NS: Fetch topic route data
        alt Route not found
            Processor-->>Client: Error ("No topic route info in name server for the topic: ...")
        else Route found
            note right of Processor: Encode via standard JSON serialization
            Processor-->>Client: Response with encoded route data
        end
    end
Loading

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Pre-merge checks (3 passed, 2 warnings)

❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Out of Scope Changes Check ⚠️ Warning While the readiness gating and log-message changes are in scope, the PR also modifies topic route-data serialization by removing the version/standard_json_only branching and always using topic_route_data.encode(), which is not described in issue #4020 and may affect serialization behavior; the PR contains no explanation or tests for this change. Because this serialization change is not part of the linked issue's stated objective and lacks justification or validation in the PR, it appears to be an out-of-scope modification. This makes the changeset include at least one non-trivial change outside the linked issue. Please either remove or split the serialization/encoding refactor into a separate PR with a clear rationale and tests proving no behavioral regression, or add a detailed justification and accompanying tests to this PR before merging.
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title directly summarizes the primary change—improving name server readiness checks and clarifying log messages—and references the linked issue #4020, so it accurately reflects the main intent of the changeset; it is concise and specific enough for a reviewer to understand the primary purpose. The presence of an emoji and the leading issue token is stylistic noise but does not make the title misleading. Overall the title is related to and representative of the changes.
Linked Issues Check ✅ Passed The PR implements the linked issue's stated objective by introducing a wait_seconds-based readiness check and a need_wait_for_service gate and by updating log/error messages from "name remoting_server" to "name server", which directly addresses improving readiness checks and clarifying logs. The linked issue contains no additional coding requirements, and the raw_summary shows the code-level changes that align with that objective. There are no apparent mismatches between the PR changes and the issue's stated goal.

Poem

I thump my paws—tick, tock, steady pace,
Now the server waits with measured grace.
Clearer logs hop into view,
“Name server” speaks, precise and true.
Routes encoded, clean and bright—
A rabbit nods: the timing’s right. 🐇⏱️

✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch enh-4020

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@rocketmq-rust-bot
Copy link
Collaborator

🔊@mxsm 🚀Thanks for your contribution🎉!

💡CodeRabbit(AI) will review your code first🔥!

Note

🚨The code review suggestions from CodeRabbit are to be used as a reference only, and the PR submitter can decide whether to make changes based on their own judgment. Ultimately, the project management personnel will conduct the final code review💥.

@rocketmq-rust-robot rocketmq-rust-robot added the enhancement⚡️ New feature or request label Sep 11, 2025
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR improves the name server readiness check implementation and standardizes log messages for better clarity. The changes refactor the readiness check logic to be more readable and consistent while removing commented-out legacy code.

  • Refactored readiness check calculation with clearer variable names and safer arithmetic operations
  • Updated log messages to consistently use "name server" instead of "remoting_server"
  • Removed unused imports and commented-out legacy code

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

} else {
topic_route_data.encode()
};*/
//Rust only support standard JSON serialization
Copy link

Copilot AI Sep 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grammar error in comment: should be 'supports' instead of 'support'.

Suggested change
//Rust only support standard JSON serialization
//Rust only supports standard JSON serialization

Copilot uses AI. Check for mistakes.

Copy link

codecov bot commented Sep 11, 2025

Codecov Report

❌ Patch coverage is 0% with 13 lines in your changes missing coverage. Please review.
✅ Project coverage is 26.57%. Comparing base (aef2115) to head (c518a87).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
...-namesrv/src/processor/client_request_processor.rs 0.00% 13 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #4021   +/-   ##
=======================================
  Coverage   26.57%   26.57%           
=======================================
  Files         565      565           
  Lines       80722    80720    -2     
=======================================
  Hits        21453    21453           
+ Misses      59269    59267    -2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Collaborator

@rocketmq-rust-bot rocketmq-rust-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@rocketmq-rust-bot rocketmq-rust-bot merged commit 42367c5 into main Sep 11, 2025
21 of 23 checks passed
@rocketmq-rust-bot rocketmq-rust-bot added approved PR has approved and removed ready to review waiting-review waiting review this PR labels Sep 11, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
rocketmq-namesrv/src/processor/client_request_processor.rs (3)

96-107: Improve diagnosability of “not ready” path

Log the elapsed vs. configured wait and echo it in the remark; removes trailing space and eases ops triage.

-        if need_wait_for_service && !namesrv_ready {
-            warn!("name server not ready. request code {} ", request.code());
+        if need_wait_for_service && !namesrv_ready {
+            warn!(
+                "name server not ready: elapsed={}ms < wait={}s (code={})",
+                elapsed_millis, wait_seconds, request.code()
+            );
             return Ok(Some(
                 RemotingCommand::create_response_command_with_code(
                     RemotingSysResponseCode::SystemError,
                 )
-                .set_remark("name server not ready"),
+                .set_remark(format!(
+                    "name server not ready: elapsed={}ms < wait={}s",
+                    elapsed_millis, wait_seconds
+                )),
             ));
         }

116-120: Tighten TopicNotExist remark wording

Minor clarity/readability tweak; quote the topic and add spacing before FAQ text.

-                    .set_remark(format!(
-                        "No topic route info in name server for the topic:{}{}",
-                        request_header.topic,
-                        FAQUrl::suggest_todo(FAQUrl::APPLY_TOPIC_URL)
-                    )),
+                    .set_remark(format!(
+                        "No topic route info in name server for topic '{}'. {}",
+                        request_header.topic,
+                        FAQUrl::suggest_todo(FAQUrl::APPLY_TOPIC_URL)
+                    )),

142-142: Fix comment grammar

-                //Rust only support standard JSON serialization
+                // Rust only supports standard JSON serialization
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between aef2115 and c518a87.

📒 Files selected for processing (1)
  • rocketmq-namesrv/src/processor/client_request_processor.rs (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
rocketmq-namesrv/src/processor/client_request_processor.rs (2)
rocketmq-common/src/utils/time_utils.rs (1)
  • get_current_millis (19-24)
rocketmq-remoting/src/protocol/remoting_command.rs (2)
  • code (578-580)
  • create_response_command_with_code (192-194)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: build (ubuntu-latest, stable)
  • GitHub Check: build (ubuntu-latest, nightly)
  • GitHub Check: build (windows-latest, nightly)
  • GitHub Check: build (macos-latest, nightly)
  • GitHub Check: build (macos-latest, stable)
  • GitHub Check: build (windows-latest, stable)
  • GitHub Check: build
  • GitHub Check: test
  • GitHub Check: auto-approve

Comment on lines +87 to +95
let wait_seconds = self
.name_server_runtime_inner
.name_server_config()
.wait_seconds_for_service as u64;
let elapsed_millis =
TimeUtils::get_current_millis().saturating_sub(self.startup_time_millis);
let namesrv_ready = self.need_check_namesrv_ready.load(Ordering::Relaxed)
&& TimeUtils::get_current_millis() - self.startup_time_millis
>= Duration::from_secs(
self.name_server_runtime_inner
.name_server_config()
.wait_seconds_for_service as u64,
)
.as_millis() as u64;
if self
&& elapsed_millis >= wait_seconds * 1000;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix readiness logic: bypass when check is disabled; align atomic ordering; avoid overflow

When need_check_namesrv_ready becomes false, namesrv_ready evaluates to false and the server incorrectly returns “not ready” on subsequent requests. Also, use Acquire to pair with the Release store and guard against u64 overflow.

Apply:

-        let elapsed_millis =
-            TimeUtils::get_current_millis().saturating_sub(self.startup_time_millis);
-        let namesrv_ready = self.need_check_namesrv_ready.load(Ordering::Relaxed)
-            && elapsed_millis >= wait_seconds * 1000;
+        let elapsed_millis =
+            TimeUtils::get_current_millis().saturating_sub(self.startup_time_millis);
+        let should_check = self
+            .need_check_namesrv_ready
+            .load(Ordering::Acquire);
+        let namesrv_ready =
+            !should_check || elapsed_millis >= wait_seconds.saturating_mul(1000);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let wait_seconds = self
.name_server_runtime_inner
.name_server_config()
.wait_seconds_for_service as u64;
let elapsed_millis =
TimeUtils::get_current_millis().saturating_sub(self.startup_time_millis);
let namesrv_ready = self.need_check_namesrv_ready.load(Ordering::Relaxed)
&& TimeUtils::get_current_millis() - self.startup_time_millis
>= Duration::from_secs(
self.name_server_runtime_inner
.name_server_config()
.wait_seconds_for_service as u64,
)
.as_millis() as u64;
if self
&& elapsed_millis >= wait_seconds * 1000;
let wait_seconds = self
.name_server_runtime_inner
.name_server_config()
.wait_seconds_for_service as u64;
let elapsed_millis =
TimeUtils::get_current_millis().saturating_sub(self.startup_time_millis);
let should_check = self
.need_check_namesrv_ready
.load(Ordering::Acquire);
let namesrv_ready =
!should_check || elapsed_millis >= wait_seconds.saturating_mul(1000);
🤖 Prompt for AI Agents
In rocketmq-namesrv/src/processor/client_request_processor.rs around lines 87 to
95, the readiness check wrongly returns false after need_check_namesrv_ready is
set to false, uses Relaxed ordering, and risks u64 overflow when scaling seconds
to milliseconds; change the logic to treat the server as ready immediately when
the check is disabled (i.e., if need_check_namesrv_ready.load(Acquire) == false
then namesrv_ready = true), use Acquire for the atomic load to pair with any
Release store, and compute wait_millis safely (e.g., use checked_mul or
saturating_mul on wait_seconds as u64 to produce milliseconds) before comparing
elapsed_millis to avoid overflow.

@mxsm mxsm deleted the enh-4020 branch September 11, 2025 10:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

AI review first Ai review pr first approved PR has approved auto merge enhancement⚡️ New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Enhancement⚡️] Improve name server readiness checks and update log messages for clarity

3 participants