Optimize process of thread counter #619
Open
Analysis
During MySQL's sysbench point-select performance testing, the server's CPU utilization approaches 100%. Using the perf top tool, it can be observed that the dispatch_command function in mysqld is a significant hotspot, accounting for over 20% of the CPU usage. Further analysis of dispatch_command reveals two atomic variable increment/decrement operations, each contributing to more than 40% of its execution time. These operations exhibit severe contention and represent clear bottlenecks, as illustrated in Figures 1 to 3.

Figure 1 - hotspots

Figure 2 - The first bottleneck in dispatch_command

Figure 3 - The second bottleneck in dispatch_command
Optimization
The bottleneck code locations are as follows:
These two lines become bottlenecks primarily because high-frequency atomic operations on the same variable cause severe contention.
Upon analyzing the related code, we found that these operations belong to the "running thread count" statistics and query module, which serves two key purposes:
The main workflow of this module is illustrated in Figure 4.
Figure 4 - Current main process of thread counter

The current running thread count statistics and query mechanism increments and decrements an atomic counter during every SQL statement execution, while a query simply returns the atomic variable's value. This design significantly impacts business performance.
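To make the contended pattern concrete, here is a minimal sketch of the design described above. It is not the actual mysqld code: `g_running_threads`, `execute_statement`, and `query_running_threads` are hypothetical names chosen for illustration. It only shows a single shared atomic that every statement writes twice and that a query reads once.

```cpp
#include <atomic>

// Hypothetical stand-in for the server-wide running thread counter.
// Every worker thread writes to this one variable, so its cache line
// is shared by all cores that execute statements.
std::atomic<int> g_running_threads{0};

// Hypothetical stand-in for the per-statement path: one atomic increment
// before the work and one atomic decrement after it.
template <typename Work>
void execute_statement(Work&& work) {
    g_running_threads.fetch_add(1);  // first contended write
    work();                          // the actual statement execution
    g_running_threads.fetch_sub(1);  // second contended write
}

// The query side is a single load of the shared atomic.
int query_running_threads() {
    return g_running_threads.load();
}
```

With many threads running short point-select statements, the two writes dominate because each one needs exclusive access to the same cache line, while the read stays cheap.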
The optimization removes the atomic operations during SQL execution and instead calculates the running thread count only during queries. This trade-off boosts business performance at the cost of slightly slower queries, as illustrated in the optimized workflow (Figure 5).
Figure 5 - The optimized process of thread counter
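The idea of counting only at query time can be sketched as follows. This is an illustration under stated assumptions, not the code in this pull request: `Session`, `StatementGuard`, and `SessionRegistry` are hypothetical names, and the sketch assumes that each connection owns a flag written only by its own thread and that a slightly stale count is acceptable for statistics.

```cpp
#include <atomic>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical per-connection state. Only the owning thread writes the
// flag, so statement execution never contends on a shared variable.
struct Session {
    std::atomic<bool> running{false};
};

// RAII helper that marks a session as running for the duration of one statement.
class StatementGuard {
public:
    explicit StatementGuard(Session& s) : session_(s) {
        // Relaxed ordering: assumes an approximate count is good enough.
        session_.running.store(true, std::memory_order_relaxed);
    }
    ~StatementGuard() {
        session_.running.store(false, std::memory_order_relaxed);
    }
    StatementGuard(const StatementGuard&) = delete;
    StatementGuard& operator=(const StatementGuard&) = delete;

private:
    Session& session_;
};

// Hypothetical registry of live sessions. The lock is taken when a session
// is registered and when the count is queried, not per statement.
class SessionRegistry {
public:
    std::shared_ptr<Session> add_session() {
        auto s = std::make_shared<Session>();
        std::lock_guard<std::mutex> lock(mu_);
        sessions_.push_back(s);
        return s;
    }

    // The query walks all sessions and counts those currently running.
    // Cost grows with the number of sessions, which is the trade-off
    // accepted in exchange for a cheaper statement path.
    int count_running() const {
        std::lock_guard<std::mutex> lock(mu_);
        int n = 0;
        for (const auto& s : sessions_) {
            if (s->running.load(std::memory_order_relaxed)) ++n;
        }
        return n;
    }

private:
    mutable std::mutex mu_;
    std::vector<std::shared_ptr<Session>> sessions_;
};
```

In this sketch a worker would wrap each statement in a `StatementGuard`, and a monitoring query would call `count_running()`. Removing closed sessions from the registry is omitted for brevity.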
