Support sharding in RedisRateLimiter #3780
Conversation
int getShardedReplenishRate(Config config) {
	int replenishRate = config.getReplenishRate();
	if (config.getShards() > 0) {
		replenishRate = replenishRate / config.getShards();
If replenishRate is not evenly divisible by shards, the actual aggregate rate will be less than replenishRate; the remainder should be taken into account.
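A minimal sketch of one way to handle that remainder (hypothetical; the shardId parameter does not exist in the PR's method, which takes only Config): give the first replenishRate % shards shards one extra token so the per-shard rates sum exactly to replenishRate.

    // Hypothetical variant: shards with index < remainder replenish one extra
    // token per period, so the per-shard rates sum exactly to replenishRate.
    int getShardedReplenishRate(Config config, int shardId) {
        int replenishRate = config.getReplenishRate();
        int shards = config.getShards();
        if (shards > 0) {
            int base = replenishRate / shards;
            int remainder = replenishRate % shards;
            return shardId < remainder ? base + 1 : base;
        }
        return replenishRate;
    }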
Very useful feature. The purpose of rate limiting is to protect downstream services from overload. However, in scenarios such as e-commerce platform promotions, almost all requests concentrate on one hot resource. Since the same resource uses the same rate-limiting key, every request hits the same Redis shard, overloading it. To avoid affecting business functions when Redis is overloaded, we let the requests through, so the expected rate-limiting effect is never achieved. And because the same rate-limiting key always goes to the same Redis shard, the original implementation cannot improve system stability through distributed computing.

I have read the implementation carefully. By adding suffixes to the rate-limiting key to obtain different hashtags, requests can be dispersed to different Redis shards, avoiding single-instance overload. Besides avoiding single-machine overload, since the Lua scripts can execute on different Redis instances at the same time, this greatly improves the parallelism of rate limiting.
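For illustration, a sketch of the key construction that comment describes (the method name and exact key format here are assumptions, not the PR's code): only the text inside the {...} hashtag is hashed by Redis Cluster to pick a slot, so a per-shard suffix inside the braces spreads the same logical rate-limit key across nodes.

    // Sketch: the braces delimit the Redis Cluster hashtag; CRC16 of the text
    // inside them selects the hash slot, so different suffixes land the keys
    // on different cluster nodes.
    String shardedKeyPrefix(String id, int shardId) {
        return "request_rate_limiter.{" + id + "-" + shardId + "}";
    }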
/**
 * A Round-Robin like index to select a virtual shard.
 */
private final AtomicInteger shardIndex = new AtomicInteger(0);
Why use Round-Robin?
The fair distribution logic of Round-Robin ensures that all shards have an equal probability of being utilized.
Maybe we could use a random start position to avoid the herd effect
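A sketch of that suggestion (nextShard is a hypothetical helper; uses java.util.concurrent.ThreadLocalRandom): seeding the counter with a random offset means different gateway instances start from different shards instead of all hitting shard 0 first.

    // Random initial offset avoids the herd effect across gateway instances.
    private final AtomicInteger shardIndex =
            new AtomicInteger(ThreadLocalRandom.current().nextInt(1 << 20));

    int nextShard(int shards) {
        // floorMod stays non-negative even after the counter overflows
        return Math.floorMod(shardIndex.getAndIncrement(), shards);
    }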
Hello, @spencergibb @ryanjbaxter PTAL, thanks
@@ -318,6 +355,9 @@ public static class Config {

	@Min(1)
	private int requestedTokens = 1;

	@Min(0)
	private int shards = 0;
Is a shard something that is known based on how Redis is configured? (Sorry I am not that familiar with Redis)
This is different from what we usually refer to as sharding in storage systems; it is a concept of virtual sharding introduced at the application layer. As @jizhuozhi said, in the previous implementation all hot resources would correspond to the same hot key. Because the same key always falls into the same hash slot (https://redis.io/docs/latest/operate/oss_and_stack/reference/cluster-spec/#key-distribution-model), requests always go to the same instance. What we need to do is split the rate-limiting resource at the application layer: we can split it into different keys in advance so that they land on different instances.
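To make the new property concrete, a hypothetical usage sketch in the fluent style of the existing Config setters (setShards is the PR's addition; treat the exact name as an assumption):

    // shards = 0 (the default) keeps the old single-key behavior;
    // shards = 4 splits the counter across four virtual shards.
    RedisRateLimiter.Config config = new RedisRateLimiter.Config()
            .setReplenishRate(100)
            .setBurstCapacity(200)
            .setShards(4);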
This seems like a useful enhancement. It would need documentation as well as testing.
Signed-off-by: liufangzhou.aaa <liufangzhou.aaa@bytedance.com>
Looks like the tests you added are failing.
Please accept my apologies for the typo. Due to network policy restrictions in my local environment, I was unable to validate the configuration through Docker testing, which inadvertently led to this oversight.
Signed-off-by: liufangzhou.aaa <liufangzhou.aaa@bytedance.com>
@spencergibb pointed out to me that you could just use a Redis cluster and it will automatically do sharding and send the request to the right node based on the key. That seems like a much better solution than trying to do the same thing inside the Gateway... unless I am missing some other kind of benefit.
Our ultimate goal is to avoid the situation where the same key always requests the same Redis shard, making it impossible to scale horizontally. The automatic sharding capability of a Redis cluster does not reduce the load on Redis: because the same resource uses the same rate-limiting key, even with automatic sharding all requests still hit the same Redis shard, overloading it, as shown in Figure 1. This submission introduces the concept of virtual sharding at the application layer: hot resources are split into different keys in advance, which then request different Redis shards, improving the parallelism of rate limiting and the stability of the system, as shown in Figure 2.
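For a concrete sense of the arithmetic (numbers chosen purely for illustration): with replenishRate = 100 and shards = 4, each virtual shard would replenish at 25 tokens per second. Provided the round-robin selection spreads requests roughly evenly, the aggregate admitted rate stays close to 100 requests per second, while each Redis node executes only about a quarter of the Lua scripts.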
I don't understand how this will work. Every request for a specific key could land on a different shard and therefore could have a different value set for tokens remaining. The same key has to retrieve the data from the same shard every time.
Since I once had the same scenario as @Fangzhou1217, I can help answer this question. Conclusion: the effect we want to achieve is that the same rate-limiting key (calculated by the keyResolver) can request different Redis shards instead of a single one. I created an experiment to simulate the problem this PR is going to solve:
@Bean
fun keyResolver(): KeyResolver {
    return KeyResolver { exchange -> Mono.just("foobar") }
}
Both groups A and B have almost the same throughput and latency.
However, the Redis cluster performance of the two experiment groups differs significantly:
I don't see how the key from the keyResolver can produce correct results (nothing to do with CPU) when requesting the same key from different shards.
It's from this PR (https://github.com/spring-cloud/spring-cloud-gateway/pull/3780/files#diff-dae08064f5cb79da1559dadb17abaab12d27407f498b79b611329c0cfa311c48R156) by adding a shardId (taking effect as a shuffling suffix). Since the rate limiter is just a counter, we can divide it into different buckets to share the pressure. The key problem in this scenario is that the Redis instance used as a counter may become a bottleneck in rate limiting.
How can the counter return correct results for a given key across multiple shards? I say it cannot.
But in most scenarios, do the developers who face the challenge of Redis becoming a bottleneck in rate limiting really care about the remaining count? So, as an enhancement, how about adding a new response header
If the calculations are not correct, the feature will never get added.
So how about creating a new class named

It is still the same problem mentioned before. If Redis becomes a bottleneck and rate limiting becomes unavailable, the overload requests will not be rejected but will be passed directly to the business service. At that point, as a service developer, the primary concern is how the gateway can protect the backend service.
I'm sure there is a timeout setting on the Redis client. Currently, if Redis fails, it allows the request to go through. Wouldn't that be a better solution than another rate limiter implementation? So far, it's only the two of you acknowledging this problem, and I don't want to introduce incorrect behavior.
Yes, it allows the request to go through, but the Redis failure is caused by very large burst requests that far exceed the capacity of a single Redis shard (10x or 100x the backend service's capacity). We want to make the rate limiter scale horizontally, rather than overloading a single shard and sending requests straight to the more fragile backend services (which read and write the database).

I can't assume whether other people need it or not. Yes, this is indeed a scenario we encountered, and this PR solves the problem we hit. As far as I know, many business developers have run into this problem but simply bypass it with other methods (such as turning on random discard).

I'm not sure what incorrect behavior is introduced if we provide a new
The incorrect behavior I've mentioned in many of my comments: the number of tokens for a given key will be different across shards.
You can certainly make your own
Problem Statement:
The current implementation of RedisRateLimiter uses a single Redis key for rate limiting counters. This leads to a hot key problem in Redis clusters, where massive requests concentrate on a single Redis shard. The resulting excessive load on individual shards cannot be alleviated through horizontal cluster scaling, creating a performance bottleneck.
Proposed Solution:
This PR introduces a shards configuration parameter to address the hot key issue. Key enhancements include: