Cadence currently manages host-to-shard assignment with consistent hashing. This approach has significant drawbacks that surface at high scale. Issues include:
Shard Load Skew
Shard Data Payload Skew
Heterogeneous Host SKUs
Shared Hosts
Individual Host Problems
Shutdowns, Restarts, Host Turn-up
and the list goes on.
With consistent hashing, shards are pinned to hosts. When a host has a problem caused by any of the factors listed above, adding a new host does not guarantee that load moves off the problematic host. Similarly, removing a host might just shift the problem to the next host. Even though Cadence uses and recommends using many hosts in a cluster (~16K), these issues would still happen.
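To make the pinning concrete, here is a minimal sketch in Go of a hash ring, not Cadence's actual membership code. It assumes one ring position per host and FNV-1a hashing, and the names `buildRing`, `ownerOf`, and `donorsAfterAdding` are hypothetical. Production rings usually place several virtual nodes per host, which spreads the effect out, but the choice of which host gives up shards is still made by the hash and not by load.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// token is one position on the ring, owned by a host.
type token struct {
	hash uint32
	host string
}

// hashKey maps a string to a ring position using FNV-1a (an assumption for this sketch).
func hashKey(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s))
	return h.Sum32()
}

// buildRing places each host at a single position and sorts the ring.
func buildRing(hosts []string) []token {
	ring := make([]token, 0, len(hosts))
	for _, h := range hosts {
		ring = append(ring, token{hash: hashKey(h), host: h})
	}
	sort.Slice(ring, func(i, j int) bool { return ring[i].hash < ring[j].hash })
	return ring
}

// ownerOf returns the first host at or after the shard's position, wrapping around.
// The ring must contain at least one host.
func ownerOf(ring []token, shardID int) string {
	key := hashKey(fmt.Sprintf("shard-%d", shardID))
	i := sort.Search(len(ring), func(i int) bool { return ring[i].hash >= key })
	if i == len(ring) {
		i = 0
	}
	return ring[i].host
}

// donorsAfterAdding counts, per existing host, how many shards move away
// from it when newHost joins the ring.
func donorsAfterAdding(hosts []string, newHost string, numShards int) map[string]int {
	before := buildRing(hosts)
	after := buildRing(append(append([]string{}, hosts...), newHost))
	donors := make(map[string]int)
	for s := 0; s < numShards; s++ {
		if old, cur := ownerOf(before, s), ownerOf(after, s); old != cur {
			donors[old]++
		}
	}
	return donors
}

func main() {
	hosts := []string{"host-a", "host-b", "host-c", "host-d"}
	donors := donorsAfterAdding(hosts, "host-e", 1024)
	fmt.Println("shards taken per existing host:", donors)
}
```

In this simplified ring, the new host only takes shards from its immediate successor, so an overloaded host elsewhere on the ring gets no relief from the added capacity.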
A host that reports itself as healthy to the membership protocol can receive traffic even while it is shutting down or still starting up.
In high-frequency, high-availability environments, this logic poses an availability risk. Cadence will implement a smarter, load-aware Shard Manager that can move shards among hosts to distribute traffic uniformly and handle host maintenance events gracefully.
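As an illustration of what "load aware" could mean, the sketch below assigns shards greedily: heaviest shard first, always to the least-loaded host that is not draining. This is not the proposed Shard Manager design. The `Host` type, the `Draining` flag, the `assignByLoad` function, and the per-shard load numbers are all assumptions made for the example, and the sketch ignores the cost of moving shards that are already placed.

```go
package main

import (
	"fmt"
	"sort"
)

// Host is a hypothetical view of a cluster member. Draining marks a host
// that is shutting down or under maintenance and should receive no shards.
type Host struct {
	Name     string
	Draining bool
}

// assignByLoad places the heaviest shards first, each on the currently
// least-loaded eligible host. It returns shard owners and per-host load.
func assignByLoad(shardLoad map[int]float64, hosts []Host) (map[int]string, map[string]float64) {
	var eligible []string
	for _, h := range hosts {
		if !h.Draining {
			eligible = append(eligible, h.Name)
		}
	}
	sort.Strings(eligible) // deterministic tie-breaking by host name

	shards := make([]int, 0, len(shardLoad))
	for id := range shardLoad {
		shards = append(shards, id)
	}
	// Sort by load descending, then by shard ID for determinism.
	sort.Slice(shards, func(i, j int) bool {
		a, b := shards[i], shards[j]
		if shardLoad[a] != shardLoad[b] {
			return shardLoad[a] > shardLoad[b]
		}
		return a < b
	})

	owner := make(map[int]string, len(shards))
	hostLoad := make(map[string]float64, len(eligible))
	for _, name := range eligible {
		hostLoad[name] = 0
	}
	if len(eligible) == 0 {
		return owner, hostLoad // nothing can be placed
	}
	for _, id := range shards {
		best := eligible[0]
		for _, name := range eligible[1:] {
			if hostLoad[name] < hostLoad[best] {
				best = name
			}
		}
		owner[id] = best
		hostLoad[best] += shardLoad[id]
	}
	return owner, hostLoad
}

func main() {
	// Skewed shard load: shard 0 is much hotter than the rest.
	shardLoad := map[int]float64{0: 90, 1: 10, 2: 10, 3: 10, 4: 40, 5: 40}
	hosts := []Host{{Name: "host-a"}, {Name: "host-b"}, {Name: "host-c", Draining: true}}
	owner, hostLoad := assignByLoad(shardLoad, hosts)
	fmt.Println(owner, hostLoad)
}
```

With these example numbers, the hot shard shares `host-a` with one light shard, the remaining shards land on `host-b`, and the draining `host-c` receives nothing. A real manager would also need to limit how many shards move per rebalancing round.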