FederatedResourceQuota should be failover friendly #5179

@mszacillo

What would you like to be added:
A way for the FederatedResourceQuota to monitor existing ResourceQuotas (without managing them) and impose resource limits on the user based on the sum of the quota currently used across those ResourceQuotas.
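To make the request concrete, here is a minimal sketch of the kind of check being asked for. It is not Karmada code: the `Resources` type, `sumUsed`, `admit`, the cluster names, and the usage numbers are all hypothetical, and quantities are plain integers rather than Kubernetes `resource.Quantity` values.

```go
package main

import "fmt"

// Resources is a simplified, hypothetical stand-in for a Kubernetes
// ResourceList: resource name -> quantity in whole units.
type Resources map[string]int64

// sumUsed adds up the usage reported by each member cluster's ResourceQuota.
func sumUsed(perCluster map[string]Resources) Resources {
	total := Resources{}
	for _, used := range perCluster {
		for name, qty := range used {
			total[name] += qty
		}
	}
	return total
}

// admit reports whether a new request fits under the overall limit, given
// the usage already observed across all clusters. Resources that have no
// overall limit are not restricted.
func admit(overall, used, request Resources) bool {
	for name, qty := range request {
		limit, limited := overall[name]
		if limited && used[name]+qty > limit {
			return false
		}
	}
	return true
}

func main() {
	// Overall limits as in the issue's example: 40 CPU, 50 memory.
	overall := Resources{"cpu": 40, "memory": 50}

	// Assumed usage observed from the existing per-cluster ResourceQuotas.
	observed := map[string]Resources{
		"cluster-a": {"cpu": 20, "memory": 25},
		"cluster-b": {"cpu": 15, "memory": 20},
	}
	used := sumUsed(observed)

	fmt.Println(admit(overall, used, Resources{"cpu": 5})) // 35+5 <= 40
	fmt.Println(admit(overall, used, Resources{"cpu": 6})) // 35+6 > 40
}
```

The point of the sketch is that the limit is enforced against the federated sum only; the per-cluster ResourceQuotas are read, not rewritten.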

Why is this needed:
The existing FederatedResourceQuota mirrors the behavior of a typical Kubernetes ResourceQuota by imposing total resource limits in a multi-cluster setup. It does this by creating statically distributed ResourceQuotas across the specified member clusters, whose limits sum to the limits defined in the FederatedResourceQuota. This works as long as the user does not need to plan for DR events, which require spare resources reserved for failover.

In our case, since we are using Karmada for its failover feature, we would like each cluster to have additional quota available for each namespace so that, during a DR event, all applications can be rescheduled:

Screenshot 2024-07-11 at 2 49 14 PM

In the diagram above, the total limits of the FederatedResourceQuota are 40/40 CPU and 50 Gb / 50 Gb memory. Each individual cluster has the same limits, so that in a DR event all workloads can be scheduled on one cluster.

Screenshot 2024-07-11 at 2 54 42 PM

Above, we see that during a failover all workloads from Cluster A are migrated to Cluster B, which has enough available resources to schedule all required pods. The existing statically defined ResourceQuotas cannot support this type of failover, because each cluster's limit is only a share of the total and leaves no headroom for another cluster's workloads.
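The arithmetic behind this can be sketched as follows. The 40 CPU total comes from the example above; the even 20/20 usage across Cluster A and Cluster B and the even static split are assumptions made for illustration, and `fitsAfterFailover` is a hypothetical helper.

```go
package main

import "fmt"

// fitsAfterFailover reports whether the surviving cluster can absorb the
// failed cluster's workloads without exceeding its own per-cluster limit.
func fitsAfterFailover(survivorUsed, failedUsed, survivorLimit int64) bool {
	return survivorUsed+failedUsed <= survivorLimit
}

func main() {
	const overallCPU = 40       // FederatedResourceQuota total from the example
	const usedA, usedB = 20, 20 // assumed CPU usage on Cluster A and Cluster B

	// Static split: per-cluster limits must sum to the overall limit,
	// so with two clusters and an even split Cluster B is capped at 20.
	staticLimitB := int64(overallCPU / 2)
	fmt.Println("static split:", fitsAfterFailover(usedB, usedA, staticLimitB))

	// Failover friendly: each cluster may hold up to the full overall limit,
	// while the federated sum is still capped at 40.
	fmt.Println("full per-cluster limit:", fitsAfterFailover(usedB, usedA, overallCPU))
}
```

Under these assumptions the static split rejects the migrated workloads (20 + 20 exceeds 20), while a per-cluster limit equal to the overall limit accepts them.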

We've created this ticket to start a discussion on how best to address this limitation and whether this use case is valid.

Metadata

Labels: kind/feature (Categorizes issue or PR as related to a new feature.)

Status: Planning
