
MachineSetPreflightChecks + maxSurge zero results in MDs downscaled for a time longer than desirable #12187

@tmmorin

Description


What would you like to be added (User Story)?

As a user, when I use maxSurge zero, I would like the window during which my MachineDeployments are downscaled during a cluster Kubernetes upgrade to be as short as possible -- today they are downscaled as soon as the MD is updated, and they remain downscaled for the whole duration of the control plane nodes rolling update.

The desirable evolution would be a behavior where, when maxSurge zero is used, the old MachineSet of a MachineDeployment is not scaled down until the new MachineSet is actually able to scale up (based on MachineSetPreflightChecks).

In other words:

  • with the current 1.9.x code, an MD is below its target number of replicas for a duration of: time to update the CP nodes + time to update the MD nodes
  • ideally, an MD would be below its target number of replicas only for the time needed to update the MD nodes
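Concretely, with the baremetal numbers from the scenario below (3 CP nodes, each rebuilt in ~20 minutes), that is roughly 60 minutes of CP rollout on top of the MD rollout itself today, versus just the MD rollout time ideally.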

Detailed Description

Example Scenario

Let's consider the following example scenario:

  • MachineSetPreflightChecks are enabled (they have been enabled by default since 1.9.x)
  • one or more MachineDeployments are used
    • they use the RollingUpdate strategy with maxSurge set to zero (a minimal config sketch follows this list)
  • CAPI resources for a cluster are updated to trigger a Kubernetes version upgrade (CP and MDs)
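For concreteness, here is a minimal sketch of the MachineDeployment configuration this scenario assumes (names, versions, and referenced objects are illustrative, not from the original report):

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: md-0                  # illustrative name
  namespace: default
spec:
  clusterName: my-cluster     # illustrative cluster name
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 0             # no extra Machine is created during the rollout
      maxUnavailable: 1       # so the old MachineSet is scaled down first
  template:
    spec:
      clusterName: my-cluster
      version: v1.31.0        # bumping this (with the CP version) triggers the upgrade
      bootstrap:
        configRef:            # illustrative bootstrap config reference
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: KubeadmConfigTemplate
          name: md-0-bootstrap
      infrastructureRef:      # illustrative infra template reference (capm3 here)
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: Metal3MachineTemplate
        name: md-0-infra
```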

Current behavior

What happens today is the following:

  • a CP node rolling update is triggered and starts
  • at once, for all MDs, a new MachineSet is created and the previous one is scaled down (maxSurge 0)
  • at this point all MDs are downscaled, and this persists until the CP node rolling update is finished -- which on baremetal is not quick: ~1h is a typical order of magnitude (3 nodes, each rebuilt in ~20 minutes); the resulting state is sketched below
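To make the stalled state concrete, here is an illustrative sketch (names and counts assumed) of the two MachineSets of one such MD during the CP rolling update: the old MachineSet has already been scaled down per maxUnavailable, while the new one cannot create its Machine because the MachineSet preflight checks (e.g. ControlPlaneIsStable) keep failing until the CP rollout completes:

```yaml
# Old MachineSet: already scaled down (maxSurge: 0, maxUnavailable: 1)
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineSet
metadata:
  name: md-0-old              # hypothetical name
spec:
  replicas: 2                 # was 3 before the rollout started
---
# New MachineSet: wants 1 replica, but Machine creation is blocked
# by the preflight checks until the CP rolling update finishes
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineSet
metadata:
  name: md-0-new              # hypothetical name
spec:
  replicas: 1
status:
  replicas: 0                 # stays at 0 for the whole CP rollout (~1h here)
```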

Problem statement

With maxSurge zero, during a Kubernetes upgrade, all MDs are downscaled by one for longer than desirable (for the whole time needed to roll out all the CP nodes), while ideally the MDs could remain untouched during the CP nodes rolling update.

The desirable evolution would be a behavior where the old MachineSet is not scaled down until the new MachineSet is actually able to scale up.
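Under that desired behavior, the mid-rollout state during the CP update would instead look like this (same hypothetical names as above): the old MachineSet keeps its full replica count for as long as the preflight checks would prevent the new MachineSet from actually creating Machines:

```yaml
# Desired: old MachineSet untouched while preflight checks fail
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineSet
metadata:
  name: md-0-old
spec:
  replicas: 3                 # still at target during the CP rollout
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineSet
metadata:
  name: md-0-new
spec:
  replicas: 0                 # rollout only starts once the checks pass
```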

Relevance

The scenario above is commonplace for CAPI baremetal deployments (e.g. with capm3), where it is common not to have spare hardware. In low-footprint baremetal scenarios where clusters have few nodes, the difference in available processing resources can be significant.

Note

I opted to file this as a "feature request", but my feeling is that some might qualify the current behavior as a regression. Please feel free to requalify it as a "bug report" if you think that is warranted.

Anything else you would like to add?

No response

Label(s) to be applied

/kind feature
/area machinedeployment
/area upgrades
