-
Notifications
You must be signed in to change notification settings - Fork 1.4k
📖 Update autoscaling from zero enhancement proposal with support for platform-aware autoscale from zero #11962
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Welcome @aleskandro! |
Hi @aleskandro. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
4fbe927
to
666c952
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems very straightforward, no objection from me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this generally makes sense to me, and i don't have any objections.
i would like to get some attention on the API field change though, just to make sure folks agree about the name and placement.
Thanks @elmiko. To give more context about the choice of naming and structure for everyone to discuss alternatives. I couldn't consider the architecture as part of the capacity field because the values of keys in that stanza are expected to be a Quantity. I tried to trace back the reasoning the community did about the capacity field and I noticed that the Node objects' status field has the same field - capacity - with the same content we set in the Infrastructure Machine Template objects. The Node objects' status field also has a nodeInfo field with the same name and type that I'm now proposing for the Infrastructure Machine Template objects. Among the alternatives I was evaluating, this seemed the most reasonable to me. |
makes sense to me @aleskandro , no objection from me. |
666c952
to
106bcbd
Compare
/hold Especially for getting consensus on and resolve #11962 (comment) |
106bcbd
to
20f9c76
Compare
d6e69f8
to
85da18e
Compare
@JoelSpeed If you find some time, PTAL :) |
8349163
to
6d2fb96
Compare
/ok-to-test |
lgtm from my side. Pending resolution of #11962 (comment) (I'll be on PTO for the next few weeks, but fine for me if there is consensus with other maintainers) |
/label tide/merge-method-squash |
i am happy with the state this is in. /lgtm |
LGTM label has been added. Git tree hash: 1e88b9ce31402b9352a67b7a6637e488da7436ac
|
/unhold |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few nits from my side.
Considering the type of changes we are introducing, I think we should consider addressing #12033 as integral part of the implementation.
Last nit, could you kindly also add a new line in the Implementation History down below
) | ||
|
||
// NodeInfo contains information about the node's architecture and operating system. | ||
type NodeInfo struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest to use NodeSystemInfo, so there is a more direct link with the corresponding struct in corev1 types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @fabriziopandini, the previous version was using NodeSystemInfo. I changed it due to #11962 (comment). cc @sbueringer
I'm ok with either way as far as we have consensus :).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would favor NodeInfo because I think the field and struct name should be the same if there is no good reason to diverge. And I think the field name should be the same as in upstream
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm ok with both options, but the consistency with corev1.Node in this case IMO could help
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with both your points, but leaning a bit more into @fabriziopandini's option to be consistent with corev1.Node.
@@ -246,6 +297,8 @@ metadata: | |||
capacity.cluster-autoscaler.kubernetes.io/taints: "key1=value1:NoSchedule,key2=value2:NoExecute" | |||
``` | |||
|
|||
If the `capacity.cluster-autoscaler.kubernetes.io/labels` annotation specifies a label that would otherwise be generated from the `capacity` or `node-info` annotations, the autoscaler will use the label defined in `capacity.cluster-autoscaler.kubernetes.io/labels`, overriding any labels produced by processing the other annotations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH i struggle to follow, might be because this sentence enters into autoscaler implementation details.
What about simplifying by writing what takes precedence between capacity
or node-info
on the template vs annotations on MachineSet or MachineDeployment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to
The
capacity.cluster-autoscaler.kubernetes.io/labels
annotation takes precedence over other sources when conflicts occur.
For instance, if thekubernetes.io/arch
label is specified incapacity.cluster-autoscaler.kubernetes.io/labels
, its value supersedes that specified bynode-info.cluster-autoscaler.kubernetes.io/architecture
.
New changes are detected. LGTM label has been removed. |
6758a68
to
31e4d30
Compare
Hi @fabriziopandini , thanks for your review. I could take #12033 and resolve it in a separate PR, if this sounds good to you.
Done |
…atform-aware autoscale from zero This commit updates the contract between the cluster-autoscaler Cluster API provider and the infrastructure provider's controllers that reconcile the Infrastructure Machine Template to support platform-aware autoscale from 0 in clusters consisting of nodes heterogeneous in CPU architecture and OS. With this commit, the infrastructure providers implementing controllers to reconcile the status of their Infrastructure Machine Templates for supporting autoscale from 0 will be able to fill the status.nodeInfo stanza with additional information about the nodes. The status.nodeInfo stanza has type corev1.NodeSystemInfo to reflect the same content, the rendered nodes' objects would store in their status field. The cluster-autoscaler can use that information to build the node template labels `kubernetes.io/arch` and `kubernetes.io/os` if that information is present. Suppose the pending pods that trigger the cluster autoscaler have a node selector or a requiredDuringSchedulingIgnoredDuringExecution node affinity concerning the architecture or operating system of the node where they can execute. In that case, the autoscaler will be able to filter the nodes groups options according to the architecture or operating system requested by the pod. The users could already provide this information to the cluster autoscaler through the labels capacity annotation. However, there is no similar capability to support future labels/taints through information set by the reconcilers of the status of Infrastructure Machine Templates.
31e4d30
to
d0cb4af
Compare
Its definition would look like this: | ||
|
||
```go | ||
package platform |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to double check, are we sure we want to expose those types in a non-versioned package?
Also, given this is going to be part of the contract, shouldn't we use a package name that enforce this link (and that we can eventually expand with other contract related stuff)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We were considering not to have these types in the cluster api repo, but delegating implementations to define their based on the contract defined here
The `capacity.cluster-autoscaler.kubernetes.io/labels` annotation takes precedence over other sources when conflicts occur. | ||
For instance, if the `kubernetes.io/arch` label is specified in `capacity.cluster-autoscaler.kubernetes.io/labels`, its value supersedes that specified by `node-info.cluster-autoscaler.kubernetes.io/architecture`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm getting back to this because I'm still a little bit confused.
I was assuming that capacity.cluster-autoscaler.kubernetes.io and node-info.cluster-autoscaler.kubernetes.io serve a different pourpuse, so value cannot overlap (the example with both reporting arch is confusing).
Instead what can happen IMO is that each annotation overlaps with the corresponding struct in templates (capacity.cluster-autoscaler.kubernetes.io with status.capacity and node-info.cluster-autoscaler.kubernetes.io with status.nodeIndfo), and I was expecting this paragraph to define which of the two takes precedence
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue here is that node-info and status.node(System)?Info will render as labels for the scheduler simulator in the autoscaler side. So either we deny capacity labels to specify labels like kubernetes.io/arch or we have to define which one takes precedence over the other.
The autoscaler and scheduler do not use status.nodeSystemInfo directly, but obey labels for pods that specify predicates for them
What this PR does / why we need it:
This PR updates the contract between the cluster-autoscaler Cluster API provider and the infrastructure provider's controllers that reconcile the Infrastructure Machine Template to support platform-aware autoscale from 0 in clusters consisting of nodes heterogeneous in CPU architecture and OS.
With this commit, the infrastructure providers implementing controllers to reconcile the status of their Infrastructure Machine Templates for supporting autoscale from 0 will be able to fill the status.nodeInfo stanza with additional information about the nodes.
The status.nodeInfo stanza has type corev1.NodeSystemInfo to reflect the same content, the rendered nodes' objects would store in their status field.
The cluster-autoscaler can use that information to build the node template labels
kubernetes.io/arch
andkubernetes.io/os
if that information is present.Suppose the pending pods that trigger the cluster autoscaler have a node selector or a requiredDuringSchedulingIgnoredDuringExecution node affinity concerning the architecture or operating system of the node where they can execute. In that case, the autoscaler will be able to filter the nodes groups options according to the architecture or operating system requested by the pod.
The users could already provide this information to the cluster autoscaler through the labels capacity annotation. However, there is no similar capability to support future labels/taints through information set by the reconcilers of the status of Infrastructure Machine Templates.
Which issue(s) this PR fixes
Related to #11961
/area provider/core