Skip to content

Conversation

barney-s
Copy link
Contributor

@barney-s barney-s commented May 17, 2025

feat: Create applylib for use by instance reconciler

  • Applylib is inspired from:

    • kubectl pkg/cmd/apply/applyset.go
    • kubebuilder-declarative-pattern/applylib
    • Creating a simpler, self-contained version of the library that is purpose built for controllers.
    • Handle pruning and uses server-side apply
  • RGD controller changes

    • fix linter warnings for func defn lines being too long
    • Inject restMapper to be passed to instance controllers
  • Instance controller changes

    • When patching instance status, use retry-on-conflict loop
    • Since we apply all resolved resources, Join() errors across
      resources (apply/update errors)
    • Add changes to get restMapper and pass it on to applylib
  • Instance reconciler changes

    • Refactor reconcileInstance to use Applylib
  • Makefile

    • Add target to setup/install envtest
    • Setup envtest, kubebuilder assets for running e2e tests locally
  • With SSA, we need to be careful about setting field manager

    • different reconciler paths use different managers
    • Introduce source-rgd labels to differenciate b/w parent rgd that
      creates an instance and the source rgd that defines the instance CRD
      • This is useful when we have an RGD creating and instance of
        another RG. Today we have a bug where the parent and the child
        reconcilers overwrite the labels.
  • (next steps) lifecycle support in applylib

    • redo externalRef as lifecycle hints
    • decorate lifecycle hint (create or update, no delete)

Related:

Tests:

From #290

Passes the Secret test case.

❯ go test -v ./test/e2e/suites/advanced/...
=== RUN   TestSecretsInResources
    using_secrets_test.go:27: Using namespace: e2e-secrets-in-resources-1747893649, from: testdata/../../../../test/e2e/testdata/secrets-in-resources
    testcase.go:193: Running step0: Install RGD and verify that the RGD is active
    testcase.go:359:    Applying ResourceGraphDefinition/deployment-with-secret-rgd[rgd.yaml]
    verify.go:57:       Verifying ResourceGraphDefinition/deployment-with-secret-rgd[0rgd.yaml]
    verify.go:57:       Verifying CustomResourceDefinition/deploymentwithsecrets.kro.run[1instancecrd.yaml]
    testcase.go:193: Running step1: Create instance DeploymentWithSecret with replicas=2 and tokenID=sometokenid and tokenSecret=kq4gihvszzgn1p0r
    testcase.go:359:    Applying DeploymentWithSecret/test-instance[instance.yaml]
    verify.go:57:       Verifying Deployment/test-instance[deployment.yaml]
    verify.go:57:       Verifying DeploymentWithSecret/test-instance[instance.yaml]
    verify.go:57:       Verifying Secret/test-instance-secret[secret.yaml]
    testcase.go:193: Running step2: Update instance DeploymentWithSecret with tokenID=newtokenid and tokenSecret=newtokensecret
    testcase.go:359:    Applying DeploymentWithSecret/test-instance[instance.yaml]
    verify.go:57:       Verifying Deployment/test-instance[deployment.yaml]
    verify.go:57:       Verifying DeploymentWithSecret/test-instance[instance.yaml]
    verify.go:57:       Verifying Secret/test-instance-secret[secret.yaml]
    testcase.go:134: Deleting namespace: e2e-secrets-in-resources-1747893649
    testcase.go:148: Deleting cluster-scoped object: ResourceGraphDefinition/deployment-with-secret-rgd
--- PASS: TestSecretsInResources (25.20s)
PASS
ok      github.com/kro-run/kro/test/e2e/suites/advanced 25.267s

Passes other e2e tests:

❯ go test -v ./test/e2e/suites/basic/...
=== RUN   TestAutoscaledDeploymentRGD
    multi_resource_test.go:38: Using namespace: e2e-autoscaled-deployment-rgd-1747894067, from: testdata/../../../../test/e2e/testdata/autoscaled-deployment-rgd
    testcase.go:193: Running step0: Install the RGD AutoscaledDeployment that creates a ConfigMap, Service, Deployment, and HPA 
    testcase.go:359:    Applying ResourceGraphDefinition/autoscaled-deployment-rgd[rgd.yaml]
    verify.go:57:       Verifying ResourceGraphDefinition/autoscaled-deployment-rgd[0rgd.yaml]
    verify.go:57:       Verifying CustomResourceDefinition/autoscaleddeployments.kro.run[1instancecrd.yaml]
    testcase.go:193: Running step1: Create an instance of AutoscaledDeployment with initial settings (version 1.24, 1 replica, 20% CPU utilization) 
    testcase.go:359:    Applying AutoscaledDeployment/test-app-instance[instance.yaml]
    verify.go:57:       Verifying ConfigMap/test-app-config[configmap.yaml]
    verify.go:57:       Verifying Deployment/test-app-deployment[deployment.yaml]
    verify.go:57:       Verifying HorizontalPodAutoscaler/test-app-hpa[hpa.yaml]
    verify.go:57:       Verifying AutoscaledDeployment/test-app-instance[instance.yaml]
    verify.go:57:       Verifying Service/test-app-service[service.yaml]
    testcase.go:193: Running step2: Update the instance to use a newer version (1.25) which should trigger a rolling update of the Deployment 
    testcase.go:359:    Applying AutoscaledDeployment/test-app-instance[instance.yaml]
    verify.go:57:       Verifying Deployment/test-app-deployment[deployment.yaml]
    testcase.go:193: Running step3: Update the instance to increase replicas to 3 and adjust HPA settings (min: 2, max: 5) 
    testcase.go:359:    Applying AutoscaledDeployment/test-app-instance[instance.yaml]
    verify.go:57:       Verifying Deployment/test-app-deployment[deployment.yaml]
    verify.go:57:       Verifying HorizontalPodAutoscaler/test-app-hpa[hpa.yaml]
    testcase.go:134: Deleting namespace: e2e-autoscaled-deployment-rgd-1747894067
    testcase.go:148: Deleting cluster-scoped object: ResourceGraphDefinition/autoscaled-deployment-rgd
--- PASS: TestAutoscaledDeploymentRGD (20.25s)
=== RUN   TestSimpleDeploymentRGD
    simple_deployment_test.go:27: Using namespace: e2e-simple-deployment-rgd-1747894087, from: testdata/../../../../test/e2e/testdata/simple-deployment-rgd
    testcase.go:193: Running step0: Install RGD and verify that the RGD is created
    testcase.go:359:    Applying ResourceGraphDefinition/simple-deployment-rgd[rgd.yaml]
    verify.go:57:       Verifying ResourceGraphDefinition/simple-deployment-rgd[0rgd.yaml]
    verify.go:57:       Verifying CustomResourceDefinition/simpledeployments.kro.run[1instancecrd.yaml]
    testcase.go:193: Running step1: Create instance SimpleDeployment with replicas=2
    testcase.go:359:    Applying SimpleDeployment/test-instance[instance.yaml]
    verify.go:57:       Verifying Deployment/test-instance[deployment.yaml]
    verify.go:57:       Verifying SimpleDeployment/test-instance[instance.yaml]
    testcase.go:193: Running step2: Update instance SimpleDeployment with replicas=3
    testcase.go:359:    Applying SimpleDeployment/test-instance[instance.yaml]
    verify.go:57:       Verifying Deployment/test-instance[deployment.yaml]
    testcase.go:134: Deleting namespace: e2e-simple-deployment-rgd-1747894087
    testcase.go:148: Deleting cluster-scoped object: ResourceGraphDefinition/simple-deployment-rgd
--- PASS: TestSimpleDeploymentRGD (15.16s)
PASS
ok      github.com/kro-run/kro/test/e2e/suites/basic    35.467s

@barney-s barney-s force-pushed the applyset branch 4 times, most recently from 38d805d to 3b3280f Compare May 20, 2025 02:36
@barney-s barney-s changed the title [WIP Not ready for review/merge]Feat: Adding applylib for use by kro reconciler. [WIP Not ready for review/merge]Feat: Create and use Applylib in instance reconciler May 20, 2025
@barney-s barney-s force-pushed the applyset branch 16 times, most recently from dadc8ca to fe7c298 Compare May 21, 2025 07:05
@barney-s barney-s changed the title [WIP Not ready for review/merge]Feat: Create and use Applylib in instance reconciler feat: Create applylib for use by instance reconciler May 21, 2025
@barney-s barney-s requested a review from a-hilaly May 21, 2025 07:06
@barney-s barney-s assigned matthchr and a-hilaly and unassigned justinsb May 21, 2025
@barney-s barney-s requested a review from matthchr May 21, 2025 07:06
@barney-s barney-s added this to the 0.4 milestone May 21, 2025
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. approved Indicates a PR has been approved by an approver from all required OWNERS files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Sep 17, 2025
@barney-s barney-s force-pushed the applyset branch 3 times, most recently from 245c976 to 472444e Compare September 17, 2025 17:17
@barney-s
Copy link
Contributor Author

1/ the callback mechanism complexity: ...

Removed the callback mechanism completely.

2/ Over abstraction: ...
Don't have an answer to this. Its more of a do we do this or not kind of Q.

3/ Parallelization: ...

Made the default behavior sequential.

I think we're trying to force fit a general applyset pattern into kro and we should consider simplifying this to focus on the core needs: apply resources, track what was applied, and prune what's no longer needed - without the additional layers of abstraction.

We can improve on this. We can decide to do our own applyset implementation that has less LOC than a general applyset lib.
This PR is not precluding that optimization in future.

PTAL again @a-hilaly

@barney-s
Copy link
Contributor Author

@a-hilaly - Would you please take a look and approve ?

Copy link
Member

@a-hilaly a-hilaly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@barney-s Thank you for the persistence on this PR and addressing the feedback through multiple iterations. The work here is significant - implementing ApplySet for kro is a major architectural change and a positive one.

This is ready to merge. We can address any remaining items in follow-up PRs. Thank you again Barni! I've left some questions in the review below but these are minor compared to the overall value this brings.

gvr,
processedRGD,
r.clientSet,
r.clientSet.RESTMapper(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to team, we need to use more RESTMappers accross the code base.

@a-hilaly
Copy link
Member

a-hilaly commented Sep 25, 2025

@barney-s can you please squash the last 3 commits?

@a-hilaly a-hilaly added the tide/merge-method-merge Denotes a PR that should use a standard merge by tide when it merges. label Sep 25, 2025
@barney-s barney-s force-pushed the applyset branch 4 times, most recently from da865e2 to 605ff0b Compare September 25, 2025 22:11
* Applylib is inspired from:
  * kubectl pkg/cmd/apply/applyset.go
  * kubebuilder-declarative-pattern/applylib
  * Creating a simpler, self-contained version of the library that is purpose built for controllers.
  * Handle pruning and uses server-side apply
* RGD controller changes
  * fix linter warnings for func defn lines being too long
  * Inject restMapper to be passed to instance controllers
* Instance controller changes
  * When patching instance status, use retry-on-conflict loop
  * Since we apply all resolved resources, Join() errors across
    resources (apply/update errors)
  * Add changes to get restMapper and pass it on to applylib
* Instance reconciler changes
  * Refactor reconcileInstance to use Applylib
* Makefile
  * Add target to setup/install envtest
  * Setup envtest, kubebuilder assets for running e2e tests locally
* Remove pkg/controller/instance/delta since we dont compute delta
  locally. We defer it to the server.
* With SSA, we need to be careful about setting field manager
  * different reconciler paths use different managers
  * Introduce source-rgd labels to differenciate b/w parent rgd that
    creates an instance and the source rgd that defines the instance CRD
    * This is useful when we have an RGD creating and instance of
      another RG. Today we have a bug where the parent and the child
      reconcilers overwrite the labels.
* (next steps) lifecycle support in applylib
  * redo externalRef as lifecycle hints
  * decorate lifecycle hint (create or update, no delete)

-----------------------

Addressing review comments:

* Move applyset/applylib -> pkg/applylib
* Add more docs
  * Add a reference to the applyset KEP in comments
  * Add docs.go and interface docs
* Improvements:
  * remove caching a json marshalled copy of obj
  * Remove name/namespace in PruneObject and instead derive it from object. Also change object type from runtime.Object to unstructured.
  * use fieldmanager config param to remove kro hardcoding in applyset
  * Aggregate pruneErrors and applyErrors into single Errors()
  * add recordPruneObject helper
  * switched from waitGroup to errgroup when collecting objects to prune across namespaces
  * take logr as a config in ApplySet. Added more logging in applyset.
  * use superset of namespaces and GKs we compute before apply for pruning
  * Add simple tests in the applyset pkg
  * use namespace-scoped resource interface when pruning
  * use force=True when applying in setManaged()
  * Removed un-necessary deepcopy in updateParentLabelsAndAnnotations
  * Add Unit tests to verify pruning and apply logic
  * Fix bugs in pruning logic that missed old namespaces and GKs
  * Remove callbacks and make code more procedural
  * Make pruning sequential by default
* Refactors:
  * add RestMapper to the client-set
  * use Unstrucutred in ApplyableObject instead of interface and forcing it to be unstructured in code
  * minor restruction in processLoad() method
  * Refactor code that was using errors.Join and an array
  * Use ContainsFinalizer
  * Converting unstructured to PartialObjectMetadata for parent in applyset.
* Cleanups:
  * Fix typos in comments, prints etc
  * Remove obseleted comments
  * Make the exported resources strutter less, applyset.ApplySet -> applyset.Set, NewApplySet -> New etc.
  * Fix captilaization for var names. Uid -> UID, ...
  * use consts for labelManager strings
  * remove pure formatting changes in unrelated files to minimize PR changeset
Split out:
  * Remove pkg/metadata/finalizer.go changes and create a separate PR kubernetes-sigs#625
  * move envtest makefile changes to kubernetes-sigs#636
  * removing labeller changes and moving them to kubernetes-sigs#631
  * Fix regression when removing labeller changes:
    `dynamic-controller      Error syncing item, requeuing with rate limit   {"item": {"NamespacedKey":"chainsaw-special-quail/test-instance","GVR":{"Group":"kro.run","Version":"v1alpha1","Resource":"checkinstancecreationsimpledeployments"}}, "error": "failed to setup instance: failed to set finalizer: error getting finalizers: .metadata.finalizers accessor error: [] is of the type []string, expected []interface{}"}`
@barney-s
Copy link
Contributor Author

@barney-s Thank you for the persistence on this PR and addressing the feedback through multiple iterations. The work here is significant - implementing ApplySet for kro is a major architectural change and a positive one.

Appreciate all the feedback and reviews.

This is ready to merge. We can address any remaining items in follow-up PRs. Thank you again Barni! I've left some questions in the review below but these are minor compared to the overall value this brings.

Addressed most of the comments/Qs.

@barney-s
Copy link
Contributor Author

@barney-s can you please squash the last 3 commits?

Done.

Copy link
Member

@a-hilaly a-hilaly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀 🚀
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 25, 2025
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: a-hilaly, barney-s, jakobmoellerdev, matthchr

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit c2fce51 into kubernetes-sigs:main Sep 25, 2025
8 checks passed
@allamand
Copy link
Contributor

Nice work all 🚀

@barney-s barney-s mentioned this pull request Sep 26, 2025
7 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. tide/merge-method-merge Denotes a PR that should use a standard merge by tide when it merges.
Projects
None yet
10 participants