Skip to content

Context cancellation causes a goroutine leak (possibly intermittent) #3218

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
pmalek opened this issue May 15, 2025 · 0 comments
Open

Context cancellation causes a goroutine leak (possibly intermittent) #3218

pmalek opened this issue May 15, 2025 · 0 comments

Comments

@pmalek
Copy link

pmalek commented May 15, 2025

Problem statement

When running our integration tests suite with https://github.com/uber-go/goleak, it sometimes reports a goroutine leak in:

 manager_envtest_test.go:84: found unexpected goroutines:
          [Goroutine 12886 in state sync.WaitGroup.Wait, with sync.runtime_SemacquireWaitGroup on top of the stack:
          sync.runtime_SemacquireWaitGroup(0xc000cdc830?)
          	/home/runner/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.24.1.linux-amd64/src/runtime/sema.go:110 +0x25
          sync.(*WaitGroup).Wait(0xc000cdc830)
          	/home/runner/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.24.1.linux-amd64/src/sync/waitgroup.go:118 +0x89
          sigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).engageStopProcedure.func3.(*runnableGroup).StopAndWait.3.2()
          	/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.20.4/pkg/manager/runnable_group.go:307 +0x99
          created by sigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).engageStopProcedure.func3.(*runnableGroup).StopAndWait.3 in goroutine 12666
          	/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.20.4/pkg/manager/runnable_group.go:304 +0x213
           Goroutine 12791 in state chan send, with sigs.k8s.io/controller-runtime/pkg/manager.(*runnableGroup).reconcile.func1 on top of the stack:
          sigs.k8s.io/controller-runtime/pkg/manager.(*runnableGroup).reconcile.func1(0xc00115d940)
          	/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.20.4/pkg/manager/runnable_group.go:227 +0x1f8
          created by sigs.k8s.io/controller-runtime/pkg/manager.(*runnableGroup).reconcile in goroutine 12615
          	/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.20.4/pkg/manager/runnable_group.go:210 +0x245
           Goroutine 12838 in state chan send, with sigs.k8s.io/controller-runtime/pkg/manager.(*runnableGroup).reconcile.func1 on top of the stack:
          sigs.k8s.io/controller-runtime/pkg/manager.(*runnableGroup).reconcile.func1(0xc00115dd00)
          	/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.20.4/pkg/manager/runnable_group.go:227 +0x1f8
          created by sigs.k8s.io/controller-runtime/pkg/manager.(*runnableGroup).reconcile in goroutine 12615
          	/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.20.4/pkg/manager/runnable_group.go:210 +0x245
          ]

After quick glance at the code, it would seem that controllerManager passes its errChan to its runnableGroups.

When runnables fail, they propagate the error through that error channel in

.

If I'm looking at this correctly, when context is done, the errChan is not drained in

case <-ctx.Done():
// We are done
return nil
and thus runnable groups cannot send their errors in .

Proposed solution

Drain the cm.errChan in case <-ctx.Done() branch at

select {
case <-ctx.Done():
// We are done
return nil
case err := <-cm.errChan:
// Error starting or running a runnable
return err
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant