Skip to content

feat(operator): add ACL watcher sidecar image#478

Open
rkthtrifork wants to merge 5 commits intodragonflydb:mainfrom
rkthtrifork:rkth/acl-watcher-sidecar
Open

feat(operator): add ACL watcher sidecar image#478
rkthtrifork wants to merge 5 commits intodragonflydb:mainfrom
rkthtrifork:rkth/acl-watcher-sidecar

Conversation

@rkthtrifork
Copy link
Copy Markdown

@rkthtrifork rkthtrifork commented Mar 9, 2026

Summary

  • Add a tiny ACL watcher sidecar image (inotify + redis-cli) to reload ACLs when the mounted file changes.
  • Wire the sidecar into Dragonfly pods when spec.aclFromSecret is configured.
  • Add Makefile targets and CI workflow steps to build/publish the ACL watcher image.

Motivation

Mounted Secret updates are eventually consistent. The sidecar watches the ACL file and runs ACL LOAD after the file actually changes.

Notes

  • spec.authentication.passwordFromSecret is not automatically updated since its inject through an environment variable. It could make sense to document this and recommend use aclFromSecret if possible.

Testing

  • Not done. I would like to confirm interest in this approach before investing time in testing.

Issue

#479

@rkthtrifork
Copy link
Copy Markdown
Author

rkthtrifork commented Mar 9, 2026

Alternative implementation: small Go binary that does the same and runs in a distroless container for improved security. I have already implemented that locally and would like to use it if you agree it is sensible

EDIT: I have done this because i think its more production ready and it was a fairly small change

@Abhra303
Copy link
Copy Markdown
Contributor

Hi @rkthtrifork, having a separate acl watcher sidecar sounds like overdoing. Have you considered extending the reconciler to do so?

@rkthtrifork
Copy link
Copy Markdown
Author

Hi @rkthtrifork, having a separate acl watcher sidecar sounds like overdoing. Have you considered extending the reconciler to do so?

I considered it, but the mounted ACL file is updated asynchronously by kubernetes. If the controller watches the secret then it will trigger immediately which will usually be before the mounted ACL file has updated.

Alternatively, the controller should watch the secret and apply the changes through the dragonfly API, but then its not the mounted file thats the source of truth anymore which creates the problem of drift. Im not a fan of that.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds an ACL-watcher sidecar (packaged as its own image) and injects it into Dragonfly pods when spec.aclFromSecret is set, so ACLs can be reloaded on mounted Secret file updates.

Changes:

  • Introduces a new acl-watcher Go module + container image build (fsnotify + Redis ACL LOAD).
  • Wires the watcher sidecar into the generated StatefulSet when spec.aclFromSecret is configured.
  • Extends Makefile and CI workflow to build/publish the watcher image.

Reviewed changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
Makefile Adds build/push/buildx targets for the ACL watcher image.
internal/resources/resources.go Injects ACL watcher sidecar + env/mounts when aclFromSecret is used.
internal/resources/const.go Adds default image/name constants for the sidecar.
acl-watcher/main.go New watcher binary: watches Secret volume changes and runs ACL LOAD.
acl-watcher/go.mod New nested Go module for the watcher.
acl-watcher/go.sum Dependency lock for the watcher module.
acl-watcher/Dockerfile Builds a minimal distroless watcher image.
.github/workflows/ci.yml Builds ACL watcher in PR CI; publishes it on tag releases.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +67 to +68
AclWatcherImage = "ghcr.io/dragonflydb/acl-watcher:latest"
AclWatcherContainerName = "acl-watcher"
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default ACL watcher image is pinned to the mutable :latest tag. This makes rollouts non-reproducible and also doesn’t align with the CI workflow, which publishes versioned/tagged images. Consider defaulting to a versioned tag (e.g., .../acl-watcher:${Version}) and/or making the sidecar image configurable so clusters don’t implicitly track latest.

Suggested change
AclWatcherImage = "ghcr.io/dragonflydb/acl-watcher:latest"
AclWatcherContainerName = "acl-watcher"
AclWatcherImageRepository = "ghcr.io/dragonflydb/acl-watcher"
AclWatcherImageTag = "v1.0.0"
AclWatcherImage = AclWatcherImageRepository + ":" + AclWatcherImageTag
AclWatcherContainerName = "acl-watcher"

Copilot uses AI. Check for mistakes.
statefulset.Spec.Template.Spec.Containers = append(statefulset.Spec.Template.Spec.Containers, corev1.Container{
Name: AclWatcherContainerName,
Image: AclWatcherImage,
ImagePullPolicy: corev1.PullIfNotPresent,
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sidecar uses AclWatcherImage (currently :latest) with ImagePullPolicy: PullIfNotPresent. With a mutable tag this can easily leave nodes running an old cached watcher image, and makes behavior diverge across nodes. Prefer using a versioned tag (or a configurable image) and/or adjusting pull policy accordingly so upgrades are deterministic.

Suggested change
ImagePullPolicy: corev1.PullIfNotPresent,
ImagePullPolicy: corev1.PullAlways,

Copilot uses AI. Check for mistakes.
Comment thread acl-watcher/Dockerfile
Comment on lines +12 to +13
RUN CGO_ENABLED=0 GOOS=$TARGETOS GOARCH=$TARGETARCH \\
go build -trimpath -ldflags="-s -w" -o /out/acl-watcher ./main.go
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The build sets GOOS=$TARGETOS GOARCH=$TARGETARCH without fallbacks. When building with plain docker build (e.g., via make docker-build-acl-watcher), these build args may be empty depending on builder settings, which can break the build or make it less portable. Consider mirroring the root Dockerfile pattern (e.g., defaulting GOOS to linux and letting GOARCH default to the build platform when unset).

Suggested change
RUN CGO_ENABLED=0 GOOS=$TARGETOS GOARCH=$TARGETARCH \\
go build -trimpath -ldflags="-s -w" -o /out/acl-watcher ./main.go
RUN if [ -n "$TARGETARCH" ]; then \\
CGO_ENABLED=0 GOOS="${TARGETOS:-linux}" GOARCH="$TARGETARCH" \\
go build -trimpath -ldflags="-s -w" -o /out/acl-watcher ./main.go; \\
else \\
CGO_ENABLED=0 GOOS="${TARGETOS:-linux}" \\
go build -trimpath -ldflags="-s -w" -o /out/acl-watcher ./main.go; \\
fi

Copilot uses AI. Check for mistakes.
Comment thread .github/workflows/ci.yml
DOCKER_PASSWORD: ${{ secrets.GITHUB_TOKEN }}

- name: Build and Publish ACL watcher image into GHCR
uses: docker/build-push-action@v6
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The workflow uses docker/build-push-action@v7 for the operator image but @v6 for the ACL watcher image. Using different major versions for the same action in one workflow increases maintenance burden and can lead to subtle behavior differences; consider standardizing on a single version.

Suggested change
uses: docker/build-push-action@v6
uses: docker/build-push-action@v7

Copilot uses AI. Check for mistakes.
Comment thread acl-watcher/main.go
Comment on lines +100 to +110
func isACLRelated(path, aclBase string) bool {
name := filepath.Base(path)
if name == aclBase {
return true
}
// Secret updates use ..data symlink swaps.
if strings.HasPrefix(name, "..") {
return true
}
return false
}
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

acl-watcher is a new binary with non-trivial behavior (fsnotify event filtering + debouncing + Redis command execution), but it’s introduced without any automated tests. Adding at least small unit tests for isACLRelated and the debouncing/reload triggering logic would help prevent regressions (especially around Kubernetes Secret ..data update semantics).

Copilot uses AI. Check for mistakes.
@Abhra303
Copy link
Copy Markdown
Contributor

Abhra303 commented Apr 3, 2026

Alternatively, the controller should watch the secret and apply the changes through the dragonfly API, but then its not the mounted file thats the source of truth anymore which creates the problem of drift. Im not a fan of that.

Its not an issue to acl load when file is changed. It is a much simpler and better solution of a rare operation. The problem with the sidecar approach is that (1) we have to maintain another image along with the operator. (2) It will run continuously looking for changes even though acl changes are rare. (3) Adds a new sidecar for every dragonfly deployment. Its a overdoing.

With the controller, it is much easier to maintain and I don't see any downside of this approach.

@Abhra303
Copy link
Copy Markdown
Contributor

Abhra303 commented Apr 3, 2026

Can you update the PR to support operator driven reload instead?

@rkthtrifork
Copy link
Copy Markdown
Author

rkthtrifork commented Apr 13, 2026

Alternatively, the controller should watch the secret and apply the changes through the dragonfly API, but then its not the mounted file thats the source of truth anymore which creates the problem of drift. Im not a fan of that.

Its not an issue to acl load when file is changed. It is a much simpler and better solution of a rare operation. The problem with the sidecar approach is that (1) we have to maintain another image along with the operator. (2) It will run continuously looking for changes even though acl changes are rare. (3) Adds a new sidecar for every dragonfly deployment. Its a overdoing.

With the controller, it is much easier to maintain and I don't see any downside of this approach.

I don't believe watching the Secret in the operator is sufficient on its own, because the Secret update event and the mounted file update are not synchronized. The operator can observe the new Secret object before kubelet has refreshed the projected secret volume in the Dragonfly pods. If ACL LOAD is issued at that point, Dragonfly may reload the previous ACL file contents rather than the updated ones.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants