
networkd: stop all socket units to prevent socket activation restart #613

Open
miharp wants to merge 1 commit into voxpupuli:master from miharp:fix-networkd-socket-activation

Conversation

@miharp

@miharp miharp commented Apr 9, 2026

Pull Request (PR) description

On systemd 260+ (Archlinux rolling), additional socket units
(systemd-networkd-varlink.socket, systemd-networkd-varlink-metrics.socket,
systemd-networkd-resolve-hook.socket) cause systemd to immediately re-activate
systemd-networkd.service via socket activation when it is stopped, making
networkd_ensure => stopped non-idempotent.

Add a systemd_networkd_socket_units fact that discovers all
systemd-networkd*.socket unit files present on the node. The manifest
iterates over them, stopping all socket units before stopping the service.
This handles the current set of sockets as well as any added in future
systemd versions.

On platforms where systemd-networkd is not installed (RHEL/CentOS) the fact
returns an empty array and no socket resources are declared.
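
As a rough illustration of the discovery step described above, the following Ruby sketch filters `systemctl list-unit-files` output down to the `systemd-networkd*.socket` units. The helper name `networkd_socket_units` and the sample text are assumptions made for illustration; the PR's actual fact code is not shown in this conversation and may differ.

```ruby
# Hypothetical helper (not taken from the PR): extract systemd-networkd
# socket unit names from text printed by `systemctl list-unit-files`.
def networkd_socket_units(list_unit_files_output)
  list_unit_files_output
    .each_line
    .map { |line| line.split.first }   # first column is the unit file name
    .compact                           # blank lines yield nil
    .select { |unit| unit.match?(/\Asystemd-networkd.*\.socket\z/) }
    .uniq
    .sort
end

# Sample input modelled on the Ubuntu 24.04 listing quoted later in this PR.
sample = <<~OUTPUT
  UNIT FILE                 STATE     PRESET
  systemd-networkd.socket   disabled  enabled

  1 unit files listed.
OUTPUT

p networkd_socket_units(sample)
```

In a custom fact, the input would presumably come from something like `Facter::Core::Execution.execute` running `systemctl list-unit-files`. On a node without any matching unit files the helper returns an empty array, which matches the behaviour described for RHEL/CentOS.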

This Pull Request (PR) fixes the following issues

Fixes #612

@miharp miharp mentioned this pull request Apr 9, 2026
@trefzer
Contributor

trefzer commented Apr 11, 2026

Using a fact to discover the remaining open sockets will not work, since fact computation happens asynchronously from the Puppet run. A function is needed for this.

Comment thread manifests/networkd.pp
@miharp miharp force-pushed the fix-networkd-socket-activation branch 2 times, most recently from 91c8077 to 0822e44 Compare April 11, 2026 13:58
@miharp
Author

miharp commented Apr 12, 2026

Thanks for the review @trefzer!

To clarify the intent: the fact queries `systemctl list-unit-files` (unit file existence on disk) rather than `list-units` (runtime active state). Unit files are stable — they should not appear or disappear during a puppet run under normal circumstances.

I guess this could be a problem when puppet installs `systemd-networkd` during the same run on a platform where it's a separate package (e.g. Debian). In that case the socket unit files wouldn't exist at fact-collection time, so no socket resources would be declared. However, on those platforms (Debian/Ubuntu with systemd < 256) there is only `systemd-networkd.socket` and it doesn't trigger the restart behaviour we're fixing — I think the problem is specific to systemd 260+ on Archlinux, where networkd is always pre-installed as part of the base system.

That said, if you think a function is still the right approach here to be more robust, happy to discuss. Are you thinking of using a custom Puppet function that calls `systemctl` at catalog application time rather than collection time?

@miharp miharp force-pushed the fix-networkd-socket-activation branch from 0822e44 to 1483c45 Compare April 12, 2026 14:33
Comment thread lib/facter/systemd.rb
@trefzer
Contributor

trefzer commented Apr 12, 2026

Yes, I definitely think that a function is a better approach than a fact.
But "function" is probably the wrong term, since functions run on the puppetserver ;(. So the only option I see is implementing it via a type and provider (which I agree is not so nice).

I disagree that it's only an Archlinux problem: Archlinux is just the first to ship systemd 260, and others will follow! So implementing a generic solution should be the goal.

(Sorry, I have now rewritten my whole reply, since functions run on the server!)

@miharp
Author

miharp commented Apr 13, 2026

Local Testing

Platform systemd versions

| Distro | systemd version | Socket units |
| --- | --- | --- |
| Archlinux (rolling) | 260+ | systemd-networkd.socket, systemd-networkd-varlink.socket, systemd-networkd-varlink-metrics.socket, systemd-networkd-resolve-hook.socket |
| openSUSE Tumbleweed | ~257 | |
| Fedora 42 | 257 | |
| CentOS Stream 10 | 257 | |
| Ubuntu 24.04 LTS | 255 | systemd-networkd.socket |
| Ubuntu 24.10 | ~256 | |
| Debian testing | ~257 | |
| RHEL/CentOS Stream 9 | 252 | |

Archlinux is currently the only mainstream distro shipping systemd 260+.

Archlinux / systemd 260+ (the bug platform)

Confirmed in an Archlinux container running systemd 260:

$ facter systemd_networkd_socket_units
[
  "systemd-networkd-resolve-hook.socket",
  "systemd-networkd-varlink-metrics.socket",
  "systemd-networkd-varlink.socket",
  "systemd-networkd.socket"
]

Without the fix, stopping systemd-networkd and then running networkctl status immediately restarts the service via socket activation. With the fix (all socket units stopped first), networkctl status returns Connection refused and the service stays stopped.
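
The ordering described above (sockets first, then the service) can be sketched in Ruby as a small plan builder. This is only an illustration under stated assumptions: the PR itself expresses the ordering through Puppet resources in `manifests/networkd.pp`, and the `stop_plan` helper below is hypothetical.

```ruby
SERVICE = 'systemd-networkd.service'

# Hypothetical helper: build the systemctl invocations in the order that
# avoids re-activation, i.e. every socket unit before the service itself.
def stop_plan(socket_units, service: SERVICE)
  plan = []
  # With the sockets stopped first, a later client connection (for example
  # from `networkctl status`) can no longer trigger socket activation.
  plan << ['systemctl', 'stop', *socket_units] unless socket_units.empty?
  plan << ['systemctl', 'stop', service]
  plan
end

sockets = %w[
  systemd-networkd-resolve-hook.socket
  systemd-networkd-varlink-metrics.socket
  systemd-networkd-varlink.socket
  systemd-networkd.socket
]

# Print the commands instead of running them, so the sketch is safe to try.
stop_plan(sockets).each { |argv| puts argv.join(' ') }
```

Printing the plan rather than executing it keeps the example side-effect free; with an empty socket list only the service stop remains.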

Ubuntu 24.04 agent / systemd 255 (Vagrant)

Socket units on this platform:

systemd-networkd.socket   disabled   enabled

First puppet run (networkd_ensure => stopped, service was running):

Notice: Catalog compiled by puppet.example.com
Notice: /Stage[main]/Systemd::Networkd/Service[systemd-networkd.socket]/ensure: ensure changed 'running' to 'stopped'
Notice: /Stage[main]/Systemd::Networkd/Service[systemd-networkd]/ensure: ensure changed 'running' to 'stopped'
Notice: Applied catalog in 0.44 seconds

Second puppet run (idempotency check):

Notice: Catalog compiled by puppet.example.com
Notice: Applied catalog in 0.38 seconds

No changes on the second run. ✓

CentOS Stream 10 agent / systemd 252 (Vagrant)

No systemd-networkd*.socket unit files present — networkd is not a separate package on RHEL. The fact returns an empty array and no socket resources are declared. Catalog applies cleanly with no socket-related changes. ✓

@miharp miharp marked this pull request as ready for review April 13, 2026 12:44
@miharp miharp requested review from bastelfreak and trefzer April 13, 2026 12:44
@trefzer
Contributor

trefzer commented Apr 13, 2026

Debian testing (forky) is on systemd 260.1 (https://packages.debian.org/source/forky/systemd).

The problem is not setting the sockets to 'stopped'. But if ensure is set to 'running', then all available sockets will be enabled.
So there will be no way to enable and disable individual sockets in a fine-grained manner (which is probably why several sockets exist!)
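
One possible way to read this objection is that socket units should only be managed when the service is being stopped, and left alone otherwise. The Ruby sketch below illustrates that reading. The helper `managed_socket_states` is hypothetical, and it is an assumption that the updated PR behaves this way, since the revised manifest is not quoted in this conversation.

```ruby
# Hypothetical decision helper: which socket units to manage, and how.
# Sockets are forced to 'stopped' only when the service should be stopped.
# For any other value nothing is declared, so individual sockets can still
# be enabled or disabled independently by the user.
def managed_socket_states(networkd_ensure, socket_units)
  return {} unless networkd_ensure == 'stopped'

  socket_units.map { |unit| [unit, 'stopped'] }.to_h
end

units = %w[systemd-networkd.socket systemd-networkd-varlink.socket]

p managed_socket_states('stopped', units)
p managed_socket_states('running', units)
```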

@miharp
Author

miharp commented Apr 13, 2026

@trefzer Ah, I get it now, missed the fact that ensure => running implicitly makes it active :(

@miharp miharp force-pushed the fix-networkd-socket-activation branch from 1483c45 to 388d042 Compare April 13, 2026 17:55
Comment thread manifests/networkd.pp Outdated
On systemd 260+ (Archlinux rolling), additional socket units
(systemd-networkd-varlink.socket, systemd-networkd-varlink-metrics.socket,
systemd-networkd-resolve-hook.socket) cause systemd to immediately re-activate
systemd-networkd.service via socket activation when it is stopped. This makes
`networkd_ensure => stopped` non-idempotent on Archlinux.

Add a `systemd_networkd_socket_units` fact that discovers all
`systemd-networkd*.socket` unit files present on the node. The manifest
iterates over them, stopping all socket units before stopping the service.
This handles the current set of sockets as well as any added in future
systemd versions.

On platforms where systemd-networkd is not installed (RHEL/CentOS) the fact
returns an empty array and no socket resources are declared.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Michael Harp <mike@mikeharp.com>
@miharp miharp force-pushed the fix-networkd-socket-activation branch from 388d042 to fb7b2b5 Compare April 14, 2026 09:48
@bastelfreak
Member

I want to point out that functions always run on the server (except for deferred functions) and facts on the agent.

@miharp
Author

miharp commented Apr 14, 2026

> I want to point out that functions always run on the server (except for deferred functions) and facts on the agent.

Yeah, I thought he might have been thinking about deferred functions, which I always try to avoid :)

Contributor

@trefzer trefzer left a comment

looks good, thanks

@miharp
Author

miharp commented Apr 17, 2026

@bastelfreak @trefzer ready to merge

@traylenator
Contributor

Do we know why each of the sockets does not have a

PartOf=systemd-networkd.service

so that the sockets would be stopped automatically when the service stops?

The behaviour of leaving the sockets behind is either deliberate or not.

Example that uses PartOf.

root@moving:~# systemctl is-active cups.socket cups.service
active
active
root@moving:~# systemctl stop cups.service
root@moving:~# systemctl is-active cups.socket cups.service
inactive
inactive
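
To check whether a given socket unit declares this relationship, one could inspect the `PartOf=` property that `systemctl show` reports. The Ruby sketch below only parses such output; the helper name `part_of?` is hypothetical and the sample string is an assumed illustration of the cups case shown above.

```ruby
# Hypothetical helper: given text from `systemctl show -p PartOf <unit>`,
# report whether the unit is declared PartOf the given service.
def part_of?(show_output, service)
  line = show_output.each_line.find { |l| l.start_with?('PartOf=') }
  return false if line.nil?

  # Multiple values on the property line are separated by spaces.
  line.chomp.delete_prefix('PartOf=').split.include?(service)
end

# Assumed sample text for cups.socket, matching the behaviour shown above.
puts part_of?("PartOf=cups.service\n", 'cups.service')
# An empty property means no PartOf relationship is declared.
puts part_of?("PartOf=\n", 'systemd-networkd.service')
```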

@traylenator
Contributor

With a bit of digging I answered my own question.

Using PartOf is bad, since it also restarts the sockets whenever the network service is restarted.

There is StopWhenUnneeded=yes, but there is a good chance it would never kick in anyway.

@traylenator
Contributor

I do wonder whether there is ever an actual use case for stopping systemd-networkd via Puppet. Could we just document that this particular case is deliberately not tested?

@traylenator
Contributor

Or this is a case where the user should be aware.

If they really want to stop the whole network service, it's up to them to also stop the sockets.

Add those explicit stops to the acceptance tests.

I'm not convinced that trying to be smarter than what systemd is going to do is a good idea.

Development

Successfully merging this pull request may close these issues.

networkd_ensure => stopped is not idempotent on Archlinux (systemd 260+)

4 participants