fix(commands): apply batch — multi-node, node-file patch merge, error handling#128
fix(commands): apply batch — multi-node, node-file patch merge, error handling#128
Conversation
The field has been declared on engine.Options since the file was created (c567135, 2024-05) but was never read by Render or any helper. apply.go and template.go were assigning it from their respective --insecure flags; apply_test.go asserted the assignment. Nothing else looked at the value. The runtime guarantee — --insecure (maintenance mode) bypasses FailIfMultiNodes — is provided by WithClientMaintenance not injecting nodes into the gRPC context, not by anything inside engine.Render. The dead field added nothing on top of that. Drop the field, the assignments, and the assertion. golangci-lint is clean. Closes #123 Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>
helpers.FailIfMultiNodes(ctx, name) embeds the name in its error message. The call site in engine.Render hardcoded "talm template", which was accurate when Render was only called from the template subcommand. PR #119 made apply call Render too, so users running `talm apply` with a multi-node modeline saw an error talking about `talm template` — confusing. Add Options.CommandName, default to "talm" when empty, set "talm apply" in apply's buildApplyRenderOptions and "talm template" in template's option-build. TestRenderFailIfMultiNodes_UsesCommandName covers both subcommands plus the empty-string fallback and explicitly asserts the historical "talm template" no longer leaks into the apply case. Closes #121 Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>
…turns
The per-file loop in apply had two branches with asymmetric error
handling:
- template-render: `err = withApplyClient(...)` (assigning the outer
loop variable) followed by an outer `if err != nil { return err }`
- direct-patch: `if err := withApplyClient(...); err != nil { return err }`
(local scope, immediate return)
The outer check only caught errors from the template-render branch.
Any future tweak to the direct-patch branch could silently swallow an
error — exactly the shadowing class that already bit the project once
(commit b34781e).
Switch the template-render branch to the same local-scope idiom and
delete the outer check. Both branches now read identically and there
is no dead code path to misuse.
Closes #122
Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Aleksei Sviridkin <f@lex.la>
Before: when a node file's modeline contained `templates=[...]`, both
`talm apply -f node.yaml` and `talm template -f node.yaml` rendered
the listed templates and discarded the rest of the node file. Per-node
hostname, secondary interfaces with deviceSelector, VIP placement, etcd
extraArgs, certSANs and time servers all silently vanished — a
multi-NIC control plane could not bootstrap because the etcd
advertised-subnet interface was missing from the applied config.
Add engine.MergeFileAsPatch using Talos configpatcher (LoadPatches +
Apply). Apply the node file as a strategic merge patch over the
rendered template in both:
- apply.go template-rendering branch (after engine.Render, before
c.ApplyConfiguration)
- template.go templateWithFiles (after generateOutput, before
inplace-write or stdout)
Same merge step in both keeps the documented piped flow
`talm template -f X | talm apply -f -` carrying the node body through
end to end.
A modeline-only node file (no Talos config body) becomes a patch with
empty content; LoadPatches still returns a patch object, but Apply has
nothing to merge — every rendered field round-trips through the
configpatcher's loader unchanged.
TestMergeFileAsPatch covers both paths: the body-overlay case asserts
the custom hostname and the deviceSelector secondary interface land in
the merged output and the auto-generated hostname is gone; the
modeline-only case asserts every rendered field survives.
Closes #126
Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Aleksei Sviridkin <f@lex.la>
PR #119 moved talm apply's template rendering from offline to online. The new online path runs inside withApplyClient, whose wrapWithNodeContext helper batches every node from GlobalArgs.Nodes into a single gRPC context. engine.Render then calls helpers.FailIfMultiNodes which rejects multi-node contexts, so a node file with `nodes=[ip1, ip2]` (or `--nodes A,B,C` on the command line) fails before any rendering happens. The pre-#119 direct-patch flow handled multi-node fan-out at the wire level inside ApplyConfiguration, so this never surfaced. Render once per node instead. Add applyTemplatesPerNode that takes a nodes slice and injection points for the render and apply functions, then iterates: for each node it builds a single-node context with client.WithNodes, calls engine.Render (which now sees one node and satisfies the FailIfMultiNodes guard), merges the modeline'd file as a patch, and applies the result. Per-node iteration is also the correct semantic — discovery via lookup() resolves each node's own network topology rather than mashing everything together. Split withApplyClient into a public form (still wraps with the legacy multi-node context for the direct-patch branch) and withApplyClientBare which skips the wrap so the per-node loop can attach contexts itself. Three tests pin the contract: that each render and apply call sees exactly one node in its outgoing-context metadata; that engine.Render with a batched multi-node context still errors (the pre-condition the loop exists to satisfy); that an empty nodes slice errors loudly rather than silently doing nothing. Manually verified the loop assertion fails when the per-iteration context is built with all nodes instead of one. Closes #120 Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>
…ure mode
Branch review caught four blockers and three majors in the original
implementation. Fixes:
- engine.MergeFileAsPatch now short-circuits modeline-only patch files
(BLOCKER 2). The Talos config-patcher misclassifies a YAML-comment-only
file as an empty JSON6902 patch, then refuses any multi-document
rendered config (the v1.12+ output format) with 'JSON6902 patches are
not supported for multi-document machine configuration'. Detect 'no
meaningful YAML content' before LoadPatches and return rendered
unchanged in that case. Fix also covers empty files and files with only
comments / document separators / whitespace.
- talm template no longer routes its output through MergeFileAsPatch
(BLOCKER 1, BLOCKER 5, MEDIUM 8). The patcher round-trips through its
config loader, normalising YAML formatting, sorting keys, and (worst
of all) stripping every YAML comment — which deletes the talm modeline
('# talm: ...') and the AUTOGENERATED warning that downstream commands
rely on. Removing the template-side merge keeps stdout and --in-place
outputs intact. The merge stays in apply because that output goes
straight to ApplyConfiguration where formatting and comments do not
matter.
- The pipe-flow comment that motivated the template-side merge is gone
(MAJOR 6). talm apply does not in fact accept stdin via '-f -'
(ExpandFilePaths only handles real paths), so the documented workflow
was a fiction.
- Insecure (maintenance) mode now opens a fresh single-endpoint client
per node (BLOCKER 4). WithClientMaintenance creates one client with
every endpoint and gRPC then round-robins ApplyConfiguration across
them, so most nodes never received the config. Narrow GlobalArgs.Nodes
to the iteration's node and restore it via defer; this requires
splitting applyTemplatesPerNode's client-acquisition into an injected
openClientFunc with two production implementations
(openClientPerNodeMaintenance, openClientPerNodeAuth).
- TestApplyTemplatesPerNode_BatchedContextIsRejected was renamed and
rewritten as TestApplyTemplatesPerNode_NeverBatchesNodes, which
asserts the loop itself never produces a multi-node ctx (MAJOR 7).
The previous test happened to call engine.Render and pass; it would
not have caught a regression in applyTemplatesPerNode.
- Two new tests cover the insecure path
(TestApplyTemplatesPerNode_MaintenanceModeOpensFreshClientPerNode and
TestOpenClientPerNodeMaintenance_NarrowsAndRestoresGlobalNodes).
TestMergeFileAsPatch grew three sub-tests covering multi-doc patching,
empty files, and comments-only files. All five new tests fail without
the corresponding fix.
- Doc comments cleaned up (MINORS 9, 10, 11): the FailIfMultiNodes
rationale lives in one place (applyTemplatesPerNode); MergeFileAsPatch
wraps the out.Bytes() error with file context; withApplyClient and
withApplyClientBare now describe what they actually do.
Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Aleksei Sviridkin <f@lex.la>
…ferences - engine.MergeFileAsPatch now reads the patch file once (was twice — one ReadFile and one LoadPatches with @-syntax) by switching from configpatcher.LoadPatches to LoadPatch on the already-read bytes. No TOCTOU window between the empty-detection check and patch loading. - New TestMergeFileAsPatch sub-test covers the realistic Talos v1.12+ apply scenario the previous test set missed: multi-document rendered config (legacy machine/cluster doc plus typed HostnameConfig and LinkConfig docs) overlaid by a non-empty strategic-merge patch in the node file. The assertion checks that the legacy machine doc absorbs the patch and the sibling typed docs survive untouched. - New TestTemplateAndApplyDiverge_NodeBodyOverlayLimitation pins the intentional divergence between talm template and talm apply for files with a non-empty body. Apply overlays the body via the config-patcher before sending; template emits the rendered bytes verbatim so the modeline and the AUTOGENERATED-warning comment survive in stdout/in- place output. A future patch that tries to make template match apply will trip this test. - openClientPerNodeAuth now uses client.WithNode (single-target metadata) instead of client.WithNodes (aggregation metadata). The per-iteration intent is one node, not a collection of one — naming matches semantics. Test helper nodesFromOutgoingCtx reads either key so the assertions stay valid. - Removed all internal-tracker references and review-round annotations from test docstrings. Tests are now self-describing: each one names the contract it pins, not the issue ticket that motivated it. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>
- Sync openClientFunc docstring with the WithNode (single-target) metadata key the auth-mode opener actually uses; the old docstring still mentioned WithNodes from the previous version. - TestOpenClientPerNodeMaintenance_NarrowsAndRestoresGlobalNodes now drives the production openClientPerNodeMaintenance with an injected fake maintenanceClientFunc instead of an inline stub. The fake snapshots GlobalArgs.Nodes at the moment WithClientMaintenance reads it, proving the narrow-and-restore-via-defer logic is exercised end to end. A regression in the production function now fails this test (previously it would have continued to pass against the test-local stub). - New TestApplyTemplatesPerNode_AuthModeUsesSingleNodeMetadataKey pins the gRPC metadata key the auth-mode opener writes so a future swap back to WithNodes (which would round-trip through nodesFromOutgoingCtx unnoticed) gets caught. - README documents that node files can carry per-node patches in their body, that talm apply applies them as a strategic merge over the rendered template, and that talm template intentionally does not. Recommends apply --dry-run for previewing the exact bytes apply will send. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>
CommandName is read by engine.Render to format the FailIfMultiNodes error wording. The engine-side assertion (TestRenderFailIfMultiNodes_UsesCommandName) verifies Render does the right thing with whatever value it receives, but nothing was pinning that buildApplyRenderOptions sets it to "talm apply" in the first place. A future refactor that drops the field would slip through unnoticed. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 22 minutes and 1 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (5)
📝 WalkthroughWalkthroughThe changes implement per-node template rendering with merge patch semantics in Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant Apply as apply.go
participant Client as Talos Client
participant Engine as engine.go
participant Node as Talos Node
User->>Apply: talm apply -f node.yaml<br/>(with templates= in modeline)
loop Per-node (applyTemplatesPerNode)
Apply->>Client: Open per-node client<br/>(maintenance or auth mode)
Apply->>Engine: Render(ctx, opts)<br/>render template
Engine-->>Apply: rendered bytes
Apply->>Engine: MergeFileAsPatch<br/>(rendered, node.yaml)
Engine->>Engine: Load patch file<br/>Apply strategic merge
Engine-->>Apply: merged bytes
Apply->>Client: ApplyConfiguration(merged)
Client->>Node: Apply merged config
Node-->>Client: OK
Client-->>Apply: Success
end
Apply-->>User: Applied to all nodes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request refactors the talm apply template-rendering path to process nodes individually, enabling per-node configuration overlays via strategic merge patches. It introduces the MergeFileAsPatch utility and adds comprehensive tests for both authenticated and maintenance modes. Review feedback suggests ensuring node resolution falls back to the Talos context when CLI arguments are missing, supporting multi-document patches in node files, improving error messaging for missing nodes, and optimizing performance by moving file I/O for patches outside the per-node loop.
| if err := withApplyClientBare(func(parentCtx context.Context, c *client.Client) error { | ||
| openClient := openClientPerNodeAuth(parentCtx, c) |
There was a problem hiding this comment.
In the authenticated template rendering path, the list of nodes is captured from GlobalArgs.Nodes before the client is created. Unlike the direct patch path (which uses wrapWithNodeContext), this path does not fall back to the nodes defined in the talosconfig context if GlobalArgs.Nodes is empty. This means talm apply will fail if nodes are not explicitly provided via CLI or modeline, even if they are defined in the current Talos context.
if err := withApplyClientBare(func(parentCtx context.Context, c *client.Client) error {
if len(nodes) == 0 {
if configContext := c.GetConfigContext(); configContext != nil {
nodes = configContext.Nodes
}
}
openClient := openClientPerNodeAuth(parentCtx, c)There was a problem hiding this comment.
Added the same fallback in 5a16ad3. Extracted resolveAuthTemplateNodes so the precedence (CLI/modeline first, talosconfig context second) is unit-testable and symmetric with what wrapWithNodeContext already did for the direct-patch branch. Insecure (maintenance) mode still uses only GlobalArgs.Nodes — the maintenance client talks to node IPs directly and carries no talosconfig context to consult.
| patch, err := configpatcher.LoadPatch(patchBytes) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("loading patch from %s: %w", patchFile, err) | ||
| } | ||
| out, err := configpatcher.Apply(configpatcher.WithBytes(rendered), []configpatcher.Patch{patch}) |
There was a problem hiding this comment.
configpatcher.LoadPatch is used to load the patch from the node file. However, LoadPatch typically only handles a single YAML document. Since Talos v1.12+ and talm templates often produce multi-document configurations, users might need to provide multi-document patches in their node files to target different parts of the configuration. Using configpatcher.LoadPatches (plural) would be more consistent with other parts of the codebase (like FullConfigProcess) and provide better support for complex configurations.
| patch, err := configpatcher.LoadPatch(patchBytes) | |
| if err != nil { | |
| return nil, fmt.Errorf("loading patch from %s: %w", patchFile, err) | |
| } | |
| out, err := configpatcher.Apply(configpatcher.WithBytes(rendered), []configpatcher.Patch{patch}) | |
| patch, err := configpatcher.LoadPatches([]string{string(patchBytes)}) | |
| if err != nil { | |
| return nil, fmt.Errorf("loading patches from %s: %w", patchFile, err) | |
| } | |
| out, err := configpatcher.Apply(configpatcher.WithBytes(rendered), patch) |
There was a problem hiding this comment.
Checked the Talos source — configpatcher.LoadPatch already routes through configloader.NewFromBytes first, which handles multi-document Talos configs as a strategic-merge patch. A multi-doc node body therefore already works today; configpatcher.LoadPatches([]string{string(patchBytes)}) with a single-element slice would resolve to the same LoadPatch call and add no capability. LoadPatches only splits its input when the caller supplies multiple strings (typically @file references), not by splitting a single string on ---. Keeping LoadPatch here.
| apply applyFunc, | ||
| ) error { | ||
| if len(nodes) == 0 { | ||
| return fmt.Errorf("no nodes specified for template-rendering apply") |
There was a problem hiding this comment.
The error message when no nodes are specified is less descriptive than the one used in the template command. Providing a consistent and helpful error message that suggests how to fix the issue (e.g., using the --nodes flag or modeline) would improve the user experience.
| return fmt.Errorf("no nodes specified for template-rendering apply") | |
| return fmt.Errorf("nodes are not set for the command: please use '--nodes' flag or configuration file to set the nodes to run the command against") |
There was a problem hiding this comment.
Reworded in 599079d to name the three concrete ways to set nodes (--nodes, modeline, talosconfig context) and strengthened TestApplyTemplatesPerNode_NoNodesIsAnError to assert the guidance stays put through future rewords. With the talosconfig-context fallback from 5a16ad3 this error now only fires when all three sources are empty.
| // ApplyConfiguration) but unsuitable for human-facing output such as | ||
| // `talm template` — which is why the template subcommand does not call | ||
| // this helper. | ||
| func MergeFileAsPatch(rendered []byte, patchFile string) ([]byte, error) { |
There was a problem hiding this comment.
The MergeFileAsPatch function reads the patchFile from disk and performs an 'effectively empty' check on every call. Since this function is called within a loop over nodes in applyTemplatesPerNode (via renderMergeAndApply), it leads to redundant file I/O and processing. For large clusters, this could become a performance bottleneck. Consider refactoring this to load the patch data once outside the per-node loop and pass the pre-loaded data or patch object to the rendering logic.
There was a problem hiding this comment.
After the multi-node + non-empty body guard in 66a9489, any non-empty patch is guaranteed to target exactly one node — so MergeFileAsPatch reads and parses the file once per apply. For a single-node call the read/parse is ≤1 ms and comfortably dominated by template render and gRPC apply; a refactor to split file I/O from the merge would pay for itself only on large node-body files applied to hundreds of nodes, which the new guard forbids. Happy to revisit if a use case emerges that the guard does not cover.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/commands/apply.go`:
- Around line 251-265: applyTemplatesPerNode currently re-uses a single
configFile body for every node and calls renderMergeAndApply for each target,
which causes node-specific overlays (hostname, addresses, VIPs) to be applied to
all nodes; update applyTemplatesPerNode to detect when the provided configFile
contains a non-empty overlay and, if len(nodes) > 1, reject the operation with a
clear error, or alternatively resolve/split the overlay per target before
entering the for loop so each openClient call uses a node-specific config;
ensure the same change/error check is applied to the other similar call site
that invokes renderMergeAndApply (the second occurrence referenced in the
comment).
In `@README.md`:
- Around line 143-158: Adjust the paragraph about "Per-node patches inside node
files" to avoid claiming legacy node-body fields (e.g., deviceSelector
interfaces, VIPs, machine.network.interfaces) survive talm apply for Talos
v1.12+ multi-doc output: either restrict the statement to legacy/single-doc
template behavior only, or explicitly add that for v1.12+ multi-doc mode users
must patch the typed resources
(LinkConfig/BondConfig/VLANConfig/Layer2VIPConfig) because the multi-doc path
intentionally ignores legacy machine.network.interfaces due to lack of a safe
1:1 translation; mention talm apply -f node.yaml and talm template -f node.yaml
where appropriate to guide readers.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d8100954-d5ef-4b44-96b9-51a4d161b29a
📒 Files selected for processing (6)
README.mdpkg/commands/apply.gopkg/commands/apply_test.gopkg/commands/template.gopkg/engine/engine.gopkg/engine/render_test.go
| func TestTemplateAndApplyDiverge_NodeBodyOverlayLimitation(t *testing.T) { | ||
| const rendered = `version: v1alpha1 | ||
| machine: | ||
| type: controlplane | ||
| network: | ||
| hostname: talos-abcde | ||
| ` | ||
| dir := t.TempDir() | ||
| nodeFile := filepath.Join(dir, "node.yaml") | ||
| const nodeBody = `# talm: nodes=["10.0.0.1"] | ||
| machine: | ||
| network: | ||
| hostname: node0 | ||
| ` | ||
| if err := os.WriteFile(nodeFile, []byte(nodeBody), 0o644); err != nil { | ||
| t.Fatalf("write node file: %v", err) | ||
| } | ||
|
|
||
| // apply path: renders, then overlays. | ||
| merged, err := engine.MergeFileAsPatch([]byte(rendered), nodeFile) | ||
| if err != nil { | ||
| t.Fatalf("MergeFileAsPatch: %v", err) | ||
| } | ||
| if !strings.Contains(string(merged), "hostname: node0") { | ||
| t.Errorf("apply path must overlay hostname: node0; got:\n%s", string(merged)) | ||
| } | ||
|
|
There was a problem hiding this comment.
🟡 The 'template path' block in TestTemplateAndApplyDiverge_NodeBodyOverlayLimitation contains two assertions that are permanently vacuous: templateOutput := []byte(rendered) simply copies the local constant, so bytes.Equal(templateOutput, []byte(rendered)) is always true (the negated check never fires) and strings.Contains(string(templateOutput), "hostname: node0") is always false because the rendered constant contains hostname: talos-abcde. The test comment claims this block will catch a future 'make template match apply' regression, but no production template code is exercised, so such a regression would pass undetected. The apply-path half of the test (calling engine.MergeFileAsPatch) is legitimate; only the template-path assertions are hollow.
Extended reasoning...
What the bug is and how it manifests
In TestTemplateAndApplyDiverge_NodeBodyOverlayLimitation (pkg/commands/apply_test.go), the 'template path' block is supposed to pin the invariant that talm template does NOT overlay the node-file body onto the rendered output. Instead, the block performs no actual test: it creates templateOutput as a direct byte-copy of the rendered constant and then asserts two things that are structurally impossible to fail.
The specific code path that triggers it
Lines 543-569 of apply_test.go (template-path block):
Assertion 1: bytes.Equal(templateOutput, []byte(rendered)) compares rendered to itself. The result is always true, so the negated guard is always false and t.Error is never reached.
Assertion 2: strings.Contains(string(templateOutput), "hostname: node0") searches for node0 inside rendered, which contains hostname: talos-abcde — never node0. This condition is always false and t.Error is never reached.
Why existing code does not prevent it
Neither the compiler nor go test has any way to detect that a predicate is structurally always-false. The test compiles and passes on every run, regardless of what generateOutput(), templateWithFiles(), or any related production code does.
What the impact would be
If a future contributor makes talm template call engine.MergeFileAsPatch (the exact regression the test comment warns about), both assertions would still pass. The first passes because templateOutput is still the copy of the constant rendered, not the output of generateOutput; the second passes because rendered still does not contain node0. The test gives false confidence in the template/apply divergence invariant.
How to fix it
The template-path block should actually call the production template code — specifically generateOutput() (or an equivalent helper) — capture its output as templateOutput, and then assert the two properties. That way a change to generateOutput that starts applying the overlay will cause hostname: talos-abcde to be replaced by hostname: node0 in the output, making the second assertion fire as intended.
Step-by-step proof
renderedis the constant string containinghostname: talos-abcde.templateOutput := []byte(rendered)creates a byte slice with exactly the same content asrendered.bytes.Equal(templateOutput, []byte(rendered))— both sides are identical byte sequences → always returnstrue.- The guard
\!bytes.Equal(...)is alwaysfalse→ the firstt.Erroris unreachable on every run. strings.Contains(string(templateOutput), "hostname: node0")searches fornode0in a string that only containstalos-abcde→ always returnsfalse.- The second
t.Erroris also unreachable on every run. - Both assertions pass permanently regardless of what the production template code does, providing only false confidence in the divergence invariant.
There was a problem hiding this comment.
Fixed in 294b4ce. You're right — templateOutput := []byte(rendered) made both assertions structurally vacuous. Deleted the vacuous block and left a comment explaining that the no-overlay invariant for talm template is structural rather than runtime (pkg/commands/template.go never calls engine.MergeFileAsPatch, and a regression that wired the merge into generateOutput would surface in the modeline round-trip tests — the assert-against-the-constant pattern cannot pin it). The apply-path half of the test keeps its meaningful MergeFileAsPatch assertion.
Address review feedback from gemini-code-assist on pkg/commands/apply.go: the authenticated template-rendering path captured nodes from GlobalArgs.Nodes and never consulted the client config context, so a user who had already run 'talosctl config node <ip>' still had to repeat --nodes on every talm apply. The direct patch branch already consults the context via wrapWithNodeContext — this aligns the template branch with the same behavior. Extract resolveAuthTemplateNodes so the precedence (CLI/modeline beats talosconfig context) is testable in isolation. Insecure mode deliberately opts out: maintenance clients connect to node IPs directly and carry no talosconfig context, so GlobalArgs.Nodes remains the only source there. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>
Address review feedback from gemini-code-assist on pkg/commands/apply.go:260. The terse 'no nodes specified for template-rendering apply' did not tell the user how to fix the problem. Use the same phrasing style as the template command so the message names the concrete ways to set nodes (--nodes flag, node file modeline, talosconfig context). Strengthen TestApplyTemplatesPerNode_NoNodesIsAnError to assert the guidance survives a cosmetic reword but catches a regression that drops it. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>
Address review feedback from coderabbitai on pkg/commands/apply.go: applyTemplatesPerNode renders and applies per target, but renderMergeAndApply merges the same configFile body on every iteration. A node file whose modeline targets multiple nodes with a non-empty body (pinned hostname, address, VIP, etc.) therefore stamped the same value onto every machine — the opposite of what the overlay feature is for. Reject the combination up-front in applyTemplatesPerNode: when more than one target is scheduled and engine.NodeFileHasOverlay reports the file carries a real body below its modeline, surface an error that names the file, the targets, and the remediation (split into one file per node or drop the per-node fields). Modeline-only files remain allowed — they are the common bootstrap shape where the same rendered template lands on N nodes. Expose engine.NodeFileHasOverlay so the command layer can peek at the file without reading it twice or duplicating isEffectivelyEmptyYAML. Add regression tests for both the rejected and allowed shapes, plus a standalone table test for NodeFileHasOverlay. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>
Address review feedback from coderabbitai on README.md: the per-node-patches note promised that legacy node-body fields such as deviceSelector interfaces and VIPs survive talm apply, but on the Talos v1.12+ multi-doc path machine.network.interfaces fragments have no safe 1:1 mapping to LinkConfig/BondConfig/VLANConfig/ Layer2VIPConfig, so they do not translate. As written, the text misled v1.12+ users into relying on overrides that are not represented semantically the way the docs implied. Add an explicit v1.12+ caveat that tells users to patch the typed resources for per-node network settings, and call out the new one-body-one-node guard (talm apply now refuses a multi-node modeline with a non-empty body). Fields outside the network area still merge as before. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>
Address review feedback on apply_test.go: the 'template path' block of TestTemplateAndApplyDiverge_NodeBodyOverlayLimitation compared the rendered constant against itself and searched for a string it could not contain, so both assertions passed regardless of what production code does — a future regression that wired engine.MergeFileAsPatch into generateOutput would still sail through. Delete the vacuous block and leave a comment explaining that the no-overlay invariant for talm template is structural (template.go never calls MergeFileAsPatch) and enforced by the modeline round-trip tests rather than a runtime assertion here. The apply-path half (MergeFileAsPatch actually merges) keeps its meaningful assertion. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>
Summary
Five issues clustered in
pkg/commands/apply.goandpkg/engine/engine.go. All were either introduced or surfaced by PR #119 (online template rendering in apply). Bundled because they share the same callback structure and overlap heavily.Per-node template rendering in
talm apply. PR fix(commands): render templates online in apply to resolve lookups #119 moved rendering inside the Talos client callback solookup()resolves real discovery data, but the callback batched all--nodes A,B,C(or modelinenodes=[A,B,C]) into a single gRPC context.engine.Renderthen hithelpers.FailIfMultiNodesand bailed out before any rendering happened. Render once per node instead, each iteration carrying a single-nodeclient.WithNodecontext so the guard passes and discovery resolves per node. In insecure (maintenance) mode each iteration opens a fresh single-endpoint maintenance client becauseWithClientMaintenancewould otherwise dial every endpoint at once and gRPC would round-robinApplyConfigurationacross them — most nodes would never receive the config.Node file is now applied as a patch over the rendered template. When a modeline contained
templates=[...],talm apply -f node.yamlrendered the listed templates and discarded the rest of the node file. Per-node hostname, secondary interfaces withdeviceSelector, VIP placement, etcd extra args — all silently lost. Newengine.MergeFileAsPatchoverlays the body using the Talos config-patcher; modeline-only and effectively-empty files short-circuit before reaching the patcher (otherwise the patcher routes empty input through JSON6902 and rejects any multi-document rendered config — the v1.12+ default output format).talm templateintentionally does not call this overlay because the patcher round-trips through its config loader, dropping the modeline and YAML comments downstream commands rely on; the README documents the divergence.Multi-node error message references the actual subcommand.
helpers.FailIfMultiNodeswas hardcoded to"talm template". After fix(commands): render templates online in apply to resolve lookups #119, users runningtalm applysaw an error talking abouttalm template. ThreadedCommandNamethroughengine.Options; defaults to"talm"when empty.Symmetric error handling in the apply per-file loop. The template-render branch assigned
err = withApplyClient(...)and relied on an outerif err != nil; the direct-patch branch already used local-scopeif err := ...; err != nil { return err }. Two idioms, one outer check that only caught one branch — the same shadowing class that already bit the project once. Both branches now use the local-scope return; the outer check is gone.Removed dead
engine.Options.Insecure. The field has been declared since the file was first added (May 2024) and was never read byRenderor any helper. The runtime guarantee —--insecure(maintenance mode) bypassesFailIfMultiNodes— comes fromWithClientMaintenancenot injecting nodes into the gRPC context, not from anything in the engine.Verification
TestRenderFailIfMultiNodes_UsesCommandName— three sub-tests asserting theCommandNamevalue surfaces in the error and the empty-string fallback.TestMergeFileAsPatch— overlay (single-doc + non-empty patch), modeline-only byte-identity, multi-doc + modeline-only (the regression vector), multi-doc + non-empty strategic-merge patch (the realistic v1.12+ scenario), empty file, comments-only file.TestApplyTemplatesPerNode_*— per-node single-node ctx; never-batches; empty-nodes errors loudly; maintenance mode opens a fresh client per iteration; auth mode writes the single-targetnodemetadata key.TestOpenClientPerNodeMaintenance_NarrowsAndRestoresGlobalNodes— drives the production opener with an injectedmaintenanceClientFuncfake that snapshotsGlobalArgs.Nodesat the momentWithClientMaintenancereads it.TestTemplateAndApplyDiverge_NodeBodyOverlayLimitation— pins the intentional template/apply divergence so a future "make template match apply" change is caught.TestBuildApplyRenderOptions— pins that the apply caller actually setsCommandName(independent of the engine-side coverage).golangci-lint run ./...is clean.Closes
Summary by CodeRabbit
Release Notes
New Features
talm apply, preserving per-node configurations even when templates generate conflicting values.talm apply --dry-runto preview exact bytes sent to nodes.Documentation
talm applyandtalm templatecommands.