Skip to content

Conversation

@sdarwin
Copy link
Contributor

@sdarwin sdarwin commented Aug 20, 2025

This is to facilitate discussion of issue #4712

Usually the username on windows comes from runner_run_as: Administrator. Recently that has broken and it's always System.

In start-runner.ps1 move everything back inside a "Register-ScheduledTask" so the username can be set. This fixes the issue.

However, start-runner.ps1 also included a relatively new section:

Write-Host "Terminating instance"
aws ec2 terminate-instances --instance-ids "$InstanceId" --region "$Region"

By removing it, instances fail to terminate. How did this work before? ( in 2023) Why is it necessary to terminate instance from start-runner.ps1 instead of how it used to work before?

By removing terminate-instances, the instances continue to run, which is obviously not the desired result.

Where should terminate-instances be placed, or can the earlier solution handle that task.

@sdarwin sdarwin requested a review from a team as a code owner August 20, 2025 18:25
@npalm npalm marked this pull request as draft August 20, 2025 20:12
@sdarwin sdarwin changed the title start-runner.ps1 set username - DRAFT - don't merge start-runner.ps1 set username Sep 25, 2025
@sdarwin sdarwin changed the title start-runner.ps1 set username fix: start-runner.ps1 set username Sep 25, 2025
@sdarwin
Copy link
Contributor Author

sdarwin commented Sep 25, 2025

@npalm this PR has been tested and is ready to review.

Closes #4712

@sdarwin sdarwin marked this pull request as ready for review October 4, 2025 13:52
@npalm npalm requested a review from Copilot November 2, 2025 10:49
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors the Windows runner lifecycle management to use scheduled tasks with dynamically generated scripts, and improves CloudWatch agent installation reliability by splitting the process into two distinct phases.

Key changes:

  • Refactored ephemeral Windows runner to generate and execute a separate PowerShell script via scheduled tasks instead of inline execution
  • Split CloudWatch agent installation into download and installation phases to ensure file system synchronization
  • Added clarifying comments about Terraform templating syntax and working directory context

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File Description
modules/runners/templates/start-runner.sh Added comment explaining Terraform template double dollar sign syntax
modules/runners/templates/start-runner.ps1 Refactored to generate start-runner-service.ps1 dynamically and execute via scheduled task; added comments for clarity
images/windows-core-2022/windows-provisioner.ps1 Split CloudWatch agent installation into download and install phases with explanatory comments
images/windows-core-2019/windows-provisioner.ps1 Split CloudWatch agent installation into download and install phases with explanatory comments

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

$taskExecutable = "PowerShell.exe"
$taskArgument = "-File $startRunnerService"
if (Test-Path $startRunnerService) {
Remove-Item "$startRunnerService"
Copy link

Copilot AI Nov 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent indentation detected. This line uses a tab character while surrounding lines use spaces. Convert the tab to spaces to match the file's indentation style.

Suggested change
Remove-Item "$startRunnerService"
Remove-Item "$startRunnerService"

Copilot uses AI. Check for mistakes.

if ( $taskArgument ) {
$action = New-ScheduledTaskAction -WorkingDirectory "$pwd" -Execute "$taskExecutable" -Argument "$taskArgument"
}
Copy link

Copilot AI Nov 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent indentation. Line 223 uses 4 spaces while line 224 uses 2 spaces. The indentation should be consistent with the surrounding code blocks (4 spaces based on the rest of the file).

Suggested change
}
}

Copilot uses AI. Check for mistakes.
Start-Process "msiexec.exe" $cloudwatchParams -Wait -NoNewWindow
Remove-Item C:\amazon-cloudwatch-agent.msi

Write-Host "Creating actions-runner directory for the GH Action installtion"
Copy link

Copilot AI Nov 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Corrected spelling of 'installtion' to 'installation'.

Suggested change
Write-Host "Creating actions-runner directory for the GH Action installtion"
Write-Host "Creating actions-runner directory for the GH Action installation"

Copilot uses AI. Check for mistakes.
Start-Process "msiexec.exe" $cloudwatchParams -Wait -NoNewWindow
Remove-Item C:\amazon-cloudwatch-agent.msi

Write-Host "Creating actions-runner directory for the GH Action installtion"
Copy link

Copilot AI Nov 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Corrected spelling of 'installtion' to 'installation'.

Suggested change
Write-Host "Creating actions-runner directory for the GH Action installtion"
Write-Host "Creating actions-runner directory for the GH Action installation"

Copilot uses AI. Check for mistakes.
@npalm
Copy link
Member

npalm commented Nov 2, 2025

Only in case of ephemeral runners, the instance itself should terminate after the job is completed. This is done by running th runner in the foreground (linux) once the process is finsihed it exits the runner process and terminates the instance.

@sdarwin
Copy link
Contributor Author

sdarwin commented Nov 2, 2025

You write "Only in case of ephemeral runners, the instance itself should terminate after the job is completed."

There was a somewhat new section in start-runner.ps1, doing this:

Write-Host "Terminating instance"
aws ec2 terminate-instances --instance-ids "$InstanceId" --region "$Region"

What the PR is doing now is moving that code into a start-runner-service.ps1 file which is called with the correct username.

  • The terminate-instances command is present.
  • the script is called by a configurable windows username, jobs may run as 'Administrator' or other user.

Copy link
Member

@npalm npalm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sdarwin Running a deployment test based on the multi-runner example in the examples directoruy. The change is not working. I had no time to dig deep. But effect is the job got not processed. Assuming something going wrong in the registration of or starting of the windows runner.

The PR als has a merege conflict, but strange enough github is not showing me the issue.

Would be great if you have some time to check the PR also with the confifguration in the mutli-runner example for windows. Or a similar example.

@sdarwin
Copy link
Contributor Author

sdarwin commented Nov 3, 2025

The PR has a merge conflict

Another modification appeared recently, and caused a merge conflict. I believe it's resolved now.

The change is not working.

I've just tested this, and it worked for me. Microsoft Windows seems to be slow, it took a long time to launch. Set all timeouts such as runner_boot_time_in_minutes and minimum_running_time_in_minutes , etc , to 20 minutes.

There are at least 3 types of runners:

  • ephemeral jit
  • ephemeral non-jit
  • non-ephemeral

In your environment currently, are all of these working with Microsoft Windows 2019/2022/2025? Are you testing them? That is a first step before this PR.

If they are all fine, as-is, then next try this pull request. Let me know specifically for each of these if you are seeing either success or failure, and what happens:

  • ephemeral jit
  • ephemeral non-jit
  • non-ephemeral

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants