Update for downgraded package #25

Closed
MSP-Greg opened this issue May 1, 2025 · 15 comments · Fixed by #26

Comments

@MSP-Greg
Collaborator

MSP-Greg commented May 1, 2025

@eregon & @ntkme

I think I've mentioned that the whole 'system' of creating archives (7z files) is in need of an update.

  1. Several years ago Ruby was released with an upgrade from OpenSSL 1.1 to OpenSSL 3, and I made some bad choices then.
  2. That problem required 'downgrade' archive packages, and I suspected it would occur again, which has now occurred with gcc.
  3. Going forward, this may become more common (build tool/package incompatibility), so I'd like to make it easier to update, and also clearer as to how it works.

So, what I'd like to do is set it up so the package/downgrade info is more 'data/configuration' driven, rather than the current 'code' driven approach.

An example of what I think will be needed is shown below in a new GHA matrix:

matrix:
  include:
    # The below jobs have no package downgrades
    - { os: windows-2022  , file: msys2      , ruby: 3.4   }
    - { os: windows-11-arm, file: msys2-arm64, ruby: head  }
    - { os: windows-2022  , file: mingw64    , ruby: mingw }
    - { os: windows-2022  , file: ucrt64     , ruby: ucrt  }
    - { os: windows-11-arm, file: clangarm64 , ruby: head  }
    - { os: windows-2022  , file: mswin      , ruby: mswin }
    # The below jobs have package downgrades, at present, GCC and/or OpenSSL
    # All downgrade packages must be stored in the 'new name' release
    - { os: windows-2022  , file: ucrt64-gcc14        , ruby: 3.2, downgrade: gcc-14.2.0-3 }
    - { os: windows-2022  , file: ucrt64-gcc14-ssl1.1 , ruby: 3.1, downgrade: gcc-14.2.0-3 openssl-1.1.1.w-1 }
    - { os: windows-2022  , file: mingw64-gcc14-ssl1.1, ruby: 3.0, downgrade: gcc-14.2.0-3 openssl-1.1.1.w-1 }
  1. I intend to use a new release for these packages, so the work can be done without affecting current code in setup-ruby.
  2. One issue is how to push code to setup-ruby for the proper selection of which build tool archive files to install, based on the Ruby version and platform. This code may be messy, since patch versions may be needed for proper selection, eg Ruby 3.4.3 should use gcc-14, but Ruby 3.4.4 should use gcc-15 (assuming backports). I'd like to keep the data here, possibly in a json or yaml format, and transfer it in some way to setup-ruby, which is where the selection occurs.
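
The selection described in the last item could be driven by a small data table. The sketch below is hypothetical: the `TOOLCHAIN_RULES` shape, the `selectArchive` function, and every version boundary are assumptions for illustration (the 3.4.4 boundary only mirrors the "assuming backports" example), not the format setup-ruby uses.

```javascript
// Hypothetical rule table: the first matching rule wins.
// Archive names follow the proposed matrix; boundaries are illustrative.
const TOOLCHAIN_RULES = [
  { platform: 'ucrt64',  below: '3.2.0', archive: 'ucrt64-gcc14-ssl1.1' },
  { platform: 'ucrt64',  below: '3.4.4', archive: 'ucrt64-gcc14' },
  { platform: 'ucrt64',  archive: 'ucrt64' },
  { platform: 'mingw64', below: '3.1.0', archive: 'mingw64-gcc14-ssl1.1' },
  { platform: 'mingw64', archive: 'mingw64' },
];

// Compare dotted numeric versions; returns -1, 0, or 1.
function compareVersions(a, b) {
  const pa = a.split('.').map(Number);
  const pb = b.split('.').map(Number);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const diff = (pa[i] || 0) - (pb[i] || 0);
    if (diff !== 0) return Math.sign(diff);
  }
  return 0;
}

// Pick the archive for a platform and Ruby version. Non-numeric versions
// such as 'head' skip every rule that has a 'below' boundary.
function selectArchive(platform, rubyVersion, rules = TOOLCHAIN_RULES) {
  const numeric = /^\d+(\.\d+)*$/.test(rubyVersion);
  const rule = rules.find(r =>
    r.platform === platform &&
    (r.below === undefined ||
      (numeric && compareVersions(rubyVersion, r.below) < 0)));
  if (!rule) {
    throw new Error(`No build tool archive for ${platform} / Ruby ${rubyVersion}`);
  }
  return rule.archive;
}
```

Because the rules are plain data, the table could live in this repo as JSON and be copied to setup-ruby, with only the two small functions kept as code.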

Any thoughts?

@hsbt I'm not sure where to look. Is there a stated policy regarding backports for updated build tool/package issues? If they are backported to 'bug' supported versions, would they also be done for Ruby versions in 'security only' updates? Thanks for adding the Ubuntu gcc-15 CI...

*EDIT: I had not mentioned this: MSYS2 uses pacman as its package manager, and, unlike some other popular package managers, pacman only allows one version of a package to be installed at a time. So, to downgrade a package, we need to store the downgrade package (MSYS2 only keeps old packages for a limited time), and use a different install command.
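
One way to install a stored package file with pacman is its `-U` (upgrade) operation, which takes package files instead of names from the sync database. A minimal sketch, assuming the stored files follow MSYS2's usual `<prefix>-<name>-<version>-any.pkg.tar.zst` naming; `downgradeArgs`, the prefix table, and the directory argument are hypothetical, and a real gcc downgrade would probably need related packages as well:

```javascript
// MSYS2 package-name prefixes per environment (assumed from MSYS2 naming).
const PACKAGE_PREFIX = {
  ucrt64: 'mingw-w64-ucrt-x86_64',
  mingw64: 'mingw-w64-x86_64',
};

// Build the pacman argument list for a matrix 'downgrade' value such as
// 'gcc-14.2.0-3 openssl-1.1.1.w-1'. 'dir' is where the stored package
// files were downloaded to (hypothetical location).
function downgradeArgs(env, downgrade, dir) {
  const prefix = PACKAGE_PREFIX[env];
  if (!prefix) throw new Error(`Unknown MSYS2 environment: ${env}`);
  const files = downgrade
    .split(/\s+/)
    .filter(Boolean)
    .map(spec => `${dir}/${prefix}-${spec}-any.pkg.tar.zst`);
  // -U installs from package files; --noconfirm skips interactive prompts.
  return ['pacman', '-U', '--noconfirm', ...files];
}
```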

@ntkme
Contributor

ntkme commented May 1, 2025

This is the problem with serving the compiler tool chain as part of setup-ruby for users’ convenience. We could have said: if you need the tool chain, run ridk install or use the msys2/setup-msys2 action. However, as the community has been relying on this for so long, it’s too late to make this change.

As we will likely continue to provide bundled tool chains, I have a few comments if we actually get a chance to rework this from scratch:

  • Do not rely on the preinstalled msys2 in the hosted x64 runners; let’s just install and snapshot a full msys2 somewhere inside c:/hostedtoolcache/windows. This avoids any future runner image inconsistency that could make the delta package problematic, and it also improves compatibility with self-hosted runners. In reality the delta package only saves a few seconds at most per run on hosted runners; I think it’s a premature optimization that’s simply not worth it.
  • Consider bundling Windows Ruby and msys2 as a single package by pre-running the setup in ruby/ruby-builder. If a user chooses not to have the tool chain, we can still download the same package and just skip adding msys2 to PATH. A single package of preinstalled Ruby and msys2 would also effectively give us an “immutable” msys2 snapshot that is tied to each Ruby release. The downside is that users won’t get any updates automatically (especially security updates to OpenSSL), so we would have to manually update the “immutable” snapshots for the different Ruby versions every time. The upside is that, because it’s effectively a frozen environment, it won’t have regressions caused by upgrades. It also removes the need to maintain, in setup-ruby, a mapping of which tool chain version to download for which Ruby.

@eregon
Member

eregon commented May 1, 2025

The proposed names LGTM.

4. I'd like to keep the data here, possibly in a json or yaml format, and transfer it in some way to setup-ruby, which is where the selection occurs.

Yes, JSON would be fine. It could also be code as a .js file without dependencies (or only well-known ones like actions toolkit core, but a bit dangerous as we could have different versions of that package).
Either way we could copy the file from this repo to setup-ruby every time it's needed.
I'm also OK with just a snippet of code/a function we copy to setup-ruby and mention where it comes from.

  • In reality the delta package only saves a few seconds max per run on hosted runners, I think it’s a premature optimization that’s simply not worth it.

I don't recall the exact numbers but extraction is unfortunately very slow on Windows (downloading is fast), picking for example https://github.com/ruby/setup-ruby/actions/runs/14765135437/job/41454925639?pr=761#step:3:26 gives:
Extracting msys2 build tools
Took 45.18 seconds
Extracting ucrt64-3.0 build tools
Took 35.48 seconds

I don't know if that one is incremental or not, but yeah, the motivation for "just download the delta" is to speed that up.
It'd be good to have some numbers to compare. Adding a few seconds when it already takes over a minute is probably fine, but adding something like a minute seems not so good.

I agree though in principle it would be simpler and safer to ignore any preinstalled msys2. Though could that cause problems if two msys2 installations exist, especially if one is in a standard location and we might end up using a mix of both?
Is the preinstalled msys2 on PATH?

  • Consider bundle windows ruby and msys2 as a single package by pre-running the setup in ruby/ruby-builder.

Interesting. Where would we store such bundled ruby + toolchain? Here maybe?
One thing is I don't want to handle building or rebuilding those, because I know too little about Windows, and I don't like rebuilding releases (doing so in ruby-builder for non-Windows is a pain).
I think this approach is only viable if we very, very rarely need to rebuild. Not sure whether that's the case: over time C extensions might expect a newer compiler/C standard, and Visual Studio compilers seem to have frequent bugs (from what I see on https://bugs.ruby-lang.org/), so staying on an old version might not be feasible. Though Visual Studio is no concern here, since I suppose we use the one from the image.

@ntkme
Contributor

ntkme commented May 1, 2025

I don't recall the exact numbers but extraction is unfortunately very slow on Windows

Unfortunately, this is a result of weak x64 hardware on default hosted runners. If you look at the windows-11-arm runners, which currently download a full msys2 package we built from scratch, the extraction takes less than 15 seconds (roughly 3x faster) even though the 7z archive is twice as large.

If download is fast, but decompression is a performance concern due to weak hardware, a simple optimization is to reduce the compression ratio or use an archive without compression. Download size will increase but extraction will be much faster. Also keep in mind that executable files in general don’t compress that well, so for an archive where binaries make up the majority of the size, it’s probably better to use a plain archive with compression level 0 (e.g. something equivalent to a plain tar file without compression).

One thing that I need to test is whether the slowness of decompression is due to weak CPU or weak disk I/O. If it’s weak CPU, then the no-compression optimization is worth doing, but if it’s weak disk I/O, then there is not much we can do.
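
For 7-Zip, storing without compression corresponds to the `-mx=0` switch. The sketch below shows how such an archive could be created and how extraction could be timed; `archiveArgs` and `timeExtraction` are hypothetical helpers, and `timeExtraction` assumes a `7z` executable on PATH. Timing a level-0 archive against a compressed one on the same drive isolates CPU cost, while timing the same archive on two drives isolates disk I/O.

```javascript
const { spawnSync } = require('node:child_process');

// Arguments for '7z a' with a given compression level (0 = store only).
function archiveArgs(archive, sourceDir, level = 0) {
  if (!Number.isInteger(level) || level < 0 || level > 9) {
    throw new RangeError(`Compression level must be 0..9, got ${level}`);
  }
  return ['a', '-t7z', `-mx=${level}`, archive, sourceDir];
}

// Extract 'archive' into 'destination' and return the elapsed seconds.
function timeExtraction(archive, destination) {
  const start = process.hrtime.bigint();
  // -o sets the output directory (no space); -y answers yes to prompts.
  const result = spawnSync('7z', ['x', archive, `-o${destination}`, '-y']);
  if (result.error) throw result.error;
  if (result.status !== 0) throw new Error(`7z exited with ${result.status}`);
  return Number(process.hrtime.bigint() - start) / 1e9;
}
```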

@ntkme
Contributor

ntkme commented May 1, 2025

I did a quick benchmark... The issue is weak disk I/O on the C: drive of x64 runners. If we extract the same archive to the D: drive it's about 5 times faster, so even with a 2x archive size that's still about 3x faster (you can compare the msys2.7z vs msys2-arm64.7z times).

The annoying inconsistency is that x64 runner has a slow C: and a fast D:, but arm64 runner only has a fast C:.

on:
  push:

jobs:
  test:
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        os: [windows-2019, windows-2022, windows-2025, windows-11-arm]
        archive: [msys2, msys2-arm64]
        destination: [C:\msys2, D:\msys2]
        exclude:
          # arm64 runners do not have D: drive
          - os: windows-11-arm
            destination: D:\msys2
    steps:
      - name: Download ${{ matrix.archive }}.7z
        run: |
          (New-Object System.Net.WebClient).DownloadFile("https://github.com/ruby/setup-msys2-gcc/releases/download/msys2-gcc-pkgs/${{ matrix.archive }}.7z", "${{ matrix.archive }}.7z")
      - name: Extract ${{ matrix.archive }}.7z to ${{ matrix.destination }}
        run: |
          7z x ${{ matrix.archive }}.7z -o${{ matrix.destination }}

@ntkme
Contributor

ntkme commented May 1, 2025

@eregon So regarding the decompression cost, there is an easy solution: just detect which drive $RUNNER_TEMP is on and extract to a fixed location on the same drive. This will save about 30-40s even if we distribute a full msys2 for x64 runners.
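
A minimal sketch of that drive detection, using Node's `path.win32` so it behaves the same on any host OS. The function name `toolchainDir` and the directory name `ruby-msys2` are assumptions for illustration:

```javascript
const path = require('node:path');

// Return a fixed directory on the same drive as RUNNER_TEMP,
// e.g. 'D:\a\_temp' -> 'D:\ruby-msys2'. The directory name is hypothetical.
function toolchainDir(runnerTemp, name = 'ruby-msys2') {
  const { root } = path.win32.parse(runnerTemp);
  if (!/^[A-Za-z]:[\\/]$/.test(root)) {
    throw new Error(`Cannot determine drive from RUNNER_TEMP: ${runnerTemp}`);
  }
  return path.win32.join(root, name);
}
```

In an action this would be called as `toolchainDir(process.env.RUNNER_TEMP)`, so x64 runners would extract to D: and arm64 runners to C: without any per-image special cases.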

@eregon
Member

eregon commented May 1, 2025

Thank you for benchmarking that. Do you have a workflow run link just for curiosity?
I don't recall if we use D: or not for MSYS2 and toolchain stuff, I guess we need to use C: at least for the delta cases?

I think having the full toolchain and extracting it to the fast disk would be the best solution: simple, consistent (across windows images), reliable (less affected by the image pre-installed MSYS2), fast.

@ntkme
Contributor

ntkme commented May 1, 2025

@eregon
Member

eregon commented May 1, 2025

weak x64 hardware on default hosted runners

From https://docs.github.com/en/actions/using-github-hosted-runners/using-github-hosted-runners/about-github-hosted-runners#standard-github-hosted-runners-for-public-repositories given the same RAM/cores/storage it gives the impression they are the same machines or similar, and the linux ones seem pretty decent speedwise. So yeah more of a disk I/O issue as you said.

@eregon
Member

eregon commented May 1, 2025

My only concern left about that approach is:

Though could that cause problems if two msys2 installations exist, especially if one is in a standard location and we might end up using a mix of both?
Is the preinstalled msys2 on PATH?

@ntkme
Contributor

ntkme commented May 1, 2025

Another note: I just checked the msys2/setup-msys2 action and found that by default it installs to $RUNNER_TEMP/msys2. I guess they figured out the I/O performance problem...

https://github.com/msys2/setup-msys2/blob/a5d2c5a565c520efa5f477391e4e3f87c2e08f46/action.yml#L31-L34
https://github.com/msys2/setup-msys2/blob/a5d2c5a565c520efa5f477391e4e3f87c2e08f46/main.js#L347-L348

To avoid a clash, maybe we can use something like $RUNNER_TEMP/ruby-msys2?

@MSP-Greg
Collaborator Author

MSP-Greg commented May 1, 2025

Good morning (my time). I think the distinction between C: & D: has been brought up before, possibly related to where the Ruby builds are placed. Also, whether to change ENV setting re the 'temp' folder.

A long time ago, the GHA hardware docs made clear that the Ubuntu & Windows hardware was the same, and they contained both an SSD and an HDD. If you look at the 'Physical Disk' step in windows jobs at https://github.com/MSP-Greg/actions-image-testing/actions/runs/14765640250/job/41456379175#step:6:43, you see different PhysicalSectorSize with the two disks. Typically, HDD are 4096, and SSD are 512. So, at present, C: is an HDD, D: is an SSD. There could be 'noisy neighbor' issues with both. Regardless, the SSD should always be faster (as we all know). This could change in the future.

Re moving build files to D:, I don't see an issue, we can always rename C:/msys64 and add a symlink to the D: drive location.
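
The rename-and-symlink idea could look roughly like the sketch below. `redirectInstall` and the `-orig` suffix are hypothetical, and the `'junction'` link type is an assumption chosen because directory junctions on Windows do not require elevated privileges (Node ignores the type argument on other platforms):

```javascript
const fs = require('node:fs');

// Move an existing install out of the way, then make the old path
// point at the new location (e.g. C:/msys64 -> a directory on D:).
function redirectInstall(oldPath, newPath) {
  if (fs.existsSync(oldPath)) {
    fs.renameSync(oldPath, `${oldPath}-orig`);
  }
  fs.symlinkSync(newPath, oldPath, 'junction');
}
```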

Also, we could then have a single file for download, instead of one for MSYS2 and one for the ucrt/mingw/clang build tools. That might help with the speed issue. 'mswin' will need an MSYS2 package, as it's installed for 'full' bash support.

This is the problem of serving the compiler tool chain as part of setup-ruby for users’ convenience. We could have said if you need tool chain run ridk install or use msys2/setup-msys2 action.

Re ridk install and other MSYS2 related commands, just no. There are very few Ruby coders that are familiar with Windows. The framework provided by setup-ruby and setup-ruby-pkgs needs to hide all of that. Many people don't have access to a Windows system, especially one with dev tools (MSYS2, MSVC/Visual Studio, and vcpkg).

Today, there are many repos that require compiling ext code, and many work on Windows with no 'Windows specific' steps in the workflow files. Many of those are running CI on both Windows MSYS2 and 'mswin' builds. I can't remember where, but recently a well respected Ruby coder stated that setup-ruby in Windows 'just works'.

Another reason to keep the code in setup-ruby is that it creates a central location for fixes, whether temporary or permanent. If repos were using ridk install or something similar, all those workflow scripts would need to change to account for the recent gcc-15 problems. As it stands, we can implement a temp fix, and soon after a permanent fix.

Off-topic history

@ntkme sorry, you may know all this, not sure...

I've been involved with 'Windows' Ruby since the conversion from MSYS (last Ruby was 2.3) to MSYS2 (started with 2.4). Back then, everyone used Travis & Appveyor. If one wanted a repo to run CI on Windows, one pretty much had to write the Appveyor script/yaml file. In addition, almost all Ruby head builds were unstable (all OS's, many not tested), and it was a mess.

Then, GHA was available (Windows 2016 & 2019), and both Benoit and I separately started working on code similar to setup-ruby. Initially GitHub staff were writing the setup code themselves. After some time, we merged our code, GitHub gave up on writing their own code, and all the code moved into the ruby org.

@ntkme
Contributor

ntkme commented May 1, 2025

There are very few Ruby coders that are familiar with Windows.

Completely agreed. I was just bringing up a slim possibility from a historical point of view. In fact, I'd even say that in general very few coders are familiar with infrastructure and operations; they only care that their code "just works".

Off-topic history

Thanks for sharing. I only knew the GitHub part of it from https://github.com/actions/setup-ruby/blob/main/README.md.

Now that we know these runners better, and know corner cases like the openssl/gcc incompatibility better, we can definitely make the whole workflow better after more thought. I have a few ideas that I will probably try to prototype later this week.

@ntkme
Contributor

ntkme commented May 4, 2025

I have the big change ready for review. The features we've discussed in this thread are added, and much more. Please see each PR's description for a summary of changes:

It will be somewhat controversial as it's a complete rewrite for this repository, but I believe it's better in almost every way.

Users who pin setup-ruby@v1.x.y or a more specific revision, rather than setup-ruby@v1, will stop receiving new updates to msys2 packages after this change, effectively making it a frozen version, but nothing will break as long as we keep the existing release tag around.

@ntkme ntkme mentioned this issue May 13, 2025