[Docker]: Retooled the Dockerfile to use a base image from r-minimal. #108
|
Docker image stuff is where I must defer to @colinleach. The one thing I can comment on is that the |
|
I'm not into the detail of this yet, but building Tidyverse stuff takes forever. It relies on g++ to compile what seems like millions of lines of C++ source. I've seen 13 mins or more to install it locally (recent-model Ryzen 9, 32 GB RAM, high-end NVMe drive). GitHub CI doesn't cache the build between runs. |
|
This image takes forever to build from scratch. But from testing on my machine, it takes very little time once the initial image is built. I think the time goes into manually compiling and building everything ... after which all the compilers and other things like RStudio, docs, and a bunch of other stuff are removed. |
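As a rough illustration of why the first build is slow and later ones are quick, here is a minimal Dockerfile sketch of the compile-then-strip pattern. It assumes the Docker Hub image `rhub/r-minimal` and its bundled `installr` helper; the package name, working directory, and entrypoint path are hypothetical and not taken from this PR. A real build would likely also need extra system libraries.

```dockerfile
# Sketch only: package, paths, and entrypoint below are assumptions.
FROM rhub/r-minimal

# Slow, rarely changing layer. With -d, installr temporarily installs the
# C/C++ compilers, builds the package from source, and removes the
# compilers again, so they never end up in the final image.
RUN installr -d tidyverse

# Fast, frequently changing layer. Copying the runner files last means that
# edits to them do not invalidate the cached package layer above.
WORKDIR /opt/test-runner
COPY . .
ENTRYPOINT ["/opt/test-runner/bin/run.sh"]
```

On a machine that keeps its layer cache, only the final `COPY` step reruns after a script change. On a CI runner that starts with an empty cache, the package layer is rebuilt every time.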
|
This sort of size reduction (2.87 GB to 209 MB for my local images) instinctively feels too good to be true! However, so far it has passed every test I've thrown at it. For comparison, I looked at my ~/R/ directory where R itself and all the packages are installed. That's 341.7 MB on disk, so substantially larger than the entire docker image. However, the biggest part of that is the I'll give my approval for now, and keep my fingers crossed as I keep testing... |
|
Still looking good! In case anyone wonders which R packages are available in the image, this is a listing of There are things there we don't directly need, but I suspect they may be dependencies of important stuff, so removing them will break stuff. |
|
Happy that the testing is going well! Thrilled that we might be on the right track. I couldn't sleep, so I did some rough research early this morning. There appear to be some techniques for caching binaries and layers to speed up image build times. I am going to do some reading and see if any of it makes a difference. A 4 min build + test seems bearable, but 8+ min is ..... sorta ugly for CI if you are going to be iterating on a bunch of changes. And it would be really, really nice if CI took 2 min or less. Here are some resources I've found:
|
|
No big surprise, but I also confirmed the image uses the latest : |
|
It would appear that using a prebuilt binary is the way to cut down on build time. Small problem: both Posit and CRAN do these with We can do our own...but .... details, details. Sigh. |
|
I'm not sure that maintaining our own binaries is really more appealing than accepting 8-minute builds? Base R is pretty stable, but there's quite a bit of development on the dozens of packages we're installing. |
|
Yeah. Maintaining the binaries would be ... ugh. And I think that we are building and pushing the image to a repo for production - so the time sink is here in the test-runner repo. |
|
Assuming that @depial is now happy with the v3 test runner, we might only be rebuilding the docker image every month or two, for dependabot stuff. I can't get too upset about something I can walk away from for a few minutes while it runs. Though that reminds me, I need to check how dependabot is set up. It regularly updates the Julia test runner with new versions, I think R is less active. |
|
ooh. so tempted to try and rip out googledrive and googlesheets from this. But it also ... works? So maybe we don't. |
Yep! I tried to get the important refactoring out of the way in the last PR for |
A thought I already had! At least they're only 2.2MB and 0.7MB respectively, and I suspect something will break if we remove them. The Tidyverse collection is designed to all work together, and there is a complex web of dependencies. |
Wanna write analysis rules for Python exercises? (JOKING) |
I should note that this is really quite the innovation in the R world! Traditionally, things just sort of ... happened |
|
@colinleach @depial - I'll let one of the two of you do the honors...... 😄 |
|
Bethany, this is your (non-trivial!) achievement. Press the button! |
|
Pushing the button....hoping we don't have to revert...... |
|
YAY it works!! 🎉 Putting a note here so we don't forget. One of the major changes that |
|
Something to explore when we get to writing a I'm currently writing up string stuff, some of which is locale-dependent (things like upper/lower case). I don't know if that's also impacted. |
|
The best place to check is at the |
Examples of how it can be adapted for R and Python: exercism/r@main...keiravillekode:r:verify-exercises-in-docker exercism/python@main...keiravillekode:python:verify-exercises-in-docker |
Thank you! I think we should definitely set something up for R, and your script looks like a good place to start. The Python track already does this in the content repo CI as well as the Runner CI. 😄 And the runner repo has a One of the issues with verifying exercises in the R CI is that currently the image takes 8 minutes to build, which is an eternity. So then waiting to verify all the exercises is just .... mean. 😂 But triggering things manually with a |
I think this succeeds in getting the image down to ~210 MB. But we probably want to do more testing than the tests in run-tests-in-docker.sh. Let me know if there are more libs needed than the ones listed. More information on r-minimal can be found here.
Not quite sure it used the latest image, which uses R 5.4.2, so we can be explicit about it if needed.
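If being explicit turns out to be needed, one option is to pin the base image tag through a build argument. This is a minimal sketch, assuming the Docker Hub image `rhub/r-minimal`; the `R_VERSION` argument name is hypothetical, and the default should be replaced with whichever published tag is actually intended.

```dockerfile
# Hypothetical build argument; set the default to the intended R version tag.
ARG R_VERSION=latest
FROM rhub/r-minimal:${R_VERSION}
```

The tag can then be overridden at build time with `docker build --build-arg R_VERSION=<tag> .` without editing the Dockerfile.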