Merged
6 changes: 6 additions & 0 deletions .gitmodules
@@ -10,3 +10,9 @@
[submodule "2025-HPCIC/tutorial-code/thicket-tutorial"]
path = 2025-HPCIC/tutorial-code/thicket-tutorial
url = https://github.com/llnl/thicket-tutorial
[submodule "2025-eScience/tutorial-code/thicket-tutorial"]
path = 2025-eScience/tutorial-code/thicket-tutorial
url = https://github.com/llnl/thicket-tutorial
[submodule "2025-eScience/tutorial-code/caliper-tutorial"]
path = 2025-eScience/tutorial-code/caliper-tutorial
url = https://github.com/daboehme/caliper-tutorial.git
139 changes: 139 additions & 0 deletions 2025-eScience/README.rst
@@ -0,0 +1,139 @@
======================
eScience 2025 Tutorial
======================

This directory contains the materials for the eScience 2025 tutorial. The following subsections describe the contents of these materials.

--------
Contents
--------

^^^^^^^^^^^^^
Tutorial Code
^^^^^^^^^^^^^

The code elements of this tutorial (e.g., Jupyter notebooks, command-line scripts, Markdown/RST instruction files) can all be found in the :code:`tutorial-code` subdirectory.
Materials that live in other git repositories are included in this subdirectory as git submodules.
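
As a rough sketch of how the submodule-backed materials could be located after cloning, the helper below lists the submodule paths recorded for one tutorial directory. The function name :code:`list_submodule_paths` is made up for this example, and it assumes the :code:`path = <dir>` layout that this repository's :code:`.gitmodules` uses.

```shell
#!/usr/bin/env bash
# Sketch only: list_submodule_paths is a hypothetical helper, not part of this repository.

# Print the submodule paths from a .gitmodules file that sit under one tutorial directory.
# $1: path to the .gitmodules file, $2: tutorial directory (e.g., 2025-eScience)
list_submodule_paths() {
  local gitmodules="$1" tutorial_dir="$2"
  # Strip the "path = " prefix from each path line, then keep only this tutorial's entries.
  sed -n 's/^[[:space:]]*path[[:space:]]*=[[:space:]]*//p' "$gitmodules" \
    | grep "^${tutorial_dir}/"
}
```

From the repository root, the output can be handed to git, for example :code:`list_submodule_paths .gitmodules 2025-eScience | xargs git submodule update --init --recursive --`.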

^^^^^^
Slides
^^^^^^

The slides used in presenting this tutorial can be found in the :code:`slides` subdirectory.

^^^^^^
Docker
^^^^^^

The Docker definition files (i.e., Dockerfiles) for all the necessary containers can be found in the :code:`docker` subdirectory. There are currently 5 definition files:

1. :code:`Dockerfile.caliper`: builds Caliper and Adiak on top of the :code:`fluxrm/flux-sched:jammy` image from DockerHub
2. :code:`Dockerfile.thicket`: builds Thicket on top of the image produced by :code:`Dockerfile.caliper`
3. :code:`Dockerfile.benchpark`: downloads and bootstraps Benchpark on top of the image produced by :code:`Dockerfile.thicket`
4. :code:`Dockerfile.spawn`: downloads tutorial materials, installs any remaining necessary packages, and does other setup work on top of the image produced by :code:`Dockerfile.benchpark`
5. :code:`Dockerfile.init`: ensures user permissions are correct using the super-minimal :code:`alpine/git` image from DockerHub

"""""""""""""""""""""""""""""""""""""""
Testing the Builds of the Docker Images
"""""""""""""""""""""""""""""""""""""""

To enable automated testing of the Docker images, make all edits to the Dockerfiles above in a branch with an open PR. While a PR is open, a GitHub Actions CI workflow
runs and checks that the images can be built. To configure the CI, edit the :code:`github_ci_matrix.json` file in the root of this repository as follows:

1. Set the "tag" field to the tag (i.e., version) of the Docker images you will generate
2. Set the "tutorial_dir" field to the name of this directory

The CI reads :code:`github_ci_matrix.json` to get values shared by the matrices of all GitHub Actions jobs.
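
A quick local check before opening the PR might look like the sketch below. It assumes that "tag" and "tutorial_dir" are stored as plain string values in :code:`github_ci_matrix.json`; the real file may contain other fields, which this check ignores. The function name :code:`check_ci_matrix` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: a pre-PR sanity check for github_ci_matrix.json.
# Assumes "tag" and "tutorial_dir" are plain string values in the file.

# Succeeds when the file has a non-empty "tag" and "tutorial_dir" equals $2.
# $1: path to github_ci_matrix.json, $2: expected tutorial directory name
check_ci_matrix() {
  local file="$1" expected_dir="$2"
  # A non-empty quoted string must follow the "tag" key.
  grep -Eq '"tag"[[:space:]]*:[[:space:]]*"[^"]+"' "$file" || return 1
  # The "tutorial_dir" key must hold exactly the expected directory name.
  grep -Eq "\"tutorial_dir\"[[:space:]]*:[[:space:]]*\"${expected_dir}\"" "$file"
}
```

For this directory, the check would be run from the repository root as :code:`check_ci_matrix github_ci_matrix.json 2025-eScience`.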

"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
Pushing the Docker Images to GitHub Container Registry (GHCR)
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""

Before trying to push to GHCR, someone with the necessary permissions should make sure this repo can push to these images in GHCR (**change names when we decide on appropriate ones**):

* ghcr.io/llnl/caliper
* ghcr.io/llnl/thicket
* ghcr.io/llnl/benchpark
* ghcr.io/llnl/reproducible-benchmarking-spawn
* ghcr.io/llnl/reproducible-benchmarking-init

If these images do not yet exist, your first push will properly set the permissions. If these images do exist, follow the instructions
`here <https://docs.github.com/en/packages/learn-github-packages/configuring-a-packages-access-control-and-visibility#ensuring-workflow-access-to-your-package>`_
to add this repository to each package. Make sure to grant "Write" permissions to the repository while doing this.

Once this repository has the necessary permissions, push the Docker images to GHCR as follows:

1. Make sure all changes to the Dockerfiles have been merged into the :code:`main` branch
2. From the GitHub webpage, navigate to the "Actions" tab
3. On the left of the resulting page, click on "Build containers and push to GHCR"
4. Click on the "Run workflow" button to the right of the page
5. In the popup menu that appears, select the "main" branch and fill out the requested information
6. Click the green "Run workflow" button to start the process of building and pushing the images

^^^^^^^^^^^^^^
Infrastructure
^^^^^^^^^^^^^^

All the infrastructure needed to deploy the tutorial to a Kubernetes cluster with JupyterHub is contained in the :code:`infrastructure` subdirectory.
This infrastructure is generated by the tool `here <https://lc.llnl.gov/gitlab/lumsden1/hpcic-k8s-configurer>`_.
The infrastructure can be regenerated as-is using :code:`infrastructure/config.toml`.

----------------------------
Testing the Tutorial Locally
----------------------------

To test the tutorial locally, you first need to build all the Docker images except the init image. Before building,
keep in mind the following dependencies between images:

.. code-block::

   ghcr.io/llnl/caliper --> ghcr.io/llnl/thicket --> ghcr.io/llnl/benchpark --> ghcr.io/llnl/reproducible-benchmarking-spawn

Because of these dependencies, the first thing you should figure out is which (if any) images you need to build locally.
If a Dockerfile has changes that are **not** on GHCR, you will need to build that image *and all downstream images (based on the flowchart above)*
locally before testing. To build an image locally, run the following from this directory (**not the** :code:`docker` **directory**):

.. code-block:: bash

   $ docker build -t <image_name> -f ./docker/<dockerfile_for_image> . # Note the trailing "."

In the command above, :code:`<image_name>` should be one of the GHCR URLs above, followed by a colon, followed by a tag. It could look something
like :code:`ghcr.io/llnl/benchpark:escience-2025`. Note that :code:`<image_name>` **must** match the value of the :code:`FROM` directive
in the Dockerfile of the dependent image. For example, to get the :code:`<image_name>` field for :code:`ghcr.io/llnl/benchpark`, look for the :code:`FROM` directive
in :code:`./docker/Dockerfile.spawn`.
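
The rebuild rule can be sketched as a small helper that prints, without running, one :code:`docker build` command for the changed image and each image downstream of it. The short names (:code:`caliper`, :code:`thicket`, :code:`benchpark`, :code:`spawn`) and the function names are made up for this example. The sketch also assumes a single tag is used for every image, so check each :code:`FROM` directive before relying on the printed commands.

```shell
#!/usr/bin/env bash
# Sketch only: prints (does not run) docker build commands.
# Short image names and the single shared tag are assumptions of this example.

# Build order, taken from the dependency chain in this README.
CHAIN=(caliper thicket benchpark spawn)

# Map a short name to the GHCR image name listed in this README.
ghcr_name() {
  case "$1" in
    spawn) echo "ghcr.io/llnl/reproducible-benchmarking-spawn" ;;
    *)     echo "ghcr.io/llnl/$1" ;;
  esac
}

# Print one build command per image that needs rebuilding, starting at the
# changed image and following the chain downstream.
# $1: short name of the changed image, $2: tag to apply
print_build_commands() {
  local changed="$1" tag="$2" found=0 name
  for name in "${CHAIN[@]}"; do
    if [[ "$name" == "$changed" ]]; then
      found=1
    fi
    if (( found )); then
      echo "docker build -t $(ghcr_name "$name"):${tag} -f ./docker/Dockerfile.${name} ."
    fi
  done
  if (( ! found )); then
    echo "unknown image: ${changed}" >&2
    return 1
  fi
}
```

For example, :code:`print_build_commands thicket <tag>` prints the build commands for the thicket, benchpark, and spawn images, in that order.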

If all the changes to the corresponding Dockerfiles in :code:`docker` have already been pushed to GHCR, you do not need to build locally.
Instead, you should just pull the spawn image using:

.. code-block:: bash

   $ docker pull ghcr.io/llnl/reproducible-benchmarking-spawn:<tag>

You should replace :code:`<tag>` in the command above with the GHCR tag of the image you want to pull.

Once you have a spawn image (either built locally or pulled from GHCR), you can run it locally
with the following command:

.. code-block:: bash

   $ docker run --rm -it --entrypoint <entrypoint> --name reproducible_benchmark_tutorial_local -p 8888:8888 <spawn_image_name>

In the command above, :code:`<spawn_image_name>` is the name of the built spawn image. If you built that image locally, this argument
should match the value you passed to the :code:`-t` flag of :code:`docker build` when building the spawn image. If you pulled the image
from GHCR, this argument should be :code:`ghcr.io/llnl/reproducible-benchmarking-spawn:<tag>`.

The :code:`<entrypoint>` field in the command above dictates what command runs within the container immediately after startup.
It can be one of three values:

1. :code:`/local-entrypoint.sh`: this entrypoint script will start a JupyterLab instance and make it available from outside the container.
2. :code:`/entrypoint.sh`: this entrypoint script will run :code:`jupyterhub-singleuser`. It is intended for use in the cloud JupyterHub deployment and should not be used locally.
3. :code:`bash`: by specifying :code:`bash` (or any other shell installed in the container), you will get command-line access to the container, instead of a Jupyter environment.
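
To make the choice concrete, the sketch below assembles the :code:`docker run` command for the two entrypoints that make sense locally. The mode names :code:`lab` and :code:`shell` and the function name are made up for this example. The command is only printed, so it can be reviewed before running.

```shell
#!/usr/bin/env bash
# Sketch only: prints the "docker run" command from this README for a chosen mode.
# The mode names "lab" and "shell" are hypothetical labels for this example.

# $1: mode ("lab" or "shell"), $2: spawn image name including its tag
print_run_command() {
  local mode="$1" image="$2" entrypoint
  case "$mode" in
    lab)   entrypoint="/local-entrypoint.sh" ;;  # JupyterLab reachable from outside the container
    shell) entrypoint="bash" ;;                  # command-line access instead of Jupyter
    *)     echo "mode must be 'lab' or 'shell'" >&2; return 1 ;;
  esac
  echo "docker run --rm -it --entrypoint ${entrypoint} --name reproducible_benchmark_tutorial_local -p 8888:8888 ${image}"
}
```

The :code:`/entrypoint.sh` script is deliberately left out, because it is intended for the cloud JupyterHub deployment rather than local use.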

At this point, you should either have a Jupyter URL that you can use to access Jupyter, or you should have shell access to the container.
You can now do whatever local testing you want of the image.

------------------------------------
Deploying the Tutorial to Kubernetes
------------------------------------

TBA
58 changes: 58 additions & 0 deletions 2025-eScience/docker/Dockerfile.benchpark
@@ -0,0 +1,58 @@
# Copyright 2025 Lawrence Livermore National Security, LLC and other
# Benchpark developers. See the top-level COPYRIGHT file for details.
#
# SPDX-License-Identifier: Apache-2.0

# For testing
# FROM test-thicket

FROM ghcr.io/llnl/thicket:hpcic-2025

USER root

ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && \
apt-get install -y --no-install-recommends \
wget \
gzip \
lsb-release \
patch \
tar \
unzip \
xz-utils \
zstd \
bzip2 \
liblapack-dev \
libblas-dev \
&& rm -rf /var/lib/apt/lists/*

SHELL [ "/bin/bash", "-c" ]

USER ${NB_USER}

RUN git clone https://github.com/LLNL/benchpark.git ${HOME}/benchpark && \
cd ${HOME}/benchpark && \
git checkout -b develop-2025-08-25 develop-2025-08-25 && \
git submodule update --init --recursive

USER root

RUN . /opt/global_py_venv/bin/activate && \
python3 -m pip install -r ${HOME}/benchpark/requirements.txt

RUN echo 'export PATH=${HOME}/benchpark/bin:$PATH' >> ${HOME}/.bashrc

RUN echo 'export PATH=${HOME}/benchpark/bin:$PATH' >> ${HOME}/.bash_profile

RUN chmod -R 777 ~/ ${HOME}

WORKDIR ${HOME}

RUN mkdir -p ${HOME}/.local/share && \
chmod 777 ${HOME}/.local/share

USER ${NB_USER}

# Run this to trigger bootstrap
RUN . /opt/global_py_venv/bin/activate && \
${HOME}/benchpark/bin/benchpark bootstrap
131 changes: 131 additions & 0 deletions 2025-eScience/docker/Dockerfile.caliper
@@ -0,0 +1,131 @@
# Copyright 2025 Lawrence Livermore National Security, LLC and other
# Benchpark developers. See the top-level COPYRIGHT file for details.
#
# SPDX-License-Identifier: Apache-2.0

# FROM ubuntu:noble
FROM fluxrm/flux-sched:jammy

# ubuntu:noble added a new 'ubuntu' user in the container.
# Get rid of it!
# RUN userdel -r ubuntu

USER root

ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && \
apt-get install -y --no-install-recommends \
adduser \
vim \
nano \
emacs \
build-essential \
cmake \
python3 \
python3-dev \
python3-pip \
python3-venv \
git \
util-linux \
less \
htop \
zip \
unzip \
# NOTE: the flux-sched image already pulls and builds MPICH 4.2.2
# WITHOUT PMIx support (this is important because PMIx is a pain, and
# requires extra setup with Flux).
# openmpi-bin \
# openmpi-common \
# libopenmpi-dev \
&& rm -rf /var/lib/apt/lists/*

SHELL [ "/bin/bash", "-c" ]

RUN python3 -m venv /opt/global_py_venv

RUN . /opt/global_py_venv/bin/activate && \
python3 -m pip install pybind11

ENV CALI_INSTALL_PREFIX=/usr \
GIT_CLONE_STAGING_DIR=/tmp

RUN git clone https://github.com/LLNL/Caliper.git ${GIT_CLONE_STAGING_DIR}/Caliper && \
cd ${GIT_CLONE_STAGING_DIR}/Caliper && \
git fetch origin && \
git checkout v2.12.1 && \
git submodule update --init --recursive && \
git clone https://github.com/LLNL/Adiak.git ${GIT_CLONE_STAGING_DIR}/Adiak && \
cd ${GIT_CLONE_STAGING_DIR}/Adiak && \
git fetch origin && \
git checkout v0.4.1 && \
git submodule update --init --recursive

RUN cd ${GIT_CLONE_STAGING_DIR}/Adiak && \
mkdir build && \
cd build && \
cmake \
-DENABLE_MPI=ON \
-DCMAKE_C_COMPILER=$(which gcc) \
-DCMAKE_CXX_COMPILER=$(which g++) \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=${CALI_INSTALL_PREFIX} \
.. && \
make -j 4 && \
make install

RUN . /opt/global_py_venv/bin/activate && \
cd ${GIT_CLONE_STAGING_DIR}/Caliper && \
mkdir build && \
cd build && \
cmake \
-DWITH_TOOLS=ON \
-DWITH_MPI=ON \
-DWITH_ADIAK=ON \
-DWITH_PYTHON_BINDINGS=ON \
-Dpybind11_DIR=$(pybind11-config --cmakedir) \
-DCMAKE_PREFIX_PATH=${CALI_INSTALL_PREFIX} \
-DCMAKE_C_COMPILER=$(which gcc) \
-DCMAKE_CXX_COMPILER=$(which g++) \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=${CALI_INSTALL_PREFIX} \
.. && \
make -j 4 && \
make install

RUN rm -rf ${GIT_CLONE_STAGING_DIR}/Caliper && rm -rf ${GIT_CLONE_STAGING_DIR}/Adiak

ENV NB_USER=jovyan \
NB_UID=1000 \
HOME=/home/jovyan

RUN adduser \
--disabled-password \
--gecos "Default user" \
--uid ${NB_UID} \
--home ${HOME} \
--force-badname \
${NB_USER}

# NOTE: the next two lines grant passwordless sudo to ${NB_USER}.
# They should NEVER be left uncommented by the time we push to GHCR.
RUN adduser ${NB_USER} sudo
RUN echo '%sudo ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers

RUN chmod -R 777 ~/ ${HOME}

ENV SHELL=/usr/bin/bash

RUN mkdir -p ${HOME}/.local/share && \
chmod 777 ${HOME}/.local/share

RUN echo $(flux env)

RUN echo 'export PATH=/usr/bin:$PATH' >> ${HOME}/.bashrc && \
echo '. /opt/global_py_venv/bin/activate' >> ${HOME}/.bashrc && \
echo 'export LD_LIBRARY_PATH=/usr/lib:/usr/lib64:$LD_LIBRARY_PATH' >> ${HOME}/.bashrc

RUN echo 'export PATH=/usr/bin:$PATH' >> ${HOME}/.bash_profile && \
echo '. /opt/global_py_venv/bin/activate' >> ${HOME}/.bash_profile && \
echo 'export LD_LIBRARY_PATH=/usr/lib:/usr/lib64:$LD_LIBRARY_PATH' >> ${HOME}/.bash_profile

USER ${NB_USER}
WORKDIR ${HOME}
3 changes: 3 additions & 0 deletions 2025-eScience/docker/Dockerfile.hub
@@ -0,0 +1,3 @@
FROM jupyterhub/k8s-hub:4.2.0

ENV JUPYTERHUB_XSRF_ANONYMOUS_IP_CIDRS="0.0.0.0/0"
22 changes: 22 additions & 0 deletions 2025-eScience/docker/Dockerfile.init
@@ -0,0 +1,22 @@
# Copyright 2025 Lawrence Livermore National Security, LLC and other
# Benchpark developers. See the top-level COPYRIGHT file for details.
#
# SPDX-License-Identifier: Apache-2.0

FROM alpine/git

USER root

ENV NB_USER=jovyan \
NB_UID=1000 \
HOME=/home/jovyan

RUN adduser \
-D \
-g "Default user" \
-u ${NB_UID} \
-h ${HOME} \
${NB_USER}

COPY ./docker/init-entrypoint.sh /entrypoint.sh
RUN chmod 777 /entrypoint.sh