# Deploy hpdc-2025-c7i-24xlarge to AWS Elastic Kubernetes Service (EKS)

These config files and scripts can be used to deploy the hpdc-2025-c7i-24xlarge tutorial to EKS.

The sections below walk you through the steps to deploy your cluster. All commands in these
sections should be run from the same directory as this README.
## Step 1: Create EKS cluster

To create an EKS cluster with your configured settings, run the following:

```bash
$ ./create_cluster.sh
```

Be aware that this step can take 15-30 minutes to complete.
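
For reference, `create_cluster.sh` is most likely a thin wrapper around `eksctl`. A minimal sketch of the underlying command, assuming the cluster settings live in the `eksctl-config.yaml` file referenced in step 4 (check the script for the actual invocation):

```bash
# Sketch: create the EKS cluster from the eksctl config file.
# Assumes eksctl is installed and AWS credentials are configured.
eksctl create cluster -f eksctl-config.yaml
```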
| 17 | + |
## Step 2: Configure Kubernetes within the EKS cluster

After creating the cluster, we need to configure Kubernetes and its addons. In particular,
we need to set up the Kubernetes autoscaler, which allows our tutorial to scale to as
many users as the cluster's resources can handle.

To configure Kubernetes and the autoscaler, run the following:

```bash
$ ./configure_kubernetes.sh
```
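
For reference, this configuration likely boils down to applying the manifests shipped alongside this README. A hedged sketch, assuming the `cluster-autoscaler.yaml` and `storage-class.yaml` files mentioned in step 4 are the manifests in question:

```bash
# Sketch: apply the autoscaler and storage-class manifests.
# Assumes kubectl is pointed at the newly created EKS cluster.
kubectl apply -f cluster-autoscaler.yaml
kubectl apply -f storage-class.yaml
```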
| 29 | + |
## Step 3: Deploy JupyterHub to the EKS cluster

With the cluster created and configured, we can now deploy JupyterHub to the cluster
to manage everything else about our tutorial.

To deploy JupyterHub, run the following:

```bash
$ ./deploy_jupyterhub.sh
```
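
For reference, `deploy_jupyterhub.sh` presumably installs the JupyterHub Helm chart using the settings in `helm-config.yaml`. A minimal sketch of what that looks like; the release name and `jhub` namespace here are assumptions, so check the script for the actual values:

```bash
# Sketch: install the JupyterHub Helm chart with our helm-config.yaml.
# Release name "jupyterhub" and namespace "jhub" are assumptions.
helm repo add jupyterhub https://hub.jupyter.org/helm-chart/
helm repo update
helm upgrade --install jupyterhub jupyterhub/jupyterhub \
  --namespace jhub --create-namespace \
  --values helm-config.yaml
```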
| 40 | + |
## Step 4: Verify that everything is working

After deploying JupyterHub, we need to make sure that all the necessary components
are working properly.

To check this, run the following:

```bash
$ ./check_jupyterhub_status.sh
```

If everything worked properly, you should see output like this:

```
NAME                              READY   STATUS    RESTARTS   AGE
continuous-image-puller-2gqrw     1/1     Running   0          30s
continuous-image-puller-gb7mj     1/1     Running   0          30s
hub-8446c9d589-vgjlw              1/1     Running   0          30s
proxy-7d98df9f7-s5gft             1/1     Running   0          30s
user-scheduler-668ff95ccf-fw6wv   1/1     Running   0          30s
user-scheduler-668ff95ccf-wq5xp   1/1     Running   0          30s
```

Be aware that the hub pod (i.e., hub-8446c9d589-vgjlw above) may take a minute or so to start.
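
Rather than re-running the status script by hand, you can also block until the hub pod is ready. A sketch, assuming the deployment uses the standard JupyterHub chart labels and your kubectl context's default namespace (add `-n <namespace>` if your scripts deploy elsewhere):

```bash
# Sketch: wait up to two minutes for the hub pod to become Ready.
kubectl wait pod -l component=hub --for=condition=Ready --timeout=120s
```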
| 65 | + |
If something went wrong, you will have to edit the config YAML files to get things working. Before
trying to work things out yourself, check the FAQ to see if your issue has already been addressed.

Depending on which file you edited, you may have to run different commands to update the EKS cluster and
the JupyterHub deployment. Follow the steps below to update:
1. If you only edited `helm-config.yaml`, try to update just the JupyterHub deployment by running `./update_jupyterhub_deployment.sh`
2. If step 1 failed, fully tear down the JupyterHub deployment with `./tear_down_jupyterhub.sh` and then re-deploy it with `./deploy_jupyterhub.sh`
3. If you edited `cluster-autoscaler.yaml` or `storage-class.yaml`, tear down the JupyterHub deployment with `./tear_down_jupyterhub.sh`. Then, reconfigure Kubernetes with `./configure_kubernetes.sh`, and re-deploy JupyterHub with `./deploy_jupyterhub.sh`
4. If you edited `eksctl-config.yaml`, fully tear down the cluster with `./cleanup.sh`, and then restart from the top of this README
| 75 | + |
## Step 5: Get the public cluster URL

Now that everything's ready to go, we need to get the public URL to the cluster.

To do this, run the following:

```bash
$ ./get_jupyterhub_url.sh
```

Note that it can take several minutes after the URL is available for it to actually redirect
to JupyterHub.
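
For reference, `get_jupyterhub_url.sh` most likely reads the external hostname off the chart's `proxy-public` load-balancer service. As a self-contained illustration (the service name and the sample output below are assumptions based on the standard JupyterHub chart), here is how that hostname can be pulled out of `kubectl get svc`-style output:

```bash
# Simulated output of `kubectl get svc proxy-public`; the real script
# would capture this from the live cluster instead.
svc_output='NAME           TYPE           CLUSTER-IP      EXTERNAL-IP                          PORT(S)        AGE
proxy-public   LoadBalancer   10.100.23.180   abc123.us-east-1.elb.amazonaws.com   80:31234/TCP   5m'

# Extract the EXTERNAL-IP column of the proxy-public row and build the URL.
hostname=$(echo "$svc_output" | awk '$1 == "proxy-public" { print $4 }')
echo "http://$hostname"   # → http://abc123.us-east-1.elb.amazonaws.com
```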
| 88 | + |
## Step 6: Distribute URL and password to attendees

Now that we have our public URL, we can give the attendees everything they need to join the tutorial.

For attendees to access JupyterHub, they simply need to enter the public URL (from step 5) in their browser of choice.
This will take them to a login page. The login credentials are as follows:
* Username: anything the attendee wants (note: this should be unique for every user; otherwise, users will share pods)
* Password: the password specified towards the top of `helm-config.yaml`

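For context, a shared password like this typically comes from JupyterHub's dummy authenticator. The relevant fragment of a Zero to JupyterHub-style config looks roughly like this; the exact layout of your `helm-config.yaml` may differ, and the password value here is a placeholder:

```yaml
hub:
  config:
    JupyterHub:
      authenticator_class: dummy
    DummyAuthenticator:
      password: your-shared-password  # the value attendees type at login
```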
Once the attendees log in with these credentials, the Kubernetes autoscaler will spin up a pod for them (and grab new
resources, if needed). This pod will contain a JupyterLab instance with the tutorial materials and environment already
prepared for them.

At this point, you can start presenting your interactive tutorial!
| 103 | + |
## Step 7: Clean up everything

Once you are done with your tutorial, you should clean everything up so that your AWS account does not keep
accruing unnecessary charges. To do this, simply run the following:

```bash
$ ./cleanup.sh
```

Afterwards, you can verify that everything has been cleaned up by going to the AWS web console
and ensuring nothing from your tutorial still exists in CloudFormation and EKS.
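
This check can also be done from the command line. A hedged sketch, assuming the AWS CLI is installed and configured for the same account and region as the tutorial:

```bash
# Sketch: confirm no leftover EKS clusters or CloudFormation stacks remain.
aws eks list-clusters
aws cloudformation list-stacks --stack-status-filter CREATE_COMPLETE UPDATE_COMPLETE
```

Anything from the tutorial that still appears in either listing should be deleted by hand from the web console.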