website/software_install.md
which will launch Julia with as many threads as there are cores on your machine.
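As a quick sanity check that multithreading is active, one can query the thread count from the REPL (a minimal sketch; the exact launch flag, e.g. `-t auto`, depends on how you started Julia):

```julia
# Number of threads Julia was launched with; matches the core count
# only if Julia was started with e.g. `julia -t auto`.
Threads.nthreads()
```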
### Julia on GPUs

The [CUDA.jl](https://github.com/JuliaGPU/CUDA.jl) module permits launching compute kernels on Nvidia GPUs natively from within Julia. [JuliaGPU](https://juliagpu.org) provides further reading and [introductory material](https://juliagpu.gitlab.io/CUDA.jl/tutorials/introduction/) about GPU ecosystems within Julia.

### Julia MPI

The following steps permit you to install [MPI.jl](https://github.com/JuliaParallel/MPI.jl) on your machine and test it:

1. If Julia MPI is a dependency of a Julia project, MPI.jl should have been added upon executing the `instantiate` command from within the package manager ([see here](#package_manager)). If not, MPI.jl can be added from within the package manager (typing `add MPI` in package mode).
and add `-host localhost` to the execution script:

```sh
$ mpiexecjl -n 4 -host localhost julia --project ./hello_mpi.jl
```
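If the `mpiexecjl` wrapper is not yet on your `PATH`, it can typically be installed via MPI.jl (a sketch; the default destination directory may differ on your system):

```julia
using MPI

# Installs the mpiexecjl wrapper script, by default into ~/.julia/bin;
# make sure that directory is on your PATH afterwards.
MPI.install_mpiexecjl()
```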
}
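The `hello_mpi.jl` script referenced above is not reproduced in this excerpt; a hypothetical minimal version (assuming a standard MPI.jl hello world, not necessarily the course's exact script) could look like:

```julia
using MPI

MPI.Init()
comm = MPI.COMM_WORLD
# Each rank prints its id and the total number of ranks
println("Hello from rank $(MPI.Comm_rank(comm)) out of $(MPI.Comm_size(comm))")
MPI.Finalize()
```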
<!--For running Julia at scale on Piz Daint, refer to the [Julia MPI GPU on Piz Daint](#julia_mpi_gpu_on_piz_daint) section.-->

## GPU computing on Piz Daint
```sh
ssh <username>@ela.cscs.ch
```

3. Generate an `ed25519` keypair as described in the [CSCS user website](https://user.cscs.ch/access/auth/#generating-ssh-keys-if-not-required-to-provide-a-2nd-factor). On your local machine (not ela), run `ssh-keygen`, leaving the passphrase empty. Then copy your public key to the remote server (ela) using `ssh-copy-id`. Alternatively, you can copy the keys manually as described in the [CSCS user website](https://user.cscs.ch/access/auth/#generating-ssh-keys-if-not-required-to-provide-a-2nd-factor).
4. Edit your ssh config file located in `~/.ssh/config` and add the following entries to it, making sure to replace `<username>` and the key file with the correct names, if needed:
```sh
Host daint-xc
HostName daint.cscs.ch
User <username>
IdentityFile ~/.ssh/id_ed25519
ProxyJump <username>@ela.cscs.ch
AddKeysToAgent yes
ForwardAgent yes
```
5. Now you should be able to perform password-less login to daint as follows:
```sh
ssh daint-xc
```
Moreover, you will get the Julia related modules loaded as we add the `RemoteCommand`
```sh
ln -s $SCRATCH scratch
```
}

Make sure to remove any folders you may find in your scratch, as those are empty leftovers from last year's course.

### Setting up Julia on Piz Daint

The Julia setup on Piz Daint is handled by [JUHPC](https://github.com/JuliaParallel/JUHPC). Everything should be ready for use; the only remaining step is to activate the environment each time before launching Julia. In addition, **only the first time**, `juliaup` needs to be installed (these steps are explained hereafter).

### Running Julia interactively on Piz Daint

To access a GPU on Piz Daint:

1. Open a terminal (other than from within VS code) and login to daint:
```sh
ssh daint-xc
```

2. The next step is to secure an allocation using `salloc`, a functionality provided by the SLURM scheduler. Use the `salloc` command to allocate one node (`N1`) and one process (`n1`) on the GPU partition (`-C'gpu'`) on the project `class04` for 1 hour:
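The `salloc` invocation itself is elided in this excerpt; based on the parameters described above, a sketch could be (flags inferred from the surrounding text, not verified against the course material):

```sh
salloc -C'gpu' -A class04 -N1 -n1 --time=01:00:00
```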
👉 *Running a **remote job** instead? [Jump right there](#running_a_remote_job_on_piz_daint)*

3. Once you have your allocation (`salloc`) and the node, you can access the compute node as follows:
   - In the command bar of VS code (`cmd + shift + P` on macOS, `ctrl + shift + P` on Windows), type `Remote-SSH: Connect to Host...`. Accept what should be accepted and continue. Then type in the node id (node number) as obtained in the previous step (here `nid02145`). Upon hitting enter, you should be on the node with the Julia environment loaded.

4. Then, to "activate" the Julia configuration previously prepared, enter the following (do not miss the first dot `.`):
```sh
. $SCRATCH/../julia/daint-gpu-nocudaaware/activate
```
This will activate the artifact-based config for CUDA.jl, which works more smoothly on the rather old Nvidia P100 GPUs. The caveat is that it does not allow for CUDA-aware MPI. There is also a CUDA-aware `daint-gpu` configuration one could try out at a later stage, but it may not be totally stable.
5. Then, **only the first time**, you need to install Julia using the [`juliaup`](https://github.com/JuliaLang/juliaup) command:
```sh
juliaup
```
This will install the latest Julia, with JUHPC calling into juliaup.

6. Next, go to the scratch and create a temporary test dir:
```sh
cd $SCRATCH
mkdir tmp-test
cd tmp-test
touch Project.toml
```

7. You should then be able to launch Julia in the `tmp-test` project environment:
```sh
julia --project=.
```

8. Within Julia, enter the package mode, check the status, and add any package you'd like to be part of `tmp-test`. Let's add `CUDA` and `MPI` here, as these two packages will be used the most in the course.
```julia-repl
julia> ]

(tmp-test) pkg> st
Installing known registries into `/scratch/snx3000/class230/../julia/class230/daint-gpu-nocudaaware/juliaup/depot`
Added `General` registry to /scratch/snx3000/class230/../julia/class230/daint-gpu-nocudaaware/juliaup/depot/registries
Status `/scratch/snx3000/class230/tmp-test/Project.toml` (empty project)

(tmp-test) pkg> add CUDA, MPI
```
9. Then load CUDA and query the version info:
```julia-repl
julia> using CUDA

julia> CUDA.versioninfo()
CUDA runtime 11.8, artifact installation
CUDA driver 12.6
NVIDIA driver 470.57.2

#[skipped lines]

Preferences:
- CUDA_Runtime_jll.version: 11.8
- CUDA_Runtime_jll.local: false

1 device:
  0: Tesla P100-PCIE-16GB (sm_60, 15.897 GiB / 15.899 GiB available)
```
10. Try out your first calculation on the P100 GPU:
```julia-repl
julia> a = CUDA.ones(3,4);

#[skipped lines]

julia> c .= a .+ b
```

If you made it to here, you're all set 🚀
\warn{There is no interactive visualisation on daint. Make sure to produce `png`s or `gif`s. Also, to avoid plotting failures, make sure to set `ENV["GKSwstype"]="nul"` in the code. It may also be good practice to define the animation directory explicitly, to avoid filling a `tmp`.}
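One possible way to set up a dedicated animation directory is sketched below (a hypothetical example using Plots.jl conventions; the course material may use a different pattern):

```julia
using Plots

ENV["GKSwstype"] = "nul"  # headless GKS backend, avoids plotting failures on daint

# Keep animation frames in a project-local folder instead of a tmp directory
anim_dir = mkpath(joinpath(@__DIR__, "anims"))
anim = Animation(anim_dir, String[])
```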
\note{You can also use VS code's integrated terminal to launch Julia on daint. However, you can't use the Julia extension and would have to use `srun -n1 --pty /bin/bash -l` and activate the environment.}

### Running a remote job on Piz Daint

If you do not want to use an interactive session, you can use the `sbatch` command to launch a job remotely on the machine. Example of a `submit.sh` you can launch (without needing an allocation) as `sbatch submit.sh`:
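The actual `submit.sh` listing is elided in this excerpt; a hypothetical minimal version is sketched below (the account `class04` and the activate path are taken from the text above; job name, time limit, and `my_script.jl` are assumptions):

```sh
#!/bin/bash -l
#SBATCH --job-name=julia_gpu
#SBATCH --output=julia_gpu.%j.o
#SBATCH --time=00:30:00
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --constraint=gpu
#SBATCH --account=class04

# activate the JUHPC-provided Julia environment, then run the script
. $SCRATCH/../julia/daint-gpu-nocudaaware/activate
srun julia --project my_script.jl
```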
For convenience it is suggested to also symlink to the home directory: `ln -s ~/mnt/daint/users/<your username on daint> ~/mnt/daint_home`. (Note that we mount the root directory `/` with `sshfs` such that access to `/scratch` is possible.)
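The mount command itself is not shown here; with the `daint-xc` ssh config entry from above (which `sshfs` will pick up), a hypothetical invocation could be:

```sh
mkdir -p ~/mnt/daint
sshfs daint-xc:/ ~/mnt/daint    # mount the remote root so /scratch is reachable
```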
<!--
### Julia MPI GPU on Piz Daint

The following steps should allow you to run distributed memory parallelisation applications on multiple GPU nodes on Piz Daint.

1. Make sure to have the Julia GPU environment loaded