Commit 51637a5

Add notes on the priority formula
1 parent 1f541c0 commit 51637a5

1 file changed

+55
-0
lines changed

docs/user-guide/scheduler.md

Lines changed: 55 additions & 0 deletions
@@ -259,6 +259,61 @@ command:
259259
E-mail notifications from the scheduler are not currently available
260260
on ARCHER2.
261261

262+
### Priority
263+
264+
Job priority on ARCHER2 depends on several factors:
265+
266+
- The QoS your job has specified
267+
- The length of time your job has been queuing
268+
- The number of nodes you have requested (job size)
269+
- Your current fairshare factor
270+
271+
Each of these factors is normalised to a value between 0 and 1 and multiplied
272+
by a weight; the resulting values are summed to produce a priority for the job.
273+
The current job priority formula on ARCHER2 is:
274+
275+
```
276+
Priority = [10000 * P(QoS)] + [500 * P(Age)] + [300 * P(Fairshare)] + [100 * P(Size)]
277+
```
278+
279+
The priority factors are:
280+
281+
- P(QoS) - The QoS priority normalised to a value between 0 and 1. The maximum raw
282+
value is 10000 and the minimum is 0. Most QoS on ARCHER2 have a raw priority of 500; the
283+
`lowpriority` QoS has a raw priority of 1.
284+
- P(Age) - The priority based on the job age normalised to a value between 0 and 1.
285+
The maximum raw value is 14 days (where P(Age) = 1).
286+
- P(Fairshare) - The fairshare priority normalised to a value between 0 and 1. Your
287+
fairshare priority is determined by a combination of your budget code fairshare
288+
value and your user fairshare value within that budget code. The more use that
289+
the budget code you are using has made of the system recently relative to other
290+
budget codes on the system, the lower the budget code fairshare value will be; and the more
291+
use you have made of the system recently relative to other users within your
292+
budget code, the lower your user fairshare value will be. The decay half life
293+
for fairshare on ARCHER2 is set to 14 days. [More information on the Slurm fairshare
294+
algorithm](https://slurm.schedmd.com/fair_tree.html).
295+
- P(Size) - The priority based on the job size normalised to a value between 0 and 1.
296+
The maximum size is the total number of ARCHER2 compute nodes.
297+
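As a rough illustration of how the weighted sum behaves, the sketch below evaluates the formula for a single job. The weights, the 14 day age limit and the raw QoS maximum of 10000 come from the text above. Everything else is an assumption made for illustration: the function name `estimate_priority`, the linear normalisation of the age and size factors, and the `total_nodes` argument, which you would need to supply yourself. The fairshare factor is taken as an already normalised input because it cannot be reproduced from the description here. This is not how Slurm computes the value internally.

```python
# Weights copied from the priority formula above.
WEIGHTS = {"qos": 10000, "age": 500, "fairshare": 300, "size": 100}

MAX_QOS_RAW = 10000   # maximum raw QoS priority stated above
MAX_AGE_DAYS = 14     # queue time at which P(Age) reaches 1


def clamp01(value: float) -> float:
    """Restrict a factor to the range 0 to 1."""
    return max(0.0, min(1.0, value))


def estimate_priority(qos_raw: float, age_days: float, fairshare: float,
                      nodes: int, total_nodes: int) -> float:
    """Estimate a job priority from the four factors.

    Assumes linear normalisation for QoS, age and size (an assumption,
    not confirmed by the text) and a fairshare factor already in 0..1.
    """
    factors = {
        "qos": clamp01(qos_raw / MAX_QOS_RAW),
        "age": clamp01(age_days / MAX_AGE_DAYS),
        "fairshare": clamp01(fairshare),
        "size": clamp01(nodes / total_nodes),
    }
    return sum(WEIGHTS[name] * factors[name] for name in WEIGHTS)


if __name__ == "__main__":
    # Hypothetical job: raw QoS priority 500, queued for 7 days,
    # fairshare factor 0.5, 100 nodes on a placeholder 1000 node system.
    print(estimate_priority(500, 7, 0.5, 100, total_nodes=1000))
```

With these hypothetical inputs the four terms are about 500, 250, 150 and 10. Even a modest raw QoS priority outweighs the largest possible age, fairshare and size contributions, which is why the QoS dominates the ordering of queued jobs.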
298+
You can view the priorities for currently queued jobs on the system with the `sprio`
299+
command:
300+
301+
```
302+
auser@ln04:~> sprio -l
303+
JOBID PARTITION PRIORITY SITE AGE FAIRSHARE JOBSIZE QOS
304+
828764 standard 1049 0 45 0 4 1000
305+
828765 standard 1049 0 45 0 4 1000
306+
828770 standard 1049 0 45 0 4 1000
307+
828771 standard 1012 0 8 0 4 1000
308+
828773 standard 1012 0 8 0 4 1000
309+
828791 standard 1012 0 8 0 4 1000
310+
828797 standard 1118 0 115 0 4 1000
311+
828800 standard 1154 0 150 0 4 1000
312+
828801 standard 1154 0 150 0 4 1000
313+
828805 standard 1118 0 115 0 4 1000
314+
828806 standard 1154 0 150 0 4 1000
315+
```
316+
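To relate the `sprio` columns to the priority formula, the following sketch parses output in the layout shown above and compares each job's `PRIORITY` with the sum of its component columns. The helper names `parse_sprio` and `component_sum` are hypothetical, and the parser assumes exactly the whitespace-separated columns in the sample. Output from a live system may include additional or repeated columns, in which case the parsing would need adjusting.

```python
# A few rows copied from the sample output above.
SAMPLE = """\
JOBID PARTITION PRIORITY SITE AGE FAIRSHARE JOBSIZE QOS
828764 standard 1049 0 45 0 4 1000
828771 standard 1012 0 8 0 4 1000
828797 standard 1118 0 115 0 4 1000
828800 standard 1154 0 150 0 4 1000
"""

# Columns that hold the weighted contributions to the priority.
COMPONENTS = ("SITE", "AGE", "FAIRSHARE", "JOBSIZE", "QOS")


def parse_sprio(text: str) -> list[dict[str, str]]:
    """Turn whitespace-separated sprio output into one dict per job."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, row)) for row in rows]


def component_sum(job: dict[str, str]) -> int:
    """Add up the weighted component columns for one job."""
    return sum(int(job[name]) for name in COMPONENTS)


if __name__ == "__main__":
    for job in parse_sprio(SAMPLE):
        reported = int(job["PRIORITY"])
        total = component_sum(job)
        print(job["JOBID"], reported, total, reported - total)
```

For the sample rows the reported priority matches the component sum to within 1. The small difference on one row is presumably because each column is rounded separately for display.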
262317
## Troubleshooting
263318

264319
### Slurm error messages
