Update job polling mechanism #115

Open
@charles-cowart

Polling job state with squeue rather than sacct would be preferred (Jeff): squeue is more accurate and reflects state changes sooner than sacct, which can take up to ten minutes to update.

https://hpc-unibe-ch.github.io/slurm/monitoring-jobs.html

Job status remains visible in squeue for five minutes after a job exits (Jeff), so as long as we poll more often than that, catching a job's completion or failure shouldn't be a problem.
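A rough sketch of what a squeue-based poll could look like, not the actual Job._system() code. The function names `parse_squeue_state` and `query_job_state` are hypothetical, and the sketch assumes a single (non-array) job ID. It passes `--states=all` so that finished jobs are still reported while squeue retains them, and it returns `None` when squeue no longer knows the job.

```python
import subprocess


def parse_squeue_state(output):
    """Return the first job state found in `squeue --noheader --format=%T` output.

    Returns None when the output contains no state (job unknown to squeue).
    """
    for line in output.splitlines():
        tokens = line.split()
        if tokens:
            return tokens[0].upper()
    return None


def query_job_state(job_id, runner=subprocess.run):
    """Ask squeue for the extended state name (e.g. RUNNING) of one job.

    `runner` is injectable so the function can be exercised without Slurm;
    it is a hypothetical convenience, not part of any existing API.
    """
    cmd = [
        "squeue",
        "--noheader",       # suppress the header row
        "--states=all",     # include finished jobs still held by squeue
        "--jobs", str(job_id),
        "--format=%T",      # %T prints the job state in extended form
    ]
    result = runner(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # squeue exits non-zero when the job ID is invalid or has aged out
        return None
    return parse_squeue_state(result.stdout)
```

A `None` result would be the point at which a caller might fall back to sacct, since the job has presumably left squeue's retention window.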

Also, we should adjust the polling mechanism (Job._system()) to handle job states that SPP doesn't currently check for, such as OUT_OF_MEMORY. See:

https://slurm.schedmd.com/squeue.html#SECTION_JOB-STATE-CODES
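One possible way to cover the additional states is to sort the state names from the page above into success, failure, and still-active buckets. This is only a sketch: `classify_state` and the two sets are hypothetical names, and which states count as failures (PREEMPTED, for example) is an assumption that should be checked against how our jobs are submitted.

```python
# Extended job state names as listed in the squeue JOB STATE CODES section.
SUCCESS_STATES = {"COMPLETED"}

# Assumption: these terminal states should all be reported as errors.
FAILURE_STATES = {
    "BOOT_FAIL",
    "CANCELLED",
    "DEADLINE",
    "FAILED",
    "NODE_FAIL",
    "OUT_OF_MEMORY",
    "PREEMPTED",
    "TIMEOUT",
}


def classify_state(state):
    """Map a Slurm state name to 'success', 'failure', 'active', or 'unknown'."""
    if state is None:
        return "unknown"
    tokens = state.split()
    if not tokens:
        return "unknown"
    # Use only the first token so strings like "CANCELLED by 1234" still match.
    name = tokens[0].upper()
    if name in SUCCESS_STATES:
        return "success"
    if name in FAILURE_STATES:
        return "failure"
    # PENDING, RUNNING, COMPLETING, SUSPENDED, etc.: keep polling.
    return "active"
```

Treating every unlisted state as "active" keeps the poll loop running for states it does not recognize; a timeout on the loop would still be needed to avoid waiting forever.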
