Job pipeline using Slurm dependencies

Sometimes we need to launch a series of jobs that run in sequence, one after another. For this we use the sbatch option --dependency. This page presents only simple examples; see the sbatch manual page for more details.

Simple example

Suppose we need to submit the script my_first_job.sh and then my_second_job.sh, which should run after the first one:

[user@cirrus01 ~]$ sbatch my_first_job.sh
Submitted batch job 1843928

[user@cirrus01 ~]$ sbatch --dependency=afterany:1843928 my_second_job.sh
Submitted batch job 1843921

[user@cirrus01 ~]$ squeue
JOBID   PARTITION              NAME USER ST TIME NODES NODELIST(REASON) 
1843928       hpc   my_first_job.sh user  R 0:11     1 hpc046
1843921       hpc  my_second_job.sh user PD 0:00     1 hpc047

In this case the second job stays pending until the first job terminates, and then runs even if the first job failed for some reason. The afterany type waits for the first job to terminate, whatever its exit status; the plain after type only waits for the first job to start.
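Copying the job ID by hand becomes tedious in scripts. The following is a minimal sketch of how the two submissions could be automated; the function name submit_pair is hypothetical, and the sketch assumes the sbatch option --parsable, which prints only the job ID (followed by ";cluster" on multi-cluster setups). It uses the afterany dependency type, which waits for the first job to terminate regardless of its exit status.

```shell
#!/bin/bash
# Hypothetical helper: submit two scripts so that the second one waits
# for the first one to terminate (successfully or not).
# Prints the two job IDs separated by a space.
submit_pair() {
    local first_script=$1 second_script=$2
    local first_id second_id

    # --parsable makes sbatch print just the job ID; on multi-cluster
    # setups it prints "jobid;cluster", so strip everything after ";".
    first_id=$(sbatch --parsable "$first_script") || return 1
    first_id=${first_id%%;*}

    second_id=$(sbatch --parsable --dependency=afterany:"$first_id" "$second_script") || return 1
    second_id=${second_id%%;*}

    echo "$first_id $second_id"
}
```

Once the function is defined, it could be called as: submit_pair my_first_job.sh my_second_job.sh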

Typical example

In a real case we may need to ensure that the first job terminated successfully. For example, the first job may produce an output file that the second job needs as input:

[user@cirrus01 ~]$ sbatch my_first_job.sh
Submitted batch job 1843922

[user@cirrus01 ~]$ sbatch --dependency=afterok:1843922 my_second_job.sh
Submitted batch job 1843923

The afterok dependency type states that the second job starts only if the first job terminates with no errors, that is, with an exit code of zero.
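The same idea extends to longer pipelines. As a sketch only, the hypothetical function submit_chain below submits any number of scripts so that each one depends on the previous one with afterok; it again assumes the --parsable option of sbatch to capture job IDs.

```shell
#!/bin/bash
# Hypothetical helper: submit the given scripts as a chain in which each
# job starts only if the previous one finished with exit code zero.
# Prints one job ID per line, in submission order.
submit_chain() {
    local prev_id="" job_id script
    for script in "$@"; do
        if [ -z "$prev_id" ]; then
            # First job of the chain: no dependency.
            job_id=$(sbatch --parsable "$script") || return 1
        else
            # Later jobs wait for the successful end of the previous job.
            job_id=$(sbatch --parsable --dependency=afterok:"$prev_id" "$script") || return 1
        fi
        # Strip the optional ";cluster" suffix printed by --parsable.
        prev_id=${job_id%%;*}
        echo "$prev_id"
    done
}
```

It could be called as: submit_chain my_first_job.sh my_second_job.sh. If a job in the chain fails, the jobs that depend on it cannot start; depending on the cluster configuration they may remain pending until cancelled with scancel.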

Complex cases

Check the sbatch manual page for more details:

 [user@cirrus01 ~]$ man sbatch

Search for the explanation of the -d, --dependency=<dependency_list> option.
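One case covered by the manual page is a job that depends on several earlier jobs: a dependency type can be followed by more than one job ID, separated by colons, for example afterok:101:102. The sketch below shows one way to assemble such a specification; the function name build_dependency and the job IDs are illustrative assumptions.

```shell
#!/bin/bash
# Hypothetical helper: build a dependency specification such as
# "afterok:101:102:103" from a dependency type and a list of job IDs.
build_dependency() {
    local type=$1
    shift
    local spec=$type id
    for id in "$@"; do
        spec="$spec:$id"
    done
    echo "$spec"
}
```

The result can be passed to sbatch directly, for instance: sbatch --dependency="$(build_dependency afterok 101 102)" my_second_job.sh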


Revision #1
Created Wed, Aug 4, 2021 7:45 PM by João Paulo Martins
Updated Wed, Aug 4, 2021 8:17 PM by João Paulo Martins