How to submit a job that uses TensorFlow

This tutorial shows how to submit a job that uses TensorFlow to the batch cluster.

The following steps allow the user to execute a Python script that uses TensorFlow and other Python libraries.

Copy the project folder to the cluster

[user@fedora ~]$ scp -r -J user@fw03 /home/user/my_project/ user@cirrus01:

Access the cluster

[user@fedora ~]$ ssh user@cirrus01

Clone the reference repository

[user@cirrus01]$ git clone https://gitlab.com/lip-computing/computing/tf_run_job.git

Submit the job with the Python script from inside the project folder. In this example, the datasets are in the my_datasets subfolder.

[user@cirrus01]$ cd my_project 
[user@cirrus01 my_project]$ sbatch ~/tf_run_job/run_job --input my_python_script.py --file my_datasets/dataset1.csv my_datasets/dataset2.csv
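
When a submission is accepted, sbatch replies with a line of the form "Submitted batch job <id>", and by default Slurm writes the console log to slurm-<id>.out in the directory the job was submitted from. The following is a minimal sketch for deriving the log file name from that reply, assuming the default reply format and the default output file name are in use on this cluster. The helper names job_id_from_sbatch and log_file_for_job are hypothetical and are not part of tf_run_job.

```shell
# Hypothetical helpers, not part of tf_run_job.

# Extract the numeric job ID from sbatch's reply,
# e.g. "Submitted batch job 124811". Prints nothing if the reply does not match.
job_id_from_sbatch() {
    printf '%s\n' "$1" | sed -n 's/^Submitted batch job \([0-9][0-9]*\).*$/\1/p'
}

# Build the default Slurm log file name for a job ID
# (assumption: the job does not set a custom --output pattern).
log_file_for_job() {
    printf 'slurm-%s.out\n' "$1"
}

# Example with the reply held in a variable:
reply="Submitted batch job 124811"
job_id=$(job_id_from_sbatch "$reply")
log_file_for_job "$job_id"
```

On the cluster, the reply could be captured with reply=$(sbatch ...) using the same arguments as above, and squeue -j "$job_id" would then show whether the job is still pending or running.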

Once the job is completed, the console log with the program messages is available in the slurm-<job ID>.out file. Its last line reports the folder in the user's home directory where the results were written.

[user@cirrus01 my_project]$ cat slurm-124811.out 
* ----------------------------------------------------------------
* Running PROLOG for run_job on Tue Nov 17 17:22:01 WET 2020
*    PARTITION               : gpu
*    JOB_NAME                : run_job
*    JOB_ID                  : 124811
*    USER                    : user
*    NODE_LIST               : hpc050
*    SLURM_NNODES            : 1
*    SLURM_NPROCS            : 
*    SLURM_NTASKS            : 
*    SLURM_JOB_CPUS_PER_NODE : 1
*    WORK_DIR                : /users/hpc/user/my_project
* ----------------------------------------------------------------
Info: deleting container: 61fb9513-b33d-3b7f-85ed-25db26202b61
7f5d9200-712f-3134-a470-defdffb21e81
Warning: non-existing user will be created
 
 ############################################################################## 
 #                                                                            # 
 #               STARTING 7f5d9200-712f-3134-a470-defdffb21e81                # 
 #                                                                            # 
 ############################################################################## 
 executing: bash
Results available on workdir: /home/hpc/user/Job.ZlV3RW
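
The results folder can also be read from the log with a short script instead of by eye. This is a minimal sketch, assuming the log ends with a "Results available on workdir:" line exactly as in the example above. The helper name results_dir_from_log is hypothetical, and the temporary file only stands in for a real slurm-<job ID>.out file.

```shell
# Hypothetical helper: print the results directory reported in a job log.
# If the line appears more than once, the last occurrence is used.
results_dir_from_log() {
    sed -n 's/^Results available on workdir: //p' "$1" | tail -n 1
}

# Example using a small stand-in log file:
log=$(mktemp)
printf '%s\n' ' executing: bash' \
    'Results available on workdir: /home/hpc/user/Job.ZlV3RW' > "$log"
results_dir_from_log "$log"
rm -f "$log"
```

With a real log, results_dir_from_log slurm-124811.out would print the path to list or copy results from.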

For additional support with this procedure, or to use different requirements for the provided TensorFlow Docker image, contact helpdesk@incd.pt.