How to submit a job that uses TensorFlow

With this tutorial you will be able to submit a job that uses TensorFlow to the batch cluster.

The following steps allow the user to execute a Python script that uses TensorFlow and other Python libraries.

Copy the project folder to the cluster

[mcastro@fedora ~]$ scp -r -J mcastro@fw03 /home/mcastro/my_project/ mcastro@cirrus01:

Access the cluster

[mcastro@fedora ~]$ ssh mcastro@cirrus01

Clone the reference repository

[mcastro@cirrus01]$ git clone https://gitlab.com/lip-computing/computing/tf_run_job.git

Submit the job with the Python script inside the project folder. In this example, the datasets are in the my_datasets subfolder.

[mcastro@cirrus01]$ cd my_project
[mcastro@cirrus01 my_project]$ sbatch ~/tf_run_job/run_job --input ~/my_project/my_python_script.py --file ~/my_project/my_datasets/dataset1.csv ~/my_project/my_datasets/dataset2.csv
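
On a successful submission, sbatch normally prints a confirmation line of the form "Submitted batch job <id>", and by default Slurm writes the job's console output to slurm-<id>.out in the directory the job was submitted from. The sketch below shows one way to derive the log file name from that confirmation line. It assumes run_job does not override Slurm's default output file, and the function names are hypothetical helpers, not part of tf_run_job.

```python
import re


def job_id_from_sbatch(output):
    """Return the job ID found in sbatch's confirmation text, or None."""
    match = re.search(r"Submitted batch job (\d+)", output)
    return match.group(1) if match else None


def default_log_name(job_id):
    """Build Slurm's default output file name for a batch job.

    Assumption: the batch script does not set its own output file.
    """
    return f"slurm-{job_id}.out"


# Example using the job ID that appears in this tutorial's log.
job_id = job_id_from_sbatch("Submitted batch job 124811")
print(default_log_name(job_id))
```

With the job ID from this example, the resulting name is slurm-124811.out.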

Once the job is completed, the console log with program messages is written to a folder in the user's home directory.

[mcastro@cirrus01 my_project]$ cat slurm-124811.out
* ----------------------------------------------------------------
* Running PROLOG for run_job on Tue Nov 17 17:22:01 WET 2020
* PARTITION : gpu
* JOB_NAME : run_job
* JOB_ID : 124811
* USER : mcastro
* NODE_LIST : hpc050
* SLURM_NNODES : 1
* SLURM_NPROCS :
* SLURM_NTASKS :
* SLURM_JOB_CPUS_PER_NODE : 1
* WORK_DIR : /users/hpc/mcastro/my_project
* ----------------------------------------------------------------
Info: deleting container: 61fb9513-b33d-3b7f-85ed-25db26202b61
7f5d9200-712f-3134-a470-defdffb21e81
Warning: non-existing user will be created
##############################################################################
#                                                                            #
#               STARTING 7f5d9200-712f-3134-a470-defdffb21e81                #
#                                                                            #
##############################################################################
executing: bash
Results available on workdir: /home/hpc/mcastro/Job.ZlV3RW
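
The log ends with the directory where the results were written, and its PROLOG header lists the partition, job ID and node. As a rough sketch, assuming the log has one field per line in the "* NAME : value" form shown above, a small Python helper could pull these values out of the file. The function name parse_job_log and the sample text are illustrative only.

```python
import re

# Matches PROLOG lines such as "* PARTITION : gpu"; the value may be empty.
PROLOG_FIELD = re.compile(r"^\*\s+([A-Z_]+)\s*:\s*(.*)$")
# Matches the line that reports where the job's results were written.
RESULTS_LINE = re.compile(r"^Results available on workdir:\s*(\S+)")


def parse_job_log(text):
    """Return (prolog_fields, results_dir) extracted from a job log."""
    fields = {}
    results_dir = None
    for raw in text.splitlines():
        line = raw.strip()
        field = PROLOG_FIELD.match(line)
        if field:
            fields[field.group(1)] = field.group(2).strip()
            continue
        result = RESULTS_LINE.match(line)
        if result:
            results_dir = result.group(1)
    return fields, results_dir


# A shortened copy of the log from this tutorial, used as sample input.
SAMPLE_LOG = """\
* Running PROLOG for run_job on Tue Nov 17 17:22:01 WET 2020
* PARTITION : gpu
* JOB_ID : 124811
* SLURM_NPROCS :
* WORK_DIR : /users/hpc/mcastro/my_project
Results available on workdir: /home/hpc/mcastro/Job.ZlV3RW
"""

fields, results_dir = parse_job_log(SAMPLE_LOG)
print(fields["PARTITION"], fields["JOB_ID"], results_dir)
```

To use it on a real log, pass the contents of the file, for example the text read from slurm-124811.out.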

For additional support with this procedure, or to use different requirements for the provided TensorFlow Docker image, contact helpdesk@incd.pt.