Wednesday, March 18, 2015

Submitting Jobs From Within Jobs: qsub

In this section we are going to split the job discussed in section 4.3.4 into four separate jobs. The first job will prepare the GPFS directory and, having finished its task, submit the second job. The second job will generate the data file and then submit the third job. The third job will process the data file and then submit the fourth job, which will clean up and end the sequence. The jobs are constructed to be run on the IUPUI cluster, avidd-i.iu.edu.
Here is what the first job script looks like:
[gustav@ih1 PBS]$ cat first.sh
#PBS -S /bin/bash
#PBS -N first
#PBS -o first_out
#PBS -e first_err
#PBS -q bg
#
# first.sh
#
# Prepare a directory on the AVIDD GPFS.
[ -d /N/gpfs/gustav ] || mkdir /N/gpfs/gustav
cd /N/gpfs/gustav
rm -f test
echo "/N/gpfs/gustav prepared and cleaned."
# Now submit second.sh.
ssh ih1 "cd PBS; /usr/pbs/bin/qsub second.sh"
echo "second.sh submitted."
# Exit cleanly.
exit 0
[gustav@ih1 PBS]$
The new element in this job is the line:
ssh ih1 "cd PBS; /usr/pbs/bin/qsub second.sh"
Remember that the job will not run on the head node. It will run on a computational node. But the PBS on the AVIDD cluster is configured so that you cannot submit jobs from computational nodes. So here we have to execute qsub as a remote command on the IUPUI head node ih1 by using the secure shell, since this is the only remote execution shell supported on the cluster.
The first command passed to ssh is "cd PBS". Having made the connection, the secure shell lands me in my home directory. But I don't want to submit the job from there, because then the job output and error files would be generated in my home directory too. Instead I want all output and error files to be written to my ~/PBS subdirectory, so we go to ~/PBS first.
Then we submit the job. Observe that I use the full path name of the qsub command. The default bash configuration on the AVIDD cluster is such that the remote shell cannot find qsub otherwise. I could, of course, fix this by tweaking my own environment until it does (the PATH should normally be defined in .bashrc, not in .bash_profile), but it is good practice to specify the full path of the command in this context anyway.
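The remote submission described above can also be packaged as a small shell function. What follows is only a sketch, not part of the original scripts: the function names remote_submit_cmd and submit_next are hypothetical, and the use of && in place of the semicolon is a variation introduced here, on the assumption that qsub should not run at all if "cd PBS" fails. The qsub path and the host name ih1 are the ones used in first.sh.

```shell
#!/bin/bash
# Full path to qsub, as used in first.sh.
QSUB=/usr/pbs/bin/qsub

# Hypothetical helper: build the command string that ssh will run remotely.
#   $1 = directory to submit from (relative to the home directory)
#   $2 = job script to submit
remote_submit_cmd() {
    local dir="$1" script="$2"
    # "&&" rather than ";" so that qsub is skipped if the cd fails.
    printf 'cd %s && %s %s' "$dir" "$QSUB" "$script"
}

# Hypothetical wrapper: run the submission on the head node and report
# whether ssh (and therefore the remote command) succeeded.
#   $1 = head node, $2 = directory, $3 = job script
submit_next() {
    local host="$1" dir="$2" script="$3"
    if ssh "$host" "$(remote_submit_cmd "$dir" "$script")"; then
        echo "$script submitted."
    else
        echo "Submission of $script failed." >&2
        return 1
    fi
}

# Usage inside a job script:
#   submit_next ih1 PBS second.sh || exit 1
```

With this in place the "submitted" message is printed only if ssh returns a zero exit status, whereas first.sh prints it unconditionally.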
The script second.sh submitted by first.sh looks as follows:
[gustav@ih1 PBS]$ cat second.sh
#PBS -S /bin/bash
#PBS -N second
#PBS -o second_out
#PBS -e second_err
#PBS -q bg
#PBS -j oe
#
# second.sh
#
# The AVIDD GPFS directory should have been prepared by first.sh.
# Generate the data file.
cd /N/gpfs/gustav
time mkrandfile -f test -l 1000
ls -l test
echo "File /N/gpfs/gustav/test generated."
# Now submit third.sh.
ssh ih1 "cd PBS; /usr/pbs/bin/qsub third.sh"
echo "third.sh submitted."
# Exit cleanly.
exit 0
[gustav@ih1 PBS]$
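Note that second.sh, as listed, submits third.sh whether or not mkrandfile succeeded. If the chain should stop when a step fails, the submission can be guarded by the exit status of that step. The following is a minimal sketch under that assumption; the function name run_step_then_submit is hypothetical, and the ssh line is the same one used in the scripts above.

```shell
#!/bin/bash
# Hypothetical guard: run one step of the chain and submit the next job
# only if that step exits with status 0.
#   $1  = name of the next job script (e.g. third.sh)
#   $2… = the command that performs this job's work
run_step_then_submit() {
    local next="$1"
    shift
    if "$@"; then
        # Same remote submission as in first.sh and second.sh.
        ssh ih1 "cd PBS; /usr/pbs/bin/qsub $next" && echo "$next submitted."
    else
        # $? still holds the exit status of the failed step here.
        local status=$?
        echo "Step failed with status $status; $next not submitted." >&2
        return "$status"
    fi
}

# Usage inside second.sh, in place of the mkrandfile and ssh lines
# (without the "time" keyword, which cannot be passed as an argument):
#   cd /N/gpfs/gustav
#   run_step_then_submit third.sh mkrandfile -f test -l 1000 || exit 1
```

The function returns the failing step's exit status, so the job script can pass it on with exit and the failure will be visible in the job's exit code.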
