The Sun Grid Engine (SGE) is a queueing and scheduling system that accepts jobs and runs them on the cluster on the user's behalf. There are three types of jobs: interactive, batch (serial), and parallel.
It is assumed you are logged into the cluster and know how to create and edit files. It should also be noted that one should never assume a program is on the path; always call programs by their full path name (e.g. /usr/local/bin/xxxx). This also helps when a script doesn't work and needs to be debugged.
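Following the advice above, you can resolve a program's full path once on the head node before hard-coding it into a job script. A minimal sketch, using sh only as a stand-in for whatever program your job runs:

```shell
# Resolve a program's full path so the job script does not depend on $PATH.
# "sh" is only a stand-in here; substitute the program your job actually runs.
full_path=$(command -v sh)
echo "$full_path"
```

The printed path is what you would then write into the job script.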
Running Batch (Serial) Jobs with SGE:
A batch (or serial) job is one that runs on a single node. This is in contrast to a parallel job, where a single job runs across many nodes in an interconnected fashion, generally using MPI to communicate between individual processes. If you are running the same program on the cluster as you would on your desktop, chances are you will want to use a serial job.
One thing to keep in mind when creating jobs is your directory structure. It's a good idea to organize the files needed for a job into a single folder. If read-only files are needed by multiple jobs, use symlinks so there are no duplicate files taking up extra space. An example of a good directory structure:

Project1/
Project1/jobA
Project1/jobB
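The layout above can be set up with a few commands; here a shared read-only data file is symlinked into each job directory rather than copied (the file names are illustrative):

```shell
# Build the example project layout with one shared input file.
mkdir -p Project1/jobA Project1/jobB
echo "shared input" > Project1/common-data.txt
# Symlink the read-only file into each job directory instead of duplicating it.
ln -sf ../common-data.txt Project1/jobA/data.txt
ln -sf ../common-data.txt Project1/jobB/data.txt
```

Each job then reads its own data.txt, but only one copy of the file exists on disk.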
In this example, we will run a matlab script:
- Create a directory to hold your job file and any associated data (matlab scripts, etc).
- Open a new file, in this case we will call it matlab-test.job
#!/bin/bash
# The name of the job; can be anything, and is used when displaying the list of running jobs
#$ -N matlab-test
# The name of the output log file
#$ -o matlabTest.log
# Combine output and error messages into one file
#$ -j y
# Tell the queue system to use the current directory as the working directory;
# otherwise the script may fail, as it will execute in your top-level home directory /home/username
#$ -cwd
# Now come the commands to be executed
/share/apps/matlab/bin/matlab -nodisplay -nodesktop -nojvm -r matlab-test
# Note: the argument after -r is not the name of the m-file but the name of the routine
exit 0
- Save this job script and submit it to the queue with "qsub matlab-test.job"
- Now you can check the status of your job with "qstat", which will return a list of your running/queued jobs
When the job has completed, you can check its output in the log file named above, matlabTest.log. NOTE: You may see the following in the output:
“Warning: no access to tty (Bad File descriptor). Thus no job control in this shell.”
This is normal and can be ignored. In the case of matlab, you may also see a message about shopt; again, for matlab this is normal and can be ignored. The sample job and matlab script are attached.
Running Interactive Jobs with SGE:
An interactive job is one where you run a program interactively on a node. This is good for building and testing scripts, etc. It is not the place to run long-running, very computationally intensive jobs, or other jobs better suited to a batch job. An example would be developing a matlab script: you can launch an interactive job, develop the script, and write the job file, but when it comes to running the job itself, it needs to be submitted as a batch job. To run an interactive job, simply type "qlogin", which will allocate a slot and open a shell on a node.
Running Parallel Jobs with SGE:
A parallel job is one where a single job runs across many nodes in an interconnected fashion, generally using MPI to communicate between individual processes. If you are running the same program on the cluster as you would on your desktop, chances are you will want to use a serial job, not a parallel job. Parallel jobs are generally only for specially designed programs which will only work on machines with cluster management software installed.
Also, not just any program can run in parallel: it must be written for parallel execution and compiled against a particular MPI library. In this example we build a simple program that passes a message between processes and compile it against OpenMPI, the main MPI library on the cluster.
Also note that the scheduler will only accept parallel jobs requesting between 4 and 8 slots. It is currently set up to start parallel processes on a single node to limit the overhead of inter-process communication over the network, which adds considerable run time to a job. For most jobs, more slots is not always better.
- Like the batch job, create a directory to hold this job and related files
- Open a new file and create the job script:
#!/bin/bash
#$ -N openmpi-test
# Here we tell the queue that we want the orte parallel environment and request 4 to 8 slots
# This option takes the following form: -pe nameOfEnv min-Max
# where you request a minimum and maximum number of slots
#$ -pe orte 4-8
# For parallel jobs, it's a good idea to use even numbers.
#$ -cwd
#$ -j y
mpirun -n $NSLOTS mpi-ring
exit 0
- And like above, you can submit the job with qsub and check on it with qstat
NOTES: There are a few queue commands to know:
- List all running jobs: "qstat -u \*"
- List all running jobs per node: "qstat -u \* -f"
- Delete a job: "qdel jobID"
- List queue messages for a job: "qstat -j jobID"
- Should a job be marked for deletion but stay in the queue for a while, contact CBI.
- There is a known bug in the scheduler that sometimes causes it to not respond, resulting in the following message:
commlib error: got select error (Connection refused) unable to send message to qmaster using port 536 on host "cheetah.cbi.utsa.edu": got send error
This can be safely ignored. Simply wait a minute and retry your command.
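One way to handle this transient qmaster error is a small retry wrapper. This is an illustrative helper, not part of SGE; the function name and pause length are assumptions:

```shell
# Retry a command up to a given number of attempts, pausing between tries.
# Usage: retry <attempts> <command> [args...]
retry() {
    attempts=$1; shift
    i=1
    while [ "$i" -le "$attempts" ]; do
        "$@" && return 0
        sleep 1   # use a longer pause (e.g. 60s) for the qmaster case
        i=$((i + 1))
    done
    return 1
}
```

For example, `retry 3 qstat -u "$USER"` retries the status query a few times before giving up.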
SGE Environment Options And Environment Variables:
When a Sun Grid Engine job is run, a number of variables are preset into the job’s script environment, as listed below.
- ARC - The Sun Grid Engine architecture name of the node on which the job is running; the name is compiled-in into the sge_execd binary
- SGE_ROOT - The Sun Grid Engine root directory as set for sge_execd before start-up or the default /usr/SGE
- SGE_CELL - The Sun Grid Engine cell in which the job executes
- SGE_JOB_SPOOL_DIR - The directory used by sge_shepherd(8) to store job-related data during job execution
- SGE_O_HOME - The home directory path of the job owner on the host from which the job was submitted
- SGE_O_HOST - The host from which the job was submitted
- SGE_O_LOGNAME - The login name of the job owner on the host from which the job was submitted
- SGE_O_MAIL - The content of the MAIL environment variable in the context of the job submission command
- SGE_O_PATH - The content of the PATH environment variable in the context of the job submission command
- SGE_O_SHELL - The content of the SHELL environment variable in the context of the job submission command
- SGE_O_TZ - The content of the TZ environment variable in the context of the job submission command
- SGE_O_WORKDIR - The working directory of the job submission command
- SGE_CKPT_ENV - Specifies the checkpointing environment (as selected with the qsub -ckpt option) under which a checkpointing job executes
- SGE_CKPT_DIR - Only set for checkpointing jobs; contains path ckpt_dir (see the checkpoint manual page) of the checkpoint interface
- SGE_STDERR_PATH - The path name of the file to which the standard error stream of the job is diverted; commonly used for enhancing the output with error messages from prolog, epilog, parallel environment start/stop or checkpointing scripts
- SGE_STDOUT_PATH - The path name of the file to which the standard output stream of the job is diverted; commonly used for enhancing the output with messages from prolog, epilog, parallel environment start/stop or checkpointing scripts
- SGE_TASK_ID - The task identifier in the array job represented by this task
- ENVIRONMENT - Always set to BATCH; this variable indicates that the script is run in batch mode
- HOME - The user’s home directory path from the passwd file
- HOSTNAME - The host name of the node on which the job is running
- JOB_ID - A unique identifier assigned by the sge_qmaster when the job was submitted; the job ID is a decimal integer in the range 1 to 99999
- JOB_NAME - The job name, built from the qsub script filename, a period, and the digits of the job ID; this default may be overwritten by qsub -N
- LOGNAME - The user’s login name from the passwd file
- NHOSTS - The number of hosts in use by a parallel job
- NQUEUES - The number of queues allocated for the job (always 1 for serial jobs)
- NSLOTS - The number of queue slots in use by a parallel job
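As a sketch, a job script can use these variables to log its own context. The function name is illustrative; outside SGE the variables are unset, so defaults are substituted here:

```shell
# Summarize the SGE context of a running job. Falls back to defaults
# when the SGE variables are not set (e.g. when run outside the queue).
job_summary() {
    echo "job=${JOB_NAME:-none} id=${JOB_ID:-0} task=${SGE_TASK_ID:-1}"
    echo "host=${HOSTNAME:-$(hostname)} slots=${NSLOTS:-1} hosts=${NHOSTS:-1}"
    echo "submitted_from=${SGE_O_WORKDIR:-$PWD}"
}
job_summary
```

Writing such a summary at the top of a job's log file makes it much easier to see later where and with how many slots the job actually ran.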
Using other MPI Environments:
Besides the default MPI environment for OpenMPI, mpich2 is installed on the system at /opt/mpich2/gnu. To use mpich2 instead of OpenMPI, you'll have to alter your shell environment. To do so, use your text editor to edit /home/username/.bash_profile and add the following:
export PATH=/opt/mpich2/gnu/bin:$PATH
export LD_LIBRARY_PATH=/opt/mpich2/gnu/lib:$LD_LIBRARY_PATH
export LD_RUN_PATH=/opt/mpich2/gnu/lib:$LD_RUN_PATH
This adds mpich2 to the path and to the library search path. When compiling programs, be sure to tell the configure script where mpicc/mpif90/etc. are located by using their full paths.

Launching an mpich2 job:
The job script is similar, but includes a few extra directives needed for mpich2:
#!/bin/bash
#$ -N jobName
#$ -cwd
#$ -S /bin/bash
#$ -pe mpich2 min-Max
export MPICH2_ROOT=/opt/mpich2/gnu
export PATH=$MPICH2_ROOT/bin:$PATH
export MPD_CON_EXT="sge_$JOB_ID.$SGE_TASK_ID"
/opt/mpich2/gnu/bin/mpiexec -machinefile $TMPDIR/machines -n $NSLOTS /path/to/program
exit 0