Diplopodus - How to run MPI
All jobs on the cluster should be started through the PBS queue system. This
applies only to jobs you want to run on the nodes, not to compilation and
similar tasks, which are done on the server.
mpirun
If you try to start a job on the cluster directly with mpirun, you get an error
message. The proper way to start MPI jobs is with the PBS command qsub. The
easiest way to use it is to create a script in which the resource settings
for PBS are embedded. Example:
#!/bin/bash
# Which queue
#PBS -q tpv
# This many nodes
#PBS -l nodes=4
# The output goes to
###PBS -o pbs.out
# Std. out and Std. err into same file.
###PBS -j oe
#PBS
# cd to working directory
cd "$PBS_O_WORKDIR"
# Add /usr/local/bin to the PATH
PATH=$PATH:/usr/local/bin
# run the command.
pmpirun ./rd2D 100 100 2000 2
Put this into a file named pbs.sh and submit it with qsub pbs.sh.
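As a minimal sketch of the submit step, the function below wraps qsub and then asks the queue for the state of the new job. qstat is a standard PBS command that this page does not otherwise mention, and submit_job is a hypothetical helper name. The exact form of the job identifier and of the status listing depends on the PBS installation.

```shell
#!/bin/bash
# Hypothetical helper (not part of PBS): submit a job script and show its state.
submit_job() {
    local script=$1 jobid
    # qsub only exists where PBS is installed, i.e. on the cluster server.
    if ! command -v qsub >/dev/null 2>&1; then
        echo "qsub not found: run this on the cluster server" >&2
        return 1
    fi
    # qsub prints the identifier of the new job on standard output.
    jobid=$(qsub "$script") || return 1
    echo "Submitted $script as job $jobid"
    # qstat shows the queue entry for that job.
    qstat "$jobid"
}
```

On the server this would be called as submit_job pbs.sh.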
Explanation
The #PBS lines are used to embed PBS resource settings in the
script.
- #PBS -q tpv indicates that this job should be run in the tpv queue.
This is the default queue, and for now the only queue, so this setting is
not required.
- #PBS -l nodes=4 requests four nodes, one CPU on each node. If you
want more than one CPU on each node, you can use #PBS -l nodes=4:ppn=2,
which gives you eight CPUs in total (four nodes with two CPUs each).
- #PBS -o name sets the output file to name.
- #PBS -j oe puts standard output and standard error into the same file.
- If you put more than one hash in front of PBS, the line turns into a
standard comment. This is why the -o and -j lines in the example are inactive.
- $PBS_O_WORKDIR is the directory that was current when qsub was
called. Change into this directory if you don't want to give the full pathname
of your executable.
- The command mpirun should not be used directly, since it does not
recognize which CPUs PBS has allocated for you. Instead, you should use
/usr/local/bin/pmpirun, which automagically computes the right number of nodes
from the #PBS nodes setting.
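To make the nodes and ppn arithmetic concrete, here is a small sketch that reads the active nodes setting from a job script and prints the number of CPUs it requests. It is only an illustration under stated assumptions: count_procs is a hypothetical helper, it is not how pmpirun is implemented (this page does not say), and the fallback to 1 when no nodes line is present is an assumption of the sketch.

```shell
#!/bin/bash
# Hypothetical helper: count the CPUs requested by the active
# "#PBS -l nodes=N[:ppn=M]" line of a job script.
count_procs() {
    local script=$1 spec nodes ppn
    # Only lines that start with exactly one hash are directives;
    # "###PBS" lines do not match this pattern and are skipped.
    spec=$(grep -E '^#PBS[[:space:]]+-l[[:space:]]+nodes=' "$script" | tail -n 1)
    # Assumption of this sketch: no nodes line means a single CPU.
    [ -n "$spec" ] || { echo 1; return; }
    spec=${spec#*nodes=}            # e.g. "4:ppn=2" or "4"
    nodes=${spec%%:*}               # part before the first colon
    nodes=${nodes%%[!0-9]*}         # drop anything after the digits
    ppn=1                           # one CPU per node unless ppn is given
    case $spec in
        *ppn=*) ppn=${spec#*ppn=}; ppn=${ppn%%[!0-9]*} ;;
    esac
    echo $(( nodes * ppn ))
}
```

For the example script, count_procs pbs.sh prints 4. With the line changed to #PBS -l nodes=4:ppn=2 it prints 8.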