Running MPI code on the Royal Society machines

All Royal Society machines have an installation of the MPI implementation MPICH2. You can use this to run programs that use MPI on several cores of one machine or on all the machines in the cluster.

Getting started

Setup has been greatly simplified in current versions of MPICH2. Just create a file containing the names of the machines you want to use and the number of cores you want to use on each, like this:
cat > mpd.hosts
hydra:8
vulpecula:4
taurus:4
centaurus:4
coma:4
fornax:4
^D
Here the number after the colon is the number of cores to use on that machine, so one MPI process is started per core listed.

You will only need to modify mpd.hosts (or make a different file) if you want to control which machines your tasks run on.
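
As a rough sketch of making a different file, the following writes a second host file that leaves out the desktop machines and adds up the number of processes it asks for. The filename mpd.nodesktop is made up for illustration, and the choice of machines is only an example.

```shell
# Write a second host file (hypothetical name) without the desktop machines.
cat > mpd.nodesktop <<'EOF'
hydra:8
coma:4
fornax:4
EOF

# Sum the numbers after the colons: the total number of processes requested.
awk -F: '{ total += $2 } END { print total }' mpd.nodesktop   # prints 16
```

Pass this file to mpirun with -f in place of mpd.hosts to use only those machines.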

Running MPI jobs

If you like, you can now run a test command:
mpirun -f mpd.hosts hostname
This runs the command 'hostname' on each of the machines specified in the mpd.hosts file. You should see one copy of the command run per core listed, so each machine name is printed as many times as the number after its colon.
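
To check the counts without reading the output by eye, one option is to pipe it through sort and uniq. This is only a sketch: count_per_host is a helper name invented here, and the sample input stands in for real mpirun output so that it runs on a machine without MPI.

```shell
# Count how many times each host name appears on standard input.
count_per_host() {
    sort | uniq -c
}

# On the cluster this would be:
#   mpirun -f mpd.hosts hostname | count_per_host
# Sample output is piped in here instead of calling mpirun.
printf 'hydra\ntaurus\nhydra\n' | count_per_host
```

Each output line shows a count followed by a machine name, and the count should match the number given for that machine in mpd.hosts.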

Running code

You are now ready to run real MPI code.
mpirun -f mpd.hosts my-mcmc-code

One thing to bear in mind with the Royal Society machines is that they are not a uniform cluster. hydra has 8 CPUs (cores), while taurus, pictor, vulpecula, coma and fornax have 4. The speeds of the machines are also not identical: pictor and hydra are slightly slower than everything else.

Also remember to use the nice command (or the nice system call in your MPI code) so that your jobs don't ruin other people's interactive response. Don't forget that taurus, vulpecula and pictor are all desktop machines. If you need to run long, CPU-intensive jobs, consider using the Starlink cluster instead.
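
A minimal sketch of the nice command in use follows. The mpirun line in the comment assumes that mpirun passes everything after its own options through as the command to run, which is how it is normally used, but it has not been checked on these machines.

```shell
# nice with no command prints the current niceness, so this shows the
# niceness a program started through "nice -n 19" would run at
# (19 is the lowest priority).
nice -n 19 nice

# On the cluster the same wrapper would go between mpirun and the program:
#   mpirun -f mpd.hosts nice -n 19 my-mcmc-code
```

Started from an ordinary shell, the first command should report a niceness well above the default, meaning interactive users get the CPU first.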