Diego Melgar edited this page Jun 23, 2015 · 3 revisions

Right now MudPy supports parallel computing only for the Green's functions and synthetics calculations. This is done through mpi4py and significantly reduces compute times.

The solve step (the actual inversion) is not parallelized yet. I'm currently working on integrating PETSc support to make that step faster, but that's a medium-term project. Stay tuned.

Install mpi4py

This step was a bit cumbersome for me. First try the easy way:

pip install mpi4py

If this works you are good to go. For me it didn't; should that be the case for you too, then you will need to build from source. First you need to install OpenMPI. Once this is done you can build mpi4py by following the mpi4py installation instructions.
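Whichever way the install went, a quick sanity check from Python confirms that mpi4py imports and can report the MPI standard version. This is just a verification snippet, not part of MudPy (`check_mpi4py` is a made-up helper name):

```python
def check_mpi4py():
    """Return a status string describing the mpi4py installation."""
    try:
        from mpi4py import MPI
        major, minor = MPI.Get_version()
        return f"mpi4py OK, MPI standard version {major}.{minor}"
    except ImportError:
        return "mpi4py is not installed; try building it from source"

print(check_mpi4py())
```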

Discover how many CPUs you have

On macOS, type the following at the terminal to determine how many CPUs you have available:

sysctl -n hw.ncpu

On Linux, type:

nproc --all

This number is the maximum number of processors you can request computation on. For my laptop it is 8.
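If you'd rather not remember a per-OS command, Python itself can report the count; this is just a convenience, equivalent to the commands above:

```python
import os

# Cross-platform way to get the logical CPU count, equivalent to
# `sysctl -n hw.ncpu` on macOS or `nproc --all` on Linux.
ncpus = os.cpu_count()
print(f"ncpus={ncpus}")
```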

Modifying the .inv parameter files

If you already have a .inv parameter file you have been using for computation, insert the line ncpus=8 (with however many CPUs your system has) near the top, so that your file looks something like:

G_from_file=0 # =0 read GFs and create a new G, =1 load G from file
invert=0  # =1 runs inversion, =0 does nothing
###############################################################################

###############           Green function parameters               #############
ncpus=8   #Number of available cores (set to ncpus=1 for serial)
hot_start=0  #Start at a certain subfault number

In newer versions of MudPy this argument is required and an error will be thrown if it is missing, so set it to at least ncpus=1 for serial computation.
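For illustration only, here is how a line like `ncpus=8` in a .inv-style file would be interpreted, with the trailing `#` comment stripped. This is not MudPy's actual parser, and `read_ncpus` is a hypothetical helper name:

```python
def read_ncpus(lines):
    """Find an ncpus=N line in .inv-style parameter text (hypothetical sketch)."""
    for line in lines:
        # Drop any trailing '#' comment, then whitespace.
        line = line.split('#')[0].strip()
        if line.startswith('ncpus'):
            return int(line.split('=')[1])
    raise ValueError('ncpus not found; it is required in newer MudPy versions')

print(read_ncpus(['G_from_file=0', 'ncpus=8   #Number of available cores']))
```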

You will also need to modify the function call at the end to include the ncpus argument, like so:

# Run green functions          
if make_green==1 or make_synthetics==1:
    runslip.inversionGFs(home,project_name,GF_list,tgf_file,fault_name,model_name,
        dt,tsun_dt,NFFT,tsunNFFT,make_green,make_synthetics,dk,pmin,
        pmax,kmax,beta,time_epi,hot_start,ncpus) 

Things to note

This makes things much, much faster, especially for large GF/synthetics runs like InSAR and seafloor deformation points.

Parallelization happens over the number of source points. The large source is broken up into smaller ones and each smaller source is distributed to one CPU. You can see how the source is broken up by looking in /data/model_info/; you should see files named mpi_source.0.fault, mpi_source.1.fault, etc.
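The splitting itself amounts to dividing the subfault list into near-equal contiguous chunks, one per CPU. A minimal sketch of that idea (assumed behavior, not MudPy's actual code; `split_source` is a hypothetical name):

```python
def split_source(subfaults, ncpus):
    # Divide the subfault list into ncpus near-equal contiguous chunks,
    # analogous to the mpi_source.0.fault, mpi_source.1.fault, ... files.
    n, rem = divmod(len(subfaults), ncpus)
    chunks, start = [], 0
    for i in range(ncpus):
        size = n + (1 if i < rem else 0)
        chunks.append(subfaults[start:start + size])
        start += size
    return chunks

for i, chunk in enumerate(split_source(list(range(10)), 4)):
    print(f"mpi_source.{i}.fault: subfaults {chunk}")
```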

mpi4py is very robust, so if you request GFs/synthetics for many different kinds of data and an error occurs, the code will output some error messages and continue on to the next task. This is good, but always check the console output to see whether everything ran successfully.
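The "report the error and move on" behavior described above is the usual log-and-continue pattern. A generic sketch (hypothetical names, not MudPy code):

```python
def run_tasks(tasks):
    # Run each (name, function) task; on failure, report and continue
    # so one bad data set does not abort the whole GF/synthetics run.
    results = {}
    for name, func in tasks:
        try:
            results[name] = func()
        except Exception as exc:
            print(f"Task '{name}' failed ({exc}); moving on to the next task")
            results[name] = None
    return results

results = run_tasks([("gps", lambda: "ok"), ("insar", lambda: 1 / 0)])
print(results)
```

This is why the console output matters: a failed task still produces a result entry (here, None) rather than stopping the run.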

The output has changed from what you might be used to. It's untidy to have a bunch of CPUs all printing to the screen, so everything has been streamlined.