5. Next steps
libEnsemble with MPI
MPI is a standard interface for parallel computing, implemented in libraries such as MPICH and used at extreme scales. MPI allows libEnsemble’s processes to be distributed over multiple nodes and works in some circumstances where Python’s multiprocessing does not. In this section, we’ll modify the code from the previous sections to use MPI instead of multiprocessing.
We recommend the MPI distribution MPICH for this tutorial, which can be found
for a variety of systems here. You also need mpi4py, which can be installed
with pip install mpi4py. If you’d like to use a specific version or
distribution of MPI instead of MPICH, configure mpi4py with that MPI at
installation with MPICC=<path/to/MPI_C_compiler> pip install mpi4py. If this
doesn’t work, try appending --user to the end of the command. See the
mpi4py docs for more information.
Verify that MPI has been installed correctly with mpirun --version.
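The same check can also be scripted. The following is a minimal sketch using only the Python standard library; the function name check_mpi_prerequisites and the keys of the returned dictionary are hypothetical names chosen for this illustration. It only reports whether an mpirun launcher is on your PATH and whether mpi4py is importable. It does not verify that the two were built against the same MPI.

```python
import importlib.util
import shutil


def check_mpi_prerequisites(launcher="mpirun", module="mpi4py"):
    """Report whether the MPI launcher and the Python MPI bindings can be found.

    Hypothetical helper for illustration; not part of libEnsemble or mpi4py.
    """
    return {
        # Full path to the launcher executable, or None if it is not on PATH
        "launcher_path": shutil.which(launcher),
        # True if the module could be imported, without actually importing it
        "module_found": importlib.util.find_spec(module) is not None,
    }


status = check_mpi_prerequisites()
print(status)
```

If launcher_path is None or module_found is False, revisit the installation steps above before continuing.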
Modifying the script
Only a few changes are necessary to make our code MPI-compatible. For starters,
comment out the libE_specs definition:
# libE_specs = LibeSpecs(nworkers=4, comms="local")
We’ll be parameterizing our MPI runtime with a parse_args=True argument to
the Ensemble class instead of libE_specs. We’ll also use an ensemble.is_manager
attribute so only the first MPI rank runs the data-processing code.
The bottom of your calling script should now resemble:
# Replace libE_specs with parse_args=True, which detects the MPI runtime
ensemble = Ensemble(sim_specs, gen_specs, exit_criteria, alloc_specs=alloc_specs, parse_args=True)

ensemble.run()  # start the ensemble. Blocks until completion.

if ensemble.is_manager:  # only True on rank 0
    history = ensemble.H  # start visualizing our results
    print([i for i in history.dtype.fields])
    print(history)

    import matplotlib.pyplot as plt

    colors = ["b", "g", "r", "y", "m", "c", "k", "w"]

    # Plot each worker's points in its own color
    for i in range(1, ensemble.nworkers + 1):
        worker_xy = np.extract(history["sim_worker"] == i, history)
        x = [entry.tolist()[0] for entry in worker_xy["x"]]
        y = [entry for entry in worker_xy["y"]]
        plt.scatter(x, y, label="Worker {}".format(i), c=colors[i - 1])

    plt.title("Sine calculations for a uniformly sampled random distribution")
    plt.xlabel("x")
    plt.ylabel("sine(x)")
    plt.legend(loc="lower right")
    plt.savefig("tutorial_sines.png")
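To see how the per-worker grouping in the plotting loop behaves without running an ensemble, here is a minimal sketch. The array below is a hypothetical stand-in for ensemble.H: it is built by hand with only the three fields the loop reads (x, y, and sim_worker), and its values are made up for illustration. The helper name points_for_worker is likewise an assumption of this sketch.

```python
import numpy as np

# Hypothetical stand-in for ensemble.H, with only the fields the plotting
# loop reads. A real history array contains additional fields.
history = np.zeros(6, dtype=[("x", float, (1,)), ("y", float), ("sim_worker", int)])
history["x"][:, 0] = np.linspace(-3.0, 3.0, 6)
history["y"] = np.sin(history["x"][:, 0])
history["sim_worker"] = [1, 2, 1, 2, 1, 2]  # pretend two workers alternated


def points_for_worker(h, worker_id):
    """Return the x and y values of the rows evaluated by one worker."""
    # Keep only the rows whose sim_worker field matches worker_id
    rows = np.extract(h["sim_worker"] == worker_id, h)
    # Each "x" entry is a length-1 array, so unwrap its single value
    xs = [entry.tolist()[0] for entry in rows["x"]]
    ys = [float(value) for value in rows["y"]]
    return xs, ys


xs, ys = points_for_worker(history, 1)
print(len(xs), "points for worker 1")
```

Each call returns the points for a single worker, which is what lets the tutorial's loop draw every worker in its own color.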
With these changes in place, our libEnsemble code can be run with MPI as follows:
mpirun -n 5 python calling.py
where -n 5 tells mpirun to launch five processes: one runs the libEnsemble
manager and the other four run libEnsemble workers.
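The process count can be sketched in a few lines of plain Python. This only illustrates the arithmetic described above (rank 0 is the manager and every other rank is a worker) and is not libEnsemble's own code; the function name assign_roles is hypothetical.

```python
def assign_roles(nprocs):
    """Map each MPI rank to its role: rank 0 manages, the rest are workers.

    Hypothetical helper for illustration; not part of libEnsemble.
    """
    if nprocs < 2:
        raise ValueError("need at least one manager and one worker")
    return {
        rank: "manager" if rank == 0 else "worker {}".format(rank)
        for rank in range(nprocs)
    }


roles = assign_roles(5)  # corresponds to mpirun -n 5
nworkers = len(roles) - 1  # every process except the manager
print(roles)
print("workers:", nworkers)
```

To run with a different number of workers, pass mpirun a process count one higher than the number of workers you want.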
This tutorial is only a small demonstration of libEnsemble’s parallel capabilities. libEnsemble has been developed primarily to support research on high-performance computers, with potentially hundreds of workers performing calculations simultaneously. Please read our platform guides for introductions to using libEnsemble on many such machines.
libEnsemble’s Executors can launch non-Python user applications and simulations across allocated compute resources. Try out this feature with a more complicated libEnsemble use case in our Electrostatic Forces tutorial.