5. Next steps

libEnsemble with MPI

MPI is a standard interface for parallel computing, implemented in libraries such as MPICH and used at extreme scales. MPI potentially allows libEnsemble’s processes to be distributed over multiple nodes and works in some circumstances where Python’s multiprocessing does not. In this section, we’ll explore modifying the above code to use MPI instead of multiprocessing.

We recommend the MPICH distribution of MPI for this tutorial; it is available for a variety of systems. You also need mpi4py, which can be installed with pip install mpi4py. If you’d like to use a different version or distribution of MPI instead of MPICH, configure mpi4py against that MPI at installation with MPICC=<path/to/MPI_C_compiler> pip install mpi4py. If this doesn’t work, try appending --user to the end of the command. See the mpi4py docs for more information.

Verify that MPI has been installed correctly with mpirun --version.
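As a complement to the command-line check, the short sketch below reports whether an mpirun launcher is on your PATH and whether mpi4py is importable in the current Python environment. It is only an illustration: the helper name check_mpi_setup is hypothetical and not part of libEnsemble or mpi4py, and it confirms only that both pieces can be found, not that they were built against the same MPI.

```python
import importlib.util
import shutil


def check_mpi_setup():
    """Report whether an MPI launcher and mpi4py appear to be available.

    Hypothetical helper for this tutorial; not a libEnsemble or mpi4py API.
    """
    return {
        # shutil.which returns None when the executable is not on PATH
        "mpirun": shutil.which("mpirun") is not None,
        # find_spec returns None when the package cannot be located
        "mpi4py": importlib.util.find_spec("mpi4py") is not None,
    }


if __name__ == "__main__":
    for name, found in check_mpi_setup().items():
        print(f"{name}: {'found' if found else 'NOT found'}")
```

If either entry is reported as not found, revisit the installation steps above before continuing.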

Modifying the script

Only a few changes are necessary to make our code MPI-compatible. For starters, comment out the libE_specs definition:

    # libE_specs = LibeSpecs(nworkers=4, comms="local")

We’ll be parameterizing our MPI runtime with a parse_args=True argument to the Ensemble class instead of libE_specs. We’ll also use an ensemble.is_manager attribute so only the first MPI rank runs the data-processing code.

The bottom of your calling script should now resemble:

    # replace libE_specs with parse_args=True. Detects MPI runtime
    ensemble = Ensemble(sim_specs, gen_specs, exit_criteria, alloc_specs=alloc_specs, parse_args=True)

    ensemble.run()  # start the ensemble. Blocks until completion.

    if ensemble.is_manager:  # only True on rank 0
        history = ensemble.H  # start visualizing our results
        print([i for i in history.dtype.fields])
        print(history)

        import matplotlib.pyplot as plt

        colors = ["b", "g", "r", "y", "m", "c", "k", "w"]

        for i in range(1, ensemble.nworkers + 1):
            worker_xy = np.extract(history["sim_worker"] == i, history)
            x = [entry.tolist()[0] for entry in worker_xy["x"]]
            y = [entry for entry in worker_xy["y"]]
            plt.scatter(x, y, label="Worker {}".format(i), c=colors[i - 1])

        plt.title("Sine calculations for a uniformly sampled random distribution")
        plt.xlabel("x")
        plt.ylabel("sine(x)")
        plt.legend(loc="lower right")
        plt.savefig("tutorial_sines.png")

With these changes in place, our libEnsemble code can be run with MPI using:

    mpirun -n 5 python calling.py

where -n 5 tells mpirun to launch five processes: one runs the libEnsemble manager and the other four run libEnsemble workers.
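To make the process layout concrete, here is a minimal sketch of how the five processes divide into roles. It assumes that the manager is rank 0 (as the is_manager comment in the script indicates) and that each worker's ID matches its MPI rank, which is consistent with the plotting loop over range(1, ensemble.nworkers + 1). The function rank_roles is a hypothetical helper for illustration, not a libEnsemble API.

```python
def rank_roles(nprocs):
    """Map each MPI rank to its assumed libEnsemble role.

    Assumption: rank 0 is the manager and ranks 1..nprocs-1 are workers
    whose worker IDs equal their ranks.
    """
    if nprocs < 2:
        raise ValueError("need at least one manager and one worker process")
    return {
        rank: "manager" if rank == 0 else f"worker {rank}"
        for rank in range(nprocs)
    }


if __name__ == "__main__":
    # Layout for `mpirun -n 5`: one manager plus four workers
    for rank, role in rank_roles(5).items():
        print(f"rank {rank}: {role}")
```

Under these assumptions, the equivalent of the earlier nworkers=4 setting is -n 5, because one extra process is needed for the manager.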

This tutorial is only a small demonstration of libEnsemble’s parallelism capabilities. libEnsemble has been developed primarily to support research on high-performance computers, with potentially hundreds of workers performing calculations simultaneously. Please read our platform guides for introductions to using libEnsemble on many such machines.

libEnsemble’s Executors can launch non-Python user applications and simulations across allocated compute resources. Try out this feature with a more complicated libEnsemble use case in our Electrostatic Forces tutorial.