libEnsemble runs using a Manager/Worker paradigm: in most cases, one manager and multiple workers. Each worker may run either a generator or a simulator function (both are Python scripts). Generators determine the parameters/inputs for simulations. Simulator functions run simulations, which often involve running a user application from the worker (see Executor).
To use libEnsemble, you will need a calling script, which in turn will specify generator and simulator functions. Many examples are available.
There are currently three communication options for libEnsemble (determining how the Manager and Workers communicate): mpi, local, and tcp. The default is mpi. Note that you do not need the mpi communication mode to use the MPI Executor to launch user MPI applications from workers. The communication modes described here refer only to how the libEnsemble manager and workers communicate with each other.
MPI Comms
This option uses mpi4py for the Manager/Worker communication. It is used automatically if you run your libEnsemble calling script with an MPI runner, e.g.:
mpirun -np N python myscript.py
where N is the number of processors. This will launch one manager and N-1 workers.
This option requires
mpi4py to be installed to interface with the MPI on your system.
It works on a standalone system, and with both
central and distributed modes of running libEnsemble on multi-node systems.
It also potentially scales the best when running with many workers on HPC systems.
Limitations of MPI mode
If you are launching MPI applications from workers, then MPI is being nested. This is not supported with Open MPI, but can be overcome by using a proxy launcher (see Balsam). Nested MPI does work, however, with MPICH and its derivative MPI implementations.
It is also unsuitable to use this mode when running on the launch nodes of three-tier
systems (e.g. Theta/Summit). In that case
local mode is recommended.
Local Comms
This option uses Python’s built-in multiprocessing module for the manager/worker communications.
The comms type local and the number of workers nworkers may be provided in the libE_specs dictionary. Your calling script can then be run directly:
python myscript.py
Alternatively, if your calling script uses the parse_args() function you can specify these on the command line:
python myscript.py --comms local --nworkers N
where N is the number of workers. This will launch one manager and N workers.
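As a sketch of the libE_specs route, the dictionary below uses the standard comms and nworkers fields; the worker count of 4 is an arbitrary example value:

```python
# Sketch: configuring local comms via the libE_specs dictionary.
# The worker count (4) is an arbitrary example value.
libE_specs = {
    "comms": "local",  # use Python multiprocessing for manager/worker comms
    "nworkers": 4,     # spawn four worker processes
}
```

This dictionary would then be passed to the libE call in your calling script.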
libEnsemble will run on one node in this scenario, so on multi-node systems it is only suitable for running in central mode. It can also be used on standalone systems. Technically, you could run without central_mode set, but libEnsemble itself would still run on a single node.
In particular, this mode can be used to run on the launch nodes of three-tier
systems (e.g. Theta/Summit), allowing the whole node allocation for
worker-launched application runs. In this scenario, make sure there are
no imports of
mpi4py in your Python scripts.
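One way to guard against this in a calling script is to check sys.modules before proceeding; this is an illustrative sketch, not a libEnsemble feature:

```python
import sys

# Illustrative check: fail early if mpi4py has already been imported
# into this process, which should be avoided when using local comms
# on the launch nodes of three-tier systems.
if "mpi4py" in sys.modules:
    raise RuntimeError("mpi4py was imported; avoid this with local comms on launch nodes")
```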
TCP Comms
The TCP option can be used to run the Manager on one system and launch workers to remote
systems or nodes over TCP. The necessary configuration options can be provided through
libE_specs, or on the command line if you are using the parse_args() function.
libE_specs options for TCP are:
'comms' [string]: 'tcp'
'nworkers' [int]: Number of worker processes to spawn
'workers' [list]: A list of worker hostnames
'ip' [string]: IP address
'port' [int]: Port number
'authkey' [string]: Authkey
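Put together, a TCP configuration might look like the sketch below; the hostnames, IP address, port, and authkey are placeholder example values:

```python
# Sketch of a TCP configuration in libE_specs. Hostnames, IP, port,
# and authkey are placeholder example values.
libE_specs = {
    "comms": "tcp",
    "nworkers": 2,
    "workers": ["node01", "node02"],  # hostnames to launch workers on
    "ip": "192.0.2.10",               # manager IP address (example value)
    "port": 9999,                     # port for worker connections
    "authkey": "libE_authkey",        # shared authentication key
}
```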
Limitations of TCP mode
There cannot be two calls to libE in the same script.
In a regular (non-persistent) worker, the user’s generator or simulation function is called whenever the worker receives work. A persistent worker is one that continues to run the generator or simulation function between work units, maintaining the local data environment.
A common use-case consists of a persistent generator (such as persistent_aposmm) that maintains optimization data, while generating new simulation inputs. The persistent generator runs on a dedicated worker while in persistent mode. This requires an appropriate allocation function that will run the generator as persistent.
When running with a persistent generator, it is important to remember that a worker will be dedicated to the generator and cannot run simulations. For example, the following run:
mpirun -np 3 python my_script.py
would run one manager process, one worker with a persistent generator, and one worker running simulations.
If this example were run as:
mpirun -np 2 python my_script.py
then no simulations would be able to run, since the only worker would be dedicated to the persistent generator.
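The worker accounting behind this can be sketched with a small helper; sim_workers is a hypothetical function for illustration, not part of libEnsemble:

```python
def sim_workers(nprocs, n_persistent_gens=1):
    """Count workers left to run simulations under mpi comms:
    one process is the manager, and each persistent generator
    occupies a dedicated worker. (Illustrative helper only.)"""
    return nprocs - 1 - n_persistent_gens

print(sim_workers(3))  # mpirun -np 3 -> 1 simulation worker
print(sim_workers(2))  # mpirun -np 2 -> 0 simulation workers
```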
Environment variables required in your run environment can be set in your Python sim or gen function. For example:
os.environ["OMP_NUM_THREADS"] = "4"
set in your simulation script before the Executor submit command will export the setting to your run.
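In context, this looks like the sketch below; note that environment variable values must be strings, and the submit call is shown only as a commented placeholder (see the Executor docs for the actual interface):

```python
import os

# Environment variable values must be strings.
os.environ["OMP_NUM_THREADS"] = "4"

# Later in the sim function, the Executor submit call inherits this
# environment, e.g. (illustrative placeholder, not run here):
# task = exctr.submit(app_name="my_app", num_procs=8)
```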