Running an Ensemble
libEnsemble features two approaches to running an ensemble. We recommend the newer
Ensemble class approach, but will continue to support the classic libE() approach
for backward compatibility.
- class libensemble.ensemble.Ensemble
The primary class for a libEnsemble workflow.
Example
import numpy as np

from libensemble import Ensemble, SimSpecs, GenSpecs, ExitCriteria
from my_simulator import beamline
from someones_optimizer import optimize

experiment = Ensemble()
experiment.sim_specs = SimSpecs(sim_f=beamline, inputs=["x"], out=[("f", float)])
experiment.gen_specs = GenSpecs(
    gen_f=optimize,
    inputs=["f"],
    out=[("x", float, (1,))],
    user={
        "lb": np.array([-3]),
        "ub": np.array([3]),
    },
)

experiment.exit_criteria = ExitCriteria(gen_max=101)
results = experiment.run()
Parses --comms, --nworkers, and other options from the command-line, validates inputs, configures logging, and performs other preparations.
Call .run() on the class to start the workflow.
Configure by:
Option 1: Providing parameters on instantiation
from libensemble import Ensemble
from my_simulator import sim_find_energy

sim_specs = {
    "sim_f": sim_find_energy,
    "in": ["x"],
    "out": [("y", float)],
}

experiment = Ensemble(sim_specs=sim_specs)
Option 2: Assigning parameters to an instance
from libensemble import Ensemble, SimSpecs
from my_simulator import sim_find_energy

sim_specs = SimSpecs(
    sim_f=sim_find_energy,
    inputs=["x"],
    out=[("y", float)],
)

experiment = Ensemble()
experiment.sim_specs = sim_specs
Option 3: Loading parameters from files
from libensemble import Ensemble

experiment = Ensemble()

# Load every specification from a single parameter file
experiment.from_yaml("my_parameters.yaml")
# or...
experiment.from_toml("my_parameters.toml")
# or...
experiment.from_json("my_parameters.json")
my_parameters.yaml:
libE_specs:
    save_every_k_gens: 20

exit_criteria:
    sim_max: 80

gen_specs:
    gen_f: generator.gen_random_sample
    outputs:
        x:
            type: float
            size: 1
    user:
        gen_batch_size: 5

sim_specs:
    sim_f: simulator.sim_find_sine
    inputs:
        - x
    outputs:
        y:
            type: float

my_parameters.toml:
[libE_specs]
    save_every_k_gens = 300

[exit_criteria]
    sim_max = 80

[gen_specs]
    gen_f = "generator.gen_random_sample"
    [gen_specs.out]
        [gen_specs.out.x]
            type = "float"
            size = 1
    [gen_specs.user]
        gen_batch_size = 5

[sim_specs]
    sim_f = "simulator.sim_find_sine"
    inputs = ["x"]
    [sim_specs.out]
        [sim_specs.out.y]
            type = "float"

my_parameters.json:
{
    "libE_specs": {
        "save_every_k_gens": 300
    },
    "exit_criteria": {
        "sim_max": 80
    },
    "gen_specs": {
        "gen_f": "generator.gen_random_sample",
        "out": {
            "x": {
                "type": "float",
                "size": 1
            }
        },
        "user": {
            "gen_batch_size": 5
        }
    },
    "sim_specs": {
        "sim_f": "simulator.sim_find_sine",
        "inputs": ["x"],
        "out": {
            "f": {"type": "float"}
        }
    }
}
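The parameter files above describe the same nested mapping that the Python options build directly. The sketch below, using only the standard library, shows one way such a file could be turned into Python objects: a dotted string like generator.gen_random_sample is resolved with importlib, and an out table becomes a list of NumPy-style field tuples. The helper names resolve_callable and fields_from_table are hypothetical, and libEnsemble's own loader may work differently; math.sqrt stands in for a user function so that the example runs anywhere.

```python
import importlib
import json

# Hypothetical helpers for illustration; libEnsemble's internal loader may differ.
TYPE_NAMES = {"float": float, "int": int, "bool": bool}


def resolve_callable(dotted_path):
    """Import 'package.module.func' and return the function object."""
    module_name, _, func_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, func_name)


def fields_from_table(table):
    """Convert {'x': {'type': 'float', 'size': 1}} to [('x', float, (1,))]."""
    fields = []
    for name, spec in table.items():
        dtype = TYPE_NAMES[spec["type"]]
        if "size" in spec:
            fields.append((name, dtype, (spec["size"],)))
        else:
            fields.append((name, dtype))
    return fields


# math.sqrt stands in for a user function such as generator.gen_random_sample.
text = """
{
    "exit_criteria": {"sim_max": 80},
    "gen_specs": {
        "gen_f": "math.sqrt",
        "out": {"x": {"type": "float", "size": 1}},
        "user": {"gen_batch_size": 5}
    }
}
"""
params = json.loads(text)
gen_f = resolve_callable(params["gen_specs"]["gen_f"])
gen_out = fields_from_table(params["gen_specs"]["out"])
print(gen_f(9.0), gen_out)
```

Writing functions as dotted strings keeps the parameter file free of Python code, at the cost of requiring the named module to be importable from wherever the calling script runs.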
After calling .run(), the final states of H, persis_info, and a flag are made available.
- Parameters:
sim_specs (dict or SimSpecs) – Specifications for the simulation function
gen_specs (dict or GenSpecs, optional) – Specifications for the generator function
exit_criteria (dict or ExitCriteria, optional) – Tell libEnsemble when to stop a run
persis_info (dict, optional) – Persistent information to be passed between user functions
alloc_specs (dict or AllocSpecs, optional) – Specifications for the allocation function
libE_specs (dict or LibeSpecs, optional) – Specifications for libEnsemble
H0 (NumPy structured array, optional) – A libEnsemble history to be prepended to this run’s history
- ready()
Quickly verify that all necessary data has been provided
- Return type:
bool
- run()
Initializes libEnsemble.
MPI/comms Notes
Manager-worker intercommunications are parsed from the comms key of libE_specs. An MPI runtime is assumed by default if --comms local wasn’t specified on the command-line or in libE_specs.
If an MPI communicator was provided in libE_specs, then each .run() call will initiate intercommunications on a duplicate of that communicator. Otherwise, a duplicate of COMM_WORLD will be used.
- Returns:
H (NumPy structured array) – History array storing rows for each point.
persis_info (dict) – Final state of persistent information
exit_flag (int) – Flag containing final task status
0 = No errors
1 = Exception occurred
2 = Manager timed out and ended simulation
3 = Current process is not in libEnsemble MPI communicator
- Return type:
(numpy.ndarray, dict, int)
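The three return values are typically unpacked and the flag checked before the history is used. The sketch below only encodes the flag meanings listed above; describe_exit_flag is a hypothetical helper, not part of libEnsemble.

```python
# Meanings copied from the exit_flag list in this section.
EXIT_FLAG_MEANINGS = {
    0: "No errors",
    1: "Exception occurred",
    2: "Manager timed out and ended simulation",
    3: "Current process is not in libEnsemble MPI communicator",
}


def describe_exit_flag(exit_flag):
    """Return a readable message for a flag returned by run() (hypothetical helper)."""
    return EXIT_FLAG_MEANINGS.get(exit_flag, f"Unknown exit flag: {exit_flag}")


# Typical use after a run (commented out because it needs a configured Ensemble):
# H, persis_info, exit_flag = experiment.run()
exit_flag = 2  # stand-in value for illustration
if exit_flag != 0:
    print("Run did not finish cleanly:", describe_exit_flag(exit_flag))
```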
- from_yaml(file_path)
Parameterizes libEnsemble from a YAML file
- Parameters:
file_path (str) – Path to the YAML file
- from_toml(file_path)
Parameterizes libEnsemble from a TOML file
- Parameters:
file_path (str) – Path to the TOML file
- from_json(file_path)
Parameterizes libEnsemble from a JSON file
- Parameters:
file_path (str) – Path to the JSON file
- add_random_streams(num_streams=0, seed='')
Adds np.random generators for each worker to persis_info
- Parameters:
num_streams (int) – Number of random streams to add
seed (str) – Seed for the random streams
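As a rough picture of what this method provides, the sketch below builds one independent NumPy generator per stream and stores it in a persis_info-style dictionary keyed by stream number. The function name make_random_streams and the "rand_stream" key are assumptions made for illustration; consult the libEnsemble source for the exact layout it uses.

```python
import numpy as np


def make_random_streams(persis_info, num_streams, seed=None):
    """Add one independent generator per stream index (illustrative only)."""
    # SeedSequence.spawn yields statistically independent child seeds.
    children = np.random.SeedSequence(seed).spawn(num_streams)
    for i, child in enumerate(children):
        entry = persis_info.setdefault(i, {})
        entry["rand_stream"] = np.random.default_rng(child)
    return persis_info


persis_info = make_random_streams({}, num_streams=3, seed=1234)
# A user function could then draw from the stream assigned to it.
sample = persis_info[1]["rand_stream"].uniform(-3, 3, size=5)
print(sample.shape)
```

Giving each worker its own stream keeps results reproducible for a fixed seed regardless of the order in which workers happen to run.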
- save_output(file)
Writes the History array and persis_info to files. If a workflow_dir is in use, the files are placed in that directory under the specified filename.
Format:
<calling_script>_results_History_length=<length>_evals=<Completed evals>_ranks=<nworkers>
- Parameters:
file (str) – Filename to use for the saved output
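To make the naming pattern concrete, the sketch below assembles a file stem from its four parts. The helper results_stem and the sample values are hypothetical, stripping the .py suffix is an assumption, and the file extensions libEnsemble appends are not shown.

```python
from pathlib import Path


def results_stem(calling_script, length, evals, nworkers):
    """Build a name following the documented pattern (hypothetical helper)."""
    script_name = Path(calling_script).stem  # drop any directory and the ".py" suffix
    return f"{script_name}_results_History_length={length}_evals={evals}_ranks={nworkers}"


print(results_stem("run_sine.py", length=80, evals=80, nworkers=4))
# prints: run_sine_results_History_length=80_evals=80_ranks=4
```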
The libE module is the outer libEnsemble routine.
This module sets up the manager and the team of workers, configured according
to the contents of libE_specs. The manager/worker communications scheme used
in libEnsemble is parsed from the comms key if present, with valid values
being mpi, local (for multiprocessing), or tcp. MPI is the default; if a
communicator is specified, each call to this module will initiate
manager/worker communications on a duplicate of that communicator. Otherwise,
a duplicate of COMM_WORLD will be used.
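The selection rule described above can be sketched as follows. resolve_comms is a hypothetical helper that mirrors the prose (use the comms key if present, default to MPI, reject anything else); it is not libEnsemble's implementation. The nworkers key is shown only as a typical companion setting for local comms.

```python
VALID_COMMS = ("mpi", "local", "tcp")


def resolve_comms(libE_specs):
    """Return the comms scheme named in libE_specs, defaulting to 'mpi'."""
    comms = libE_specs.get("comms", "mpi")
    if comms not in VALID_COMMS:
        raise ValueError(f"comms must be one of {VALID_COMMS}, got {comms!r}")
    return comms


print(resolve_comms({}))  # mpi
print(resolve_comms({"comms": "local", "nworkers": 4}))  # local
```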
In the vast majority of cases, programming with libEnsemble involves the creation
of a calling script, a Python file where libEnsemble is parameterized via
the various specification dictionaries (e.g. libE_specs, sim_specs, and
gen_specs). The outer libEnsemble routine libE() is imported and called with
such dictionaries to initiate libEnsemble. A simple calling script
(from the first tutorial) may resemble:
import numpy as np
from libensemble.libE import libE
from generator import gen_random_sample
from simulator import sim_find_sine
from libensemble.tools import add_unique_random_streams, parse_args

# Read nworkers, comms, and related options from the command line
nworkers, is_manager, libE_specs, _ = parse_args()

libE_specs["save_every_k_gens"] = 20

gen_specs = {
    "gen_f": gen_random_sample,
    "out": [("x", float, (1,))],
    "user": {"lower": np.array([-3]), "upper": np.array([3]), "gen_batch_size": 5},
}

sim_specs = {"sim_f": sim_find_sine, "in": ["x"], "out": [("y", float)]}

# One random stream for the manager plus one per worker
persis_info = add_unique_random_streams({}, nworkers + 1)

exit_criteria = {"sim_max": 80}

H, persis_info, flag = libE(sim_specs, gen_specs, exit_criteria, persis_info, libE_specs=libE_specs)
This will initiate libEnsemble with a manager and nworkers workers (parsed from
the command line), and runs on laptops or supercomputers. If an exception is
encountered by the manager or workers, the history array is dumped to file, and
MPI abort is called.
On macOS (since Python 3.8) and Windows, the default multiprocessing start method is "spawn",
and you must place most calling script code (or just libE() / Ensemble().run()
at a minimum) in an if __name__ == "__main__": block.
Therefore a calling script that is universal across all platforms and comms-types may resemble:
import numpy as np
from libensemble.libE import libE
from generator import gen_random_sample
from simulator import sim_find_sine
from libensemble.tools import add_unique_random_streams, parse_args

if __name__ == "__main__":

    # Read nworkers, comms, and related options from the command line
    nworkers, is_manager, libE_specs, _ = parse_args()

    libE_specs["save_every_k_gens"] = 20

    gen_specs = {
        "gen_f": gen_random_sample,
        "out": [("x", float, (1,))],
        "user": {
            "lower": np.array([-3]),
            "upper": np.array([3]),
            "gen_batch_size": 5,
        },
    }

    sim_specs = {
        "sim_f": sim_find_sine,
        "in": ["x"],
        "out": [("y", float)],
    }

    # One random stream for the manager plus one per worker
    persis_info = add_unique_random_streams({}, nworkers + 1)

    exit_criteria = {"sim_max": 80}

    H, persis_info, flag = libE(sim_specs, gen_specs, exit_criteria, persis_info, libE_specs=libE_specs)
Alternatively, you may set the multiprocessing start method to "fork" via the following:
from multiprocessing import set_start_method

set_start_method("fork")
But note that this is incompatible with some libraries.
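Because "fork" is not offered on every platform (it is unavailable on Windows), a script can check before requesting it. The sketch below uses only the standard multiprocessing module; choose_start_method is a hypothetical helper, and whether "fork" is safe still depends on the libraries in use. It builds a context object with get_context() rather than calling set_start_method(), so the interpreter-wide default is left untouched.

```python
import multiprocessing


def choose_start_method(preferred="fork"):
    """Return preferred if this platform offers it, else the platform default."""
    available = multiprocessing.get_all_start_methods()
    if preferred in available:
        return preferred
    return available[0]  # the first entry is the platform default


# A context object carries its own start method without changing global state.
ctx = multiprocessing.get_context(choose_start_method("fork"))
print("Using start method:", ctx.get_start_method())
```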
See below for the complete traditional API.
- libensemble.libE.libE(sim_specs, gen_specs, exit_criteria, persis_info={}, alloc_specs=AllocSpecs(alloc_f=<function give_sim_work_first>, user={'num_active_gens': 1}, out=[]), libE_specs={}, H0=None)
- Parameters:
sim_specs (dict or SimSpecs) – Specifications for the simulation function
gen_specs (dict or GenSpecs, optional) – Specifications for the generator function
exit_criteria (dict or ExitCriteria, optional) – Tell libEnsemble when to stop a run
persis_info (dict, optional) – Persistent information to be passed between user functions
alloc_specs (dict or AllocSpecs, optional) – Specifications for the allocation function
libE_specs (dict or LibeSpecs, optional) – Specifications for libEnsemble
H0 (NumPy structured array, optional) – A libEnsemble history to be prepended to this run’s history
- Returns:
H (NumPy structured array) – History array storing rows for each point.
persis_info (dict) – Final state of persistent information
exit_flag (int) – Flag containing final task status
0 = No errors
1 = Exception occurred
2 = Manager timed out and ended simulation
3 = Current process is not in libEnsemble MPI communicator
- Return type:
(numpy.ndarray, dict, int)