Running an Ensemble

libEnsemble offers two approaches to running an ensemble. We recommend the newer class-based approach, but will continue to support the classic libE() approach for backward compatibility.

class libensemble.ensemble.Ensemble

The primary class for a libEnsemble workflow.

Example
import numpy as np

from libensemble import Ensemble, SimSpecs, GenSpecs, ExitCriteria
from my_simulator import beamline
from someones_optimizer import optimize

experiment = Ensemble()
experiment.sim_specs = SimSpecs(sim_f=beamline, inputs=["x"], out=[("f", float)])
experiment.gen_specs = GenSpecs(
    gen_f=optimize,
    inputs=["f"],
    out=[("x", float, (1,))],
    user={
        "lb": np.array([-3]),
        "ub": np.array([3]),
    },
)

experiment.exit_criteria = ExitCriteria(gen_max=101)
results = experiment.run()

Parses --comms, --nworkers, and other options from the command line, validates inputs, configures logging, and performs other preparations.

Call .run() on the class to start the workflow.

Configure by:

Option 1: Providing parameters on instantiation
from libensemble import Ensemble
from my_simulator import sim_find_energy

sim_specs = {
    "sim_f": sim_find_energy,
    "in": ["x"],
    "out": [("y", float)],
}

experiment = Ensemble(sim_specs=sim_specs)
Option 2: Assigning parameters to an instance
from libensemble import Ensemble, SimSpecs
from my_simulator import sim_find_energy

sim_specs = SimSpecs(
    sim_f=sim_find_energy,
    inputs=["x"],
    out=[("y", float)],
)

experiment = Ensemble()
experiment.sim_specs = sim_specs
Option 3: Loading parameters from files
from libensemble import Ensemble

experiment = Ensemble()

experiment.from_yaml("my_parameters.yaml")
# or...
experiment.from_toml("my_parameters.toml")
# or...
experiment.from_json("my_parameters.json")
my_parameters.yaml:
libE_specs:
    save_every_k_gens: 20

exit_criteria:
    sim_max: 80

gen_specs:
    gen_f: generator.gen_random_sample
    outputs:
        x:
            type: float
            size: 1
    user:
        gen_batch_size: 5

sim_specs:
    sim_f: simulator.sim_find_sine
    inputs:
        - x
    outputs:
        y:
            type: float
my_parameters.toml:
[libE_specs]
    save_every_k_gens = 300

[exit_criteria]
    sim_max = 80

[gen_specs]
    gen_f = "generator.gen_random_sample"
    [gen_specs.out]
        [gen_specs.out.x]
            type = "float"
            size = 1
    [gen_specs.user]
        gen_batch_size = 5

[sim_specs]
    sim_f = "simulator.sim_find_sine"
    inputs = ["x"]
    [sim_specs.out]
        [sim_specs.out.y]
            type = "float"
my_parameters.json:
{
    "libE_specs": {
        "save_every_k_gens": 300
    },
    "exit_criteria": {
        "sim_max": 80
    },
    "gen_specs": {
        "gen_f": "generator.gen_random_sample",
        "out": {
            "x": {
                "type": "float",
                "size": 1
            }
        },
        "user": {
            "gen_batch_size": 5
        }
    },
    "sim_specs": {
        "sim_f": "simulator.sim_find_sine",
        "inputs": ["x"],
        "out": {
            "f": {"type": "float"}
        }
    }
}
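
JSON is strict about syntax (for example, it does not allow trailing commas), so one option is to generate the parameter file from a Python dictionary rather than writing it by hand. The following is a minimal sketch using only the standard library; the helper name write_parameters and the temporary output directory are illustrative assumptions, not part of libEnsemble.

```python
import json
import tempfile
from pathlib import Path

# Same structure as the JSON parameter file shown above.
parameters = {
    "libE_specs": {"save_every_k_gens": 300},
    "exit_criteria": {"sim_max": 80},
    "gen_specs": {
        "gen_f": "generator.gen_random_sample",
        "out": {"x": {"type": "float", "size": 1}},
        "user": {"gen_batch_size": 5},
    },
    "sim_specs": {
        "sim_f": "simulator.sim_find_sine",
        "inputs": ["x"],
        "out": {"f": {"type": "float"}},
    },
}


def write_parameters(path, params):
    """Hypothetical helper: serialize params as valid, indented JSON."""
    path = Path(path)
    path.write_text(json.dumps(params, indent=4) + "\n")
    return path


# Write to a temporary directory and read back to confirm the file parses.
out_dir = Path(tempfile.mkdtemp())
param_file = write_parameters(out_dir / "my_parameters.json", parameters)
reloaded = json.loads(param_file.read_text())
```

The resulting file path could then be passed to from_json().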

After calling .run(), the final states of H, persis_info, and a flag are made available.

Parameters:
  • sim_specs (dict or SimSpecs) – Specifications for the simulation function

  • gen_specs (dict or GenSpecs, optional) – Specifications for the generator function

  • exit_criteria (dict or ExitCriteria, optional) – Tell libEnsemble when to stop a run

  • persis_info (dict, optional) – Persistent information to be passed between user functions (example)

  • alloc_specs (dict or AllocSpecs, optional) – Specifications for the allocation function

  • libE_specs (dict or LibeSpecs, optional) – Specifications for libEnsemble

  • H0 (NumPy structured array, optional) – A libEnsemble history to be prepended to this run’s history (example)
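
Both H0 and the returned history are NumPy structured arrays, and the "out" entries in sim_specs and gen_specs use NumPy's dtype tuple notation. The sketch below only illustrates how such entries combine into a structured dtype and how fields are accessed by name; the values are made up, and a real H0 may need further fields that libEnsemble expects, which this example does not cover.

```python
import numpy as np

# Field definitions in the same notation as the "out" entries above.
gen_out = [("x", float, (1,))]  # one-element vector per row
sim_out = [("y", float)]        # scalar per row

# Build a small structured array with both fields (illustrative only).
history = np.zeros(3, dtype=gen_out + sim_out)
history["x"] = np.array([[-1.0], [0.0], [1.0]])
history["y"] = np.sin(history["x"][:, 0])

# Fields are accessed by name; rows are accessed by index.
x_column = history["x"]   # shape (3, 1)
first_row = history[0]    # a single record holding x and y
```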

ready()

Quickly verifies that all necessary data has been provided.

Return type:

bool

run()

Initializes libEnsemble.

MPI/comms Notes

Manager-worker intercommunications are parsed from the comms key of libE_specs. An MPI runtime is assumed by default if --comms local wasn’t specified on the command line or in libE_specs.

If an MPI communicator was provided in libE_specs, then each .run() call will initiate intercommunications on a duplicate of that communicator. Otherwise, a duplicate of COMM_WORLD will be used.

Returns:

  • H (NumPy structured array) – History array storing rows for each point. (example)

  • persis_info (dict) – Final state of persistent information (example)

  • exit_flag (int) – Flag containing final task status

    0 = No errors
    1 = Exception occurred
    2 = Manager timed out and ended simulation
    3 = Current process is not in libEnsemble MPI communicator
    

Return type:

(numpy.ndarray, dict, int)
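
A calling script will often want to check the returned flag before post-processing results. The following sketch maps the documented flag values to their descriptions; the names EXIT_FLAG_MEANINGS, describe_exit_flag, and run_succeeded are hypothetical helpers for illustration, not part of libEnsemble.

```python
# Flag values and descriptions as listed in the Returns section above.
EXIT_FLAG_MEANINGS = {
    0: "No errors",
    1: "Exception occurred",
    2: "Manager timed out and ended simulation",
    3: "Current process is not in libEnsemble MPI communicator",
}


def describe_exit_flag(exit_flag):
    """Return a human-readable description of an exit flag."""
    return EXIT_FLAG_MEANINGS.get(exit_flag, f"Unknown exit flag: {exit_flag}")


def run_succeeded(exit_flag):
    """Only a flag of 0 indicates a run that finished without errors."""
    return exit_flag == 0
```

For example, after H, persis_info, flag = experiment.run(), a script might log describe_exit_flag(flag) and skip analysis unless run_succeeded(flag) is true.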

from_yaml(file_path)

Parameterizes libEnsemble from a YAML file.

Parameters:

file_path (str) – Path to the YAML file

from_toml(file_path)

Parameterizes libEnsemble from a TOML file.

Parameters:

file_path (str) – Path to the TOML file

from_json(file_path)

Parameterizes libEnsemble from a JSON file.

Parameters:

file_path (str) – Path to the JSON file

add_random_streams(num_streams=0, seed='')

Adds np.random generators for each worker to persis_info.

Parameters:
  • num_streams (int) –

  • seed (str) –
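
To make the idea of per-worker random generators concrete, the sketch below builds a persis_info-style dictionary with one independent NumPy generator per stream. It is an illustration of the concept, not libEnsemble's implementation: the function name make_random_streams, the use of SeedSequence.spawn, and the "rand_stream" and "worker_num" key names are assumptions made for this example.

```python
import numpy as np


def make_random_streams(num_streams, seed=None):
    """Illustrative only: one independent generator per stream index."""
    # Spawning child seeds gives statistically independent streams
    # that are still reproducible from a single top-level seed.
    child_seeds = np.random.SeedSequence(seed).spawn(num_streams)
    return {
        i: {"rand_stream": np.random.default_rng(child), "worker_num": i}
        for i, child in enumerate(child_seeds)
    }


# Three streams from one seed; a user function would draw from its own stream.
streams = make_random_streams(3, seed=1234)
sample = streams[1]["rand_stream"].uniform(-3, 3, size=5)
```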

save_output(file)

Writes the History array and persis_info to files. If a workflow_dir is in use, the files are placed in that directory under the specified filename.

Format: <calling_script>_results_History_length=<length>_evals=<Completed evals>_ranks=<nworkers>

Parameters:

file (str) –
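
The naming pattern above can be sketched as a small formatting function. This only shows how the documented pieces fit together; the function name results_basename is hypothetical, and how libEnsemble derives each value from the file argument (or which file extensions it appends) is not specified here.

```python
def results_basename(calling_script, history_length, completed_evals, nworkers):
    """Assemble a name following the documented Format pattern."""
    return (
        f"{calling_script}_results_History_length={history_length}"
        f"_evals={completed_evals}_ranks={nworkers}"
    )


# Example with made-up values: 80 history rows, 80 completed evaluations, 4 workers.
name = results_basename("run_experiment", 80, 80, 4)
```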

The libE module is the outer libEnsemble routine.

This module sets up the manager and the team of workers, configured according to the contents of libE_specs. The manager/worker communications scheme used in libEnsemble is parsed from the comms key if present, with valid values being mpi, local (for multiprocessing), or tcp. MPI is the default; if a communicator is specified, each call to this module will initiate manager/worker communications on a duplicate of that communicator. Otherwise, a duplicate of COMM_WORLD will be used.
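
The comms selection described above can be pictured with a short sketch. This is not libEnsemble's parsing code; the helper resolve_comms and the constant VALID_COMMS are hypothetical names that simply encode the stated rule: use the comms key if present, default to MPI otherwise, and accept only mpi, local, or tcp.

```python
VALID_COMMS = ("mpi", "local", "tcp")


def resolve_comms(libE_specs):
    """Return the comms scheme implied by a libE_specs-style dictionary."""
    comms = libE_specs.get("comms", "mpi")  # MPI is the documented default
    if comms not in VALID_COMMS:
        raise ValueError(f"comms must be one of {VALID_COMMS}, got {comms!r}")
    return comms


default_scheme = resolve_comms({})                  # no comms key given
local_scheme = resolve_comms({"comms": "local"})    # multiprocessing
```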

In the vast majority of cases, programming with libEnsemble involves the creation of a calling script, a Python file where libEnsemble is parameterized via the various specification dictionaries (e.g. libE_specs, sim_specs, and gen_specs). The outer libEnsemble routine libE() is imported and called with such dictionaries to initiate libEnsemble. A simple calling script (from the first tutorial) may resemble:

import numpy as np
from libensemble.libE import libE
from generator import gen_random_sample
from simulator import sim_find_sine
from libensemble.tools import parse_args, add_unique_random_streams

# Parse --comms, --nworkers, and related options from the command line
nworkers, is_manager, libE_specs, _ = parse_args()

libE_specs["save_every_k_gens"] = 20

gen_specs = {
    "gen_f": gen_random_sample,
    "out": [("x", float, (1,))],
    "user": {"lower": np.array([-3]), "upper": np.array([3]), "gen_batch_size": 5},
}

sim_specs = {"sim_f": sim_find_sine, "in": ["x"], "out": [("y", float)]}

persis_info = add_unique_random_streams({}, nworkers + 1)

exit_criteria = {"sim_max": 80}

H, persis_info, flag = libE(sim_specs, gen_specs, exit_criteria, persis_info, libE_specs=libE_specs)

This will initiate libEnsemble with a manager and nworkers workers (parsed from the command line), and it runs the same way on laptops and supercomputers. If the manager or a worker encounters an exception, the history array is dumped to a file and MPI abort is called.

On macOS (since Python 3.8) and Windows, the default multiprocessing start method is "spawn", so you must place most calling-script code (or, at a minimum, the libE() / Ensemble().run() call) inside an if __name__ == "__main__": block.

Therefore a calling script that is universal across all platforms and comms-types may resemble:

import numpy as np
from libensemble.libE import libE
from generator import gen_random_sample
from simulator import sim_find_sine
from libensemble.tools import parse_args, add_unique_random_streams

if __name__ == "__main__":

    # Parse --comms, --nworkers, and related options from the command line
    nworkers, is_manager, libE_specs, _ = parse_args()

    libE_specs["save_every_k_gens"] = 20

    gen_specs = {
        "gen_f": gen_random_sample,
        "out": [("x", float, (1,))],
        "user": {
            "lower": np.array([-3]),
            "upper": np.array([3]),
            "gen_batch_size": 5,
        },
    }

    sim_specs = {
        "sim_f": sim_find_sine,
        "in": ["x"],
        "out": [("y", float)],
    }

    persis_info = add_unique_random_streams({}, nworkers + 1)

    exit_criteria = {"sim_max": 80}

    H, persis_info, flag = libE(sim_specs, gen_specs, exit_criteria, persis_info, libE_specs=libE_specs)

Alternatively, you may set the multiprocessing start method to "fork" via the following:

from multiprocessing import set_start_method

set_start_method("fork")

But note that this is incompatible with some libraries.
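
Because "fork" is not available on every platform (Windows does not support it), a portable script may want to check before requesting it. The sketch below uses only the standard multiprocessing module; the helper name choose_start_method is hypothetical, and falling back to the platform default is one possible design choice rather than a libEnsemble recommendation.

```python
import multiprocessing


def choose_start_method(preferred="fork"):
    """Return preferred if the platform supports it, else the platform default."""
    available = multiprocessing.get_all_start_methods()
    # The first entry of get_all_start_methods() is the platform default.
    return preferred if preferred in available else available[0]


if __name__ == "__main__":
    # Set the start method once, before any worker processes are created.
    multiprocessing.set_start_method(choose_start_method("fork"), force=True)
    # ... build the specification dictionaries and call libE() here ...
```

Keeping the call inside the if __name__ == "__main__": block means the same script still works when the fallback turns out to be "spawn".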

See below for the complete traditional API.

libensemble.libE.libE(sim_specs, gen_specs, exit_criteria, persis_info={}, alloc_specs=AllocSpecs(alloc_f=<function give_sim_work_first>, user={'num_active_gens': 1}, out=[]), libE_specs={}, H0=None)
Parameters:
  • sim_specs (dict or SimSpecs) – Specifications for the simulation function (example)

  • gen_specs (dict or GenSpecs, optional) – Specifications for the generator function (example)

  • exit_criteria (dict or ExitCriteria, optional) – Tell libEnsemble when to stop a run (example)

  • persis_info (dict, optional) – Persistent information to be passed between user functions (example)

  • alloc_specs (dict or AllocSpecs, optional) – Specifications for the allocation function (example)

  • libE_specs (dict or LibeSpecs, optional) – Specifications for libEnsemble (example)

  • H0 (NumPy structured array, optional) – A libEnsemble history to be prepended to this run’s history (example)

Returns:

  • H (NumPy structured array) – History array storing rows for each point. (example)

  • persis_info (dict) – Final state of persistent information (example)

  • exit_flag (int) – Flag containing final task status

    0 = No errors
    1 = Exception occurred
    2 = Manager timed out and ended simulation
    3 = Current process is not in libEnsemble MPI communicator
    

Return type:

(numpy.ndarray, dict, int)