Running an Ensemble

libEnsemble offers two ways to run an ensemble: the newer Ensemble class, which we recommend, and the libE() function, which remains supported for backward compatibility.

class libensemble.ensemble.Ensemble

The primary object for a libEnsemble workflow. Parses and validates settings, sets up logging, and maintains output.

Example
import numpy as np

from libensemble import Ensemble
from libensemble.gen_funcs.sampling import latin_hypercube_sample
from libensemble.sim_funcs.simple_sim import norm_eval
from libensemble.specs import ExitCriteria, GenSpecs, LibeSpecs, SimSpecs

libE_specs = LibeSpecs(nworkers=4)
sampling = Ensemble(libE_specs=libE_specs)
sampling.sim_specs = SimSpecs(
    sim_f=norm_eval,
    inputs=["x"],
    outputs=[("f", float)],
)
sampling.gen_specs = GenSpecs(
    gen_f=latin_hypercube_sample,
    outputs=[("x", float, (1,))],
    user={
        "gen_batch_size": 50,
        "lb": np.array([-3]),
        "ub": np.array([3]),
    },
)

sampling.add_random_streams()
sampling.exit_criteria = ExitCriteria(sim_max=100)

if __name__ == "__main__":
    sampling.run()
    sampling.save_output(__file__)

Run the above example via python this_file.py.

Instead of using the libE_specs line, you can also use sampling = Ensemble(parse_args=True) and run via python this_file.py -n 4 (4 workers). The parse_args=True parameter instructs the Ensemble class to read command-line arguments.

Configure by:

Option 1: Providing parameters on instantiation

from libensemble import Ensemble
from my_simulator import sim_find_energy

sim_specs = {
    "sim_f": sim_find_energy,
    "in": ["x"],
    "out": [("y", float)],
}

experiment = Ensemble(sim_specs=sim_specs)

Option 2: Assigning parameters to an instance

from libensemble import Ensemble, SimSpecs
from my_simulator import sim_find_energy

sim_specs = SimSpecs(
    sim_f=sim_find_energy,
    inputs=["x"],
    outputs=[("y", float)],
)

experiment = Ensemble()
experiment.sim_specs = sim_specs

Option 3: Loading parameters from files

from libensemble import Ensemble

experiment = Ensemble()

experiment.from_yaml("my_parameters.yaml")
# or...
experiment.from_toml("my_parameters.toml")
# or...
experiment.from_json("my_parameters.json")

my_parameters.yaml:

libE_specs:
    save_every_k_gens: 20

exit_criteria:
    sim_max: 80

gen_specs:
    gen_f: generator.gen_random_sample
    outputs:
        x:
            type: float
            size: 1
    user:
        gen_batch_size: 5

sim_specs:
    sim_f: simulator.sim_find_sine
    inputs:
        - x
    outputs:
        y:
            type: float

my_parameters.toml:

[libE_specs]
    save_every_k_gens = 300

[exit_criteria]
    sim_max = 80

[gen_specs]
    gen_f = "generator.gen_random_sample"
    [gen_specs.outputs]
        [gen_specs.outputs.x]
            type = "float"
            size = 1
    [gen_specs.user]
        gen_batch_size = 5

[sim_specs]
    sim_f = "simulator.sim_find_sine"
    inputs = ["x"]
    [sim_specs.outputs]
        [sim_specs.outputs.y]
            type = "float"

my_parameters.json:

{
    "libE_specs": {
        "save_every_k_gens": 300
    },
    "exit_criteria": {
        "sim_max": 80
    },
    "gen_specs": {
        "gen_f": "generator.gen_random_sample",
        "outputs": {
            "x": {
                "type": "float",
                "size": 1
            }
        },
        "user": {
            "gen_batch_size": 5
        }
    },
    "sim_specs": {
        "sim_f": "simulator.sim_find_sine",
        "inputs": ["x"],
        "outputs": {
            "f": {"type": "float"}
        }
    }
}

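
The parameter files above name functions as dotted strings and describe output fields as nested mappings, whereas the Python examples pass function objects and tuples such as ("x", float, (1,)). This page does not show how libEnsemble performs that conversion, so the following is only a minimal sketch of one plausible approach. The helpers resolve_dotted and outputs_to_fields are hypothetical names, and "math.sqrt" stands in for a user function such as generator.gen_random_sample so that the snippet runs on its own.

```python
import builtins
import importlib
import json


def resolve_dotted(path):
    """Import 'module.name' and return the named attribute (hypothetical helper)."""
    module_name, _, attr = path.rpartition(".")
    return getattr(importlib.import_module(module_name), attr)


def outputs_to_fields(outputs):
    """Turn {"x": {"type": "float", "size": 1}} into [("x", float, (1,))].

    Assumption: "type" names a Python builtin and "size" is a 1-D length.
    """
    fields = []
    for name, spec in outputs.items():
        dtype = getattr(builtins, spec["type"])  # e.g. "float" -> float
        if "size" in spec:
            fields.append((name, dtype, (spec["size"],)))
        else:
            fields.append((name, dtype))
    return fields


# "math.sqrt" is a stand-in for a user-supplied generator function.
config = json.loads(
    """
    {
        "gen_specs": {
            "gen_f": "math.sqrt",
            "outputs": {"x": {"type": "float", "size": 1}}
        }
    }
    """
)

gen_f = resolve_dotted(config["gen_specs"]["gen_f"])
fields = outputs_to_fields(config["gen_specs"]["outputs"])
```

Because json.loads rejects trailing commas, a parameter file with a stray comma fails at this first step, before any function is resolved.
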
Parameters:
  • sim_specs (dict or SimSpecs) – Specifications for the simulation function

  • gen_specs (dict or GenSpecs, Optional) – Specifications for the generator function

  • exit_criteria (dict or ExitCriteria, Optional) – Tell libEnsemble when to stop a run

  • libE_specs (dict or LibeSpecs, Optional) – Specifications for libEnsemble

  • alloc_specs (dict or AllocSpecs, Optional) – Specifications for the allocation function

  • persis_info (dict, Optional) – Persistent information to be passed between user function instances (example)

  • executor (Executor, Optional) – libEnsemble Executor instance for use within simulation or generator functions

  • H0 (NumPy structured array, Optional) – A libEnsemble history to be prepended to this run’s history (example)

  • parse_args (bool, Optional) – Read nworkers, comms, and other arguments from the command-line. For MPI, calculate nworkers and set the is_manager Boolean attribute on MPI rank 0. See the parse_args docs for more information.

ready()

Quickly verify that all necessary data has been provided

Return type:

bool

run()

Initializes libEnsemble.

MPI/comms Notes

Manager–worker intercommunications are parsed from the comms key of libE_specs. An MPI runtime is assumed by default unless local comms are requested, either with --comms local on the command line or in libE_specs.

If an MPI communicator was provided in libE_specs, then each .run() call will initiate intercommunications on a duplicate of that communicator. Otherwise, a duplicate of COMM_WORLD will be used.

Returns:

  • H (NumPy structured array) – History array storing rows for each point. (example)

  • persis_info (dict) – Final state of persistent information (example)

  • exit_flag (int) – Flag containing final task status

    0 = No errors
    1 = Exception occurred
    2 = Manager timed out and ended simulation
    3 = Current process is not in libEnsemble MPI communicator
    

Return type:

(numpy.ndarray, dict, int)
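
A calling script will often branch on the returned exit_flag. The sketch below maps the four documented values to readable names and messages. The ExitFlag enum and describe_exit_flag helper are hypothetical and not part of libEnsemble. Only the integer values and their meanings come from the list above.

```python
from enum import IntEnum


class ExitFlag(IntEnum):
    """Hypothetical names for the documented exit_flag values."""

    NO_ERRORS = 0
    EXCEPTION = 1
    MANAGER_TIMEOUT = 2
    NOT_IN_COMMUNICATOR = 3


_MESSAGES = {
    ExitFlag.NO_ERRORS: "No errors",
    ExitFlag.EXCEPTION: "Exception occurred",
    ExitFlag.MANAGER_TIMEOUT: "Manager timed out and ended simulation",
    ExitFlag.NOT_IN_COMMUNICATOR: "Current process is not in libEnsemble MPI communicator",
}


def describe_exit_flag(flag):
    """Return the documented meaning of an exit flag, or a fallback message."""
    try:
        return _MESSAGES[ExitFlag(flag)]
    except ValueError:
        # Values outside 0..3 are not documented on this page.
        return f"Unknown exit flag: {flag}"
```

For example, after H, persis_info, flag = sampling.run(), a script could print describe_exit_flag(flag) before deciding whether to save output.
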

from_yaml(file_path)

Parameterizes libEnsemble from a YAML file.

Parameters:

file_path (str)

from_toml(file_path)

Parameterizes libEnsemble from a TOML file.

Parameters:

file_path (str)

from_json(file_path)

Parameterizes libEnsemble from a JSON file.

Parameters:

file_path (str)

add_random_streams(num_streams=0, seed='')

Adds np.random generators for each worker ID to self.persis_info.

Parameters:
  • num_streams (int, Optional) – Number of matching worker ID and random stream entries to create. Defaults to self.nworkers.

  • seed (str, Optional) – Seed for NumPy’s RNG.
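
To illustrate the idea of one reproducible random stream per worker ID, here is a minimal stand-alone sketch. It is not libEnsemble's implementation: it uses the standard library's random.Random in place of NumPy generators so that it runs without dependencies. The helper name make_worker_streams and the "rand_stream" and "worker_num" keys are assumptions, as is the choice to include ID 0, which mirrors the nworkers + 1 argument used in the traditional calling script on this page.

```python
import random


def make_worker_streams(nworkers, seed=""):
    """Build one independent, reproducible generator per ID (hypothetical helper).

    IDs run from 0 to nworkers inclusive, giving nworkers + 1 entries.
    """
    persis_info = {}
    for worker_id in range(nworkers + 1):
        # Mixing the ID into the seed gives each entry a distinct stream
        # that is nevertheless identical from run to run.
        stream_seed = f"{seed}-{worker_id}"
        persis_info[worker_id] = {
            "rand_stream": random.Random(stream_seed),
            "worker_num": worker_id,
        }
    return persis_info
```

A user function would then draw from persis_info[worker_id]["rand_stream"] rather than from a shared global generator, so results do not depend on the order in which workers happen to run.
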

save_output(file)

Writes the History array and persis_info to files. If a workflow_dir is in use, the files are placed in that directory under the specified filename.

Format: <calling_script>_results_History_length=<length>_evals=<Completed evals>_ranks=<nworkers>

Parameters:

file (str)
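
The filename format above can be spelled out with a small sketch. The helper output_basename is hypothetical, and it assumes that <calling_script> means the script's name without its directory or .py extension. File extensions for the written files are not stated on this page and are left out.

```python
from pathlib import Path


def output_basename(calling_script, history_length, completed_evals, nworkers):
    """Assemble the documented results name (hypothetical helper).

    Assumption: the calling script contributes only its stem, e.g.
    "/path/to/this_file.py" -> "this_file".
    """
    script_name = Path(calling_script).stem
    return (
        f"{script_name}_results_History"
        f"_length={history_length}"
        f"_evals={completed_evals}"
        f"_ranks={nworkers}"
    )
```

Under these assumptions, a run of this_file.py with 4 workers and 100 completed evaluations in a 100-row history would produce a name beginning with this_file_results_History_length=100_evals=100_ranks=4.
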

The libE module is the outer libEnsemble routine.

This module sets up the manager and the team of workers, configured according to the contents of libE_specs. The manager/worker communications scheme used in libEnsemble is parsed from the comms key if present, with valid values being mpi, local (for multiprocessing), or tcp.

MPI is the default if nworkers is not given. However, if libE_specs["nworkers"] is specified, then local comms will be used unless a parallel MPI environment is detected.

For mpi comms, if a communicator is specified, each call to this module will initiate manager/worker communications on a duplicate of that communicator. Otherwise, a duplicate of COMM_WORLD will be used.
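
The selection rules described above can be summarized as a small decision function. This is a sketch of the documented behavior only, not libEnsemble's code: choose_comms is a hypothetical name, and mpi_env_detected is a placeholder flag for whatever check libEnsemble uses to detect a parallel MPI environment.

```python
VALID_COMMS = ("mpi", "local", "tcp")


def choose_comms(libE_specs, mpi_env_detected=False):
    """Pick a comms scheme following the rules stated in the text (sketch)."""
    # An explicit comms key always wins, provided it is a valid value.
    if "comms" in libE_specs:
        comms = libE_specs["comms"]
        if comms not in VALID_COMMS:
            raise ValueError(f"comms must be one of {VALID_COMMS}, got {comms!r}")
        return comms
    # With nworkers given and no MPI environment, fall back to local comms.
    if "nworkers" in libE_specs and not mpi_env_detected:
        return "local"
    # Otherwise MPI is the default.
    return "mpi"
```

For instance, choose_comms({"nworkers": 4}) yields "local", while the same specs launched under a detected MPI environment yield "mpi".
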

In the vast majority of cases, programming with libEnsemble involves the creation of a calling script, a Python file where libEnsemble is parameterized via the various specification dictionaries (e.g. libE_specs, sim_specs, and gen_specs). The outer libEnsemble routine libE() is imported and called with such dictionaries to initiate libEnsemble. A simple calling script (from the first tutorial) may resemble:

import numpy as np
from libensemble.libE import libE
from generator import gen_random_sample
from simulator import sim_find_sine
from libensemble.tools import add_unique_random_streams, parse_args

# Read nworkers, comms, and related settings from the command line
nworkers, is_manager, libE_specs, _ = parse_args()

libE_specs["save_every_k_gens"] = 20

gen_specs = {
    "gen_f": gen_random_sample,
    "out": [("x", float, (1,))],
    "user": {"lower": np.array([-3]), "upper": np.array([3]), "gen_batch_size": 5},
}

sim_specs = {"sim_f": sim_find_sine, "in": ["x"], "out": [("y", float)]}

persis_info = add_unique_random_streams({}, nworkers + 1)

exit_criteria = {"sim_max": 80}

H, persis_info, flag = libE(sim_specs, gen_specs, exit_criteria, persis_info, libE_specs=libE_specs)

This initiates libEnsemble with a manager and nworkers workers (parsed from the command line) and runs on laptops and supercomputers alike. If the manager or a worker encounters an exception, the history array is dumped to file and MPI abort is called.

On macOS (since Python 3.8) and Windows, the default multiprocessing start method is "spawn", so you must place most calling script code (or at a minimum the libE() / Ensemble().run() call) in an if __name__ == "__main__": block.

Therefore a calling script that is universal across all platforms and comms-types may resemble:

import numpy as np
from libensemble.libE import libE
from generator import gen_random_sample
from simulator import sim_find_sine
from libensemble.tools import add_unique_random_streams, parse_args

if __name__ == "__main__":
    # Read nworkers, comms, and related settings from the command line
    nworkers, is_manager, libE_specs, _ = parse_args()

    libE_specs["save_every_k_gens"] = 20

    gen_specs = {
        "gen_f": gen_random_sample,
        "out": [("x", float, (1,))],
        "user": {
            "lower": np.array([-3]),
            "upper": np.array([3]),
            "gen_batch_size": 5,
        },
    }

    sim_specs = {
        "sim_f": sim_find_sine,
        "in": ["x"],
        "out": [("y", float)],
    }

    persis_info = add_unique_random_streams({}, nworkers + 1)

    exit_criteria = {"sim_max": 80}

    H, persis_info, flag = libE(sim_specs, gen_specs, exit_criteria, persis_info, libE_specs=libE_specs)

Alternatively, you may set the multiprocessing start method to "fork" via the following:

from multiprocessing import set_start_method

set_start_method("fork")

Note, however, that the "fork" start method is incompatible with some libraries.
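
Since "fork" is not available on every platform (Windows does not offer it), a calling script shared across systems may want to request it only where it exists. The following is a minimal sketch using only the standard multiprocessing module. The helper name prefer_fork is hypothetical.

```python
import multiprocessing


def prefer_fork():
    """Select "fork" when the platform offers it and return the method in effect."""
    if "fork" in multiprocessing.get_all_start_methods():
        # force=True avoids a RuntimeError if a start method was already set.
        multiprocessing.set_start_method("fork", force=True)
    return multiprocessing.get_start_method()
```

In a real calling script, prefer_fork() would be called once near the top of the if __name__ == "__main__": block, before libE() or Ensemble().run().
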

See below for the complete traditional API.

libensemble.libE.libE(sim_specs, gen_specs, exit_criteria, persis_info={}, alloc_specs=AllocSpecs(alloc_f=<function give_sim_work_first>, user={'num_active_gens': 1}, outputs=[]), libE_specs={}, H0=None)
Parameters:
  • sim_specs (dict or SimSpecs) – Specifications for the simulation function (example)

  • gen_specs (dict or GenSpecs, Optional) – Specifications for the generator function (example)

  • exit_criteria (dict or ExitCriteria, Optional) – Tell libEnsemble when to stop a run (example)

  • persis_info (dict, Optional) – Persistent information to be passed between user functions (example)

  • alloc_specs (dict or AllocSpecs, Optional) – Specifications for the allocation function (example)

  • libE_specs (dict or LibeSpecs, Optional) – Specifications for libEnsemble (example)

  • H0 (NumPy structured array, Optional) – A libEnsemble history to be prepended to this run’s history (example)

Returns:

  • H (NumPy structured array) – History array storing rows for each point. (example)

  • persis_info (dict) – Final state of persistent information (example)

  • exit_flag (int) – Flag containing final task status

    0 = No errors
    1 = Exception occurred
    2 = Manager timed out and ended simulation
    3 = Current process is not in libEnsemble MPI communicator
    

Return type:

(numpy.ndarray, dict, int)