Allocation Functions

Although the included allocation functions, or alloc_fs, are sufficient for most users, those who want to fine-tune how data or resources are allocated to their gen_f and sim_f can write their own. The alloc_f is unique in that it is called by libEnsemble’s manager instead of a worker.

Most alloc_f function definitions written by users resemble:

def my_allocator(W, H, sim_specs, gen_specs, alloc_specs, persis_info):

Where W is an array containing information about each worker’s state, and H is the trimmed History array, containing rows initialized by the generator.
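To make the two inputs concrete, the sketch below builds small stand-ins for W and H as NumPy structured arrays. The field names 'worker_id', 'active', and 'persis_state', and the History fields 'x' and 'f', are assumptions chosen for illustration; the arrays that libEnsemble passes to a real alloc_f may carry additional fields.

```python
import numpy as np

# Hypothetical, simplified worker array. The 'active' and 'persis_state'
# names follow the wording on this page; 'worker_id' is an assumed field name.
W = np.zeros(4, dtype=[("worker_id", int), ("active", int), ("persis_state", int)])
W["worker_id"] = np.arange(1, 5)
W["active"][1] = 1  # pretend worker 2 is currently busy

# Hypothetical History array with one input field 'x' and one output field 'f'.
H = np.zeros(3, dtype=[("x", float), ("f", float)])
H["x"] = [0.1, 0.5, 0.9]

# Boolean masks on a field select the matching rows of the structured array.
idle_ids = W["worker_id"][W["active"] == 0]
print(idle_ids.tolist())  # [1, 3, 4]
print(H["x"].tolist())    # [0.1, 0.5, 0.9]
```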

Inside an alloc_f, a Work dictionary is instantiated:

Work = {}

then populated with integer keys i for each worker and dictionary values to give to those workers. An example Work dictionary from a run of the test_1d_sampling.py regression test resembles:

{
    1: {
        'H_fields': ['x'],
        'persis_info': {'rand_stream': RandomState(...) at ..., 'worker_num': 1},
        'tag': 1,
        'libE_info': {'H_rows': array([368])}
    },

    2: {
        'H_fields': ['x'],
        'persis_info': {'rand_stream': RandomState(...) at ..., 'worker_num': 2},
        'tag': 1,
        'libE_info': {'H_rows': array([369])}
    },

    3: {
        'H_fields': ['x'],
        'persis_info': {'rand_stream': RandomState(...) at ..., 'worker_num': 3},
        'tag': 1,
        'libE_info': {'H_rows': array([370])}
    },

    4: {
        'H_fields': ['x'],
        'persis_info': {'rand_stream': RandomState(...) at ..., 'worker_num': 4},
        'tag': 1,
        'libE_info': {'H_rows': array([371])}
    }
}

Based on information from the API reference above, this Work dictionary instructs each of the four workers to call the sim_f with data from the 'x' field and the given 'H_rows' of the History array, and to also pass persis_info.
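As a reading aid, the following sketch walks over a two-worker excerpt of such a Work dictionary and summarizes each record. The helper name describe is hypothetical, and the elided 'rand_stream' entries are simply left out rather than reconstructed.

```python
import numpy as np

# Two-worker excerpt of the example above ('rand_stream' omitted).
Work = {
    1: {"H_fields": ["x"], "persis_info": {"worker_num": 1}, "tag": 1,
        "libE_info": {"H_rows": np.array([368])}},
    2: {"H_fields": ["x"], "persis_info": {"worker_num": 2}, "tag": 1,
        "libE_info": {"H_rows": np.array([369])}},
}


def describe(Work):
    """Return one summary line per worker ID (hypothetical helper)."""
    lines = []
    for worker_id, record in sorted(Work.items()):
        rows = record["libE_info"]["H_rows"].tolist()
        lines.append(
            f"worker {worker_id}: tag={record['tag']}, "
            f"fields={record['H_fields']}, rows={rows}"
        )
    return lines


summary = describe(Work)
for line in summary:
    print(line)
```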

Constructing these dictionaries and determining which workers are available to receive data is simplified by several functions available within the libensemble.tools.alloc_support module:

libensemble.tools.alloc_support.avail_worker_ids(W, persistent=None, active_recv=False)

Returns available workers (active == 0), as an array, filtered by persis_state.

Parameters
  • W – Worker array

  • persistent – Optional Boolean. If specified, also return workers with given persis_state.
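The filtering described above can be pictured with a small stand-in. This is one plausible reading of the description, not the library's code: the function name avail_worker_ids_sketch and the field names of the mock W are assumptions, and the active_recv argument is not modelled.

```python
import numpy as np


def avail_worker_ids_sketch(W, persistent=None):
    """Hypothetical stand-in: IDs of idle workers, optionally filtered
    by whether they are in a persistent state."""
    idle = W["active"] == 0
    if persistent is None:
        return W["worker_id"][idle]
    if persistent:
        return W["worker_id"][idle & (W["persis_state"] != 0)]
    return W["worker_id"][idle & (W["persis_state"] == 0)]


# Mock worker array with assumed field names.
W = np.zeros(4, dtype=[("worker_id", int), ("active", int), ("persis_state", int)])
W["worker_id"] = [1, 2, 3, 4]
W["active"] = [0, 1, 0, 0]        # worker 2 is busy
W["persis_state"] = [0, 0, 2, 0]  # worker 3 is idle but in a persistent state

all_idle = avail_worker_ids_sketch(W)
persistent_idle = avail_worker_ids_sketch(W, persistent=True)
plain_idle = avail_worker_ids_sketch(W, persistent=False)
print(all_idle.tolist(), persistent_idle.tolist(), plain_idle.tolist())
```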

Many alloc_f routines loop over the available workers returned by the above function to construct their Work dictionaries with the help of the following two functions.

libensemble.tools.alloc_support.sim_work(Work, i, H_fields, H_rows, persis_info, **libE_info)

Add a sim work record to the given Work dictionary.

Parameters
  • Work – Work dictionary to which the record is added

  • i – Worker ID.

  • H_fields – Which fields from H to send

  • persis_info – current persis_info dictionary

Returns

None

libensemble.tools.alloc_support.gen_work(Work, i, H_fields, H_rows, persis_info, **libE_info)

Add a gen work record to the given Work dictionary.

Parameters
  • Work – Work dictionary to which the record is added

  • i – Worker ID.

  • H_fields – Which fields from H to send

  • persis_info – current persis_info dictionary

Returns

None

Note that these two functions add an entry to the Work dictionary in place and return None; any additional keyword arguments are stored in that entry’s libE_info dictionary.
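The in-place behavior can be illustrated with simplified stand-ins. The names sim_work_sketch and gen_work_sketch are hypothetical. The sim tag value 1 matches the example Work dictionary above, the gen tag value 2 is an assumption, and persistent=True is only an illustrative extra keyword.

```python
SIM_TAG = 1  # matches 'tag': 1 in the example Work dictionary
GEN_TAG = 2  # assumed value, for illustration only


def sim_work_sketch(Work, i, H_fields, H_rows, persis_info, **libE_info):
    """Hypothetical stand-in: store a sim record under worker ID i."""
    libE_info["H_rows"] = H_rows
    Work[i] = {"H_fields": H_fields, "persis_info": persis_info,
               "tag": SIM_TAG, "libE_info": libE_info}


def gen_work_sketch(Work, i, H_fields, H_rows, persis_info, **libE_info):
    """Hypothetical stand-in: store a gen record under worker ID i."""
    libE_info["H_rows"] = H_rows
    Work[i] = {"H_fields": H_fields, "persis_info": persis_info,
               "tag": GEN_TAG, "libE_info": libE_info}


Work = {}
result = sim_work_sketch(Work, 1, ["x"], [368], {})
gen_work_sketch(Work, 2, [], [], {}, persistent=True)

print(result)                # None: the dictionary was modified in place
print(sorted(Work))          # [1, 2]
print(Work[2]["libE_info"])  # {'persistent': True, 'H_rows': []}
```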

In practice, the structure of many allocation functions resembles:

Work = {}
...
for ID in avail_worker_ids(W):
    ...
    if some_condition:
        sim_work(Work, ID, chosen_H_fields, chosen_H_rows, persis_info)
        ...

    if another_condition:
        gen_work(Work, ID, chosen_H_fields, chosen_H_rows, persis_info)
        ...

return Work, persis_info

The Work dictionary is returned to the manager alongside persis_info. If 1 is returned as a third value, the manager is instructed to stop the run.
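A minimal sketch of the two return shapes follows, under stated assumptions: the allocator name budgeted_allocator, the 'max_rows' user option, and the mock W fields are all hypothetical, and Work entries are written out by hand so the example runs without libEnsemble installed.

```python
import numpy as np


def budgeted_allocator(W, H, sim_specs, gen_specs, alloc_specs, persis_info):
    """Hypothetical alloc_f: hand out rows of H in order, then ask to stop."""
    Work = {}
    budget = alloc_specs["user"]["max_rows"]  # hypothetical user option
    next_row = persis_info.get("next_to_give", 0)

    if next_row >= budget:
        # Third return value of 1 signals that the run should stop.
        return Work, persis_info, 1

    for worker in W:
        if worker["active"] == 0 and next_row < min(budget, len(H)):
            Work[int(worker["worker_id"])] = {
                "H_fields": sim_specs["in"],
                "persis_info": {},
                "tag": 1,
                "libE_info": {"H_rows": np.array([next_row])},
            }
            next_row += 1

    persis_info["next_to_give"] = next_row
    return Work, persis_info


# Mock inputs: four idle workers, five History rows, a budget of three rows.
W = np.zeros(4, dtype=[("worker_id", int), ("active", int)])
W["worker_id"] = [1, 2, 3, 4]
H = np.zeros(5, dtype=[("x", float)])
specs = ({"in": ["x"]}, {}, {"user": {"max_rows": 3}})

first = budgeted_allocator(W, H, *specs, {})
second = budgeted_allocator(W, H, *specs, first[1])
print(len(first), sorted(first[0]))  # 2 [1, 2, 3]
print(len(second), second[2])        # 3 1
```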

For allocation functions, as with the user functions, the level of complexity can vary widely. Various scheduling and work distribution features are available in the existing allocation functions, including prioritization of simulations, returning evaluation outputs to the generator immediately or in batch, assigning varying resource sets to evaluations, and other methods of fine-tuned control over the data available to other user functions.

Note

An error occurs when the alloc_f returns nothing while all workers are idle.

The final three functions available in the alloc_support module are primarily for evaluating running generators:

libensemble.tools.alloc_support.test_any_gen(W)

Return True if a generator worker is active.

Parameters

  • W – Worker array

libensemble.tools.alloc_support.count_gens(W)

Return the number of active generators in a set of workers.

Parameters

  • W – Worker array

libensemble.tools.alloc_support.count_persis_gens(W)

Return the number of active persistent generators in a set of workers.

Parameters

  • W – Worker array
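The three generator queries can be approximated on a mock worker array as shown below. The function names, the field names, and the marker value 2 for a generator are assumptions made for this sketch; they are not taken from the library source.

```python
import numpy as np

GEN_TAG = 2  # assumed marker for "running a generator"


def any_gen_sketch(W):
    """Hypothetical stand-in: True if any worker is running a generator."""
    return bool(np.any(W["active"] == GEN_TAG))


def count_gens_sketch(W):
    """Hypothetical stand-in: number of workers running a generator."""
    return int(np.sum(W["active"] == GEN_TAG))


def count_persis_gens_sketch(W):
    """Hypothetical stand-in: number of workers in a persistent gen state."""
    return int(np.sum(W["persis_state"] == GEN_TAG))


# Mock worker array with assumed field names.
W = np.zeros(4, dtype=[("worker_id", int), ("active", int), ("persis_state", int)])
W["worker_id"] = [1, 2, 3, 4]
W["active"] = [1, 2, 0, 2]        # workers 2 and 4 run generators
W["persis_state"] = [0, 0, 0, 2]  # worker 4 is a persistent generator

print(any_gen_sketch(W), count_gens_sketch(W), count_persis_gens_sketch(W))
# True 2 1
```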

Descriptions of included allocation functions can be found here. The default allocation function used by libEnsemble if one isn’t specified is give_sim_work_first. During its worker ID loop, it checks whether there is unallocated work and, if so, assigns simulations for that work. Otherwise, it initializes generators for up to 'num_active_gens' instances. Other settings, such as batch_mode and blocking of non-active workers, are also supported. See here for more information about give_sim_work_first.

For a shorter, simpler example, here is the fast_alloc allocation function:

/libensemble/alloc_funcs/fast_alloc.py
from libensemble.tools.alloc_support import avail_worker_ids, sim_work, gen_work, count_gens


def give_sim_work_first(W, H, sim_specs, gen_specs, alloc_specs, persis_info):
    """
    This allocation function gives (in order) entries in ``H`` to idle workers
    to evaluate in the simulation function. The fields in ``sim_specs['in']``
    are given. If all entries in ``H`` have been given to be evaluated, a worker
    is told to call the generator function, provided this wouldn't result in
    more than ``alloc_specs['user']['num_active_gens']`` active generators.

    This fast_alloc variation of give_sim_work_first is useful for cases that
    simply iterate through H, issuing evaluations in order and, in particular,
    is likely to be faster if there will be many short simulation evaluations,
    given that this function contains fewer column length operations.

    .. seealso::
        `test_fast_alloc.py <https://github.com/Libensemble/libensemble/blob/develop/libensemble/tests/regression_tests/test_fast_alloc.py>`_ # noqa
    """

    Work = {}
    gen_count = count_gens(W)

    for i in avail_worker_ids(W):
        # Skip any cancelled points
        while persis_info['next_to_give'] < len(H) and H[persis_info['next_to_give']]['cancel_requested']:
            persis_info['next_to_give'] += 1

        # Give sim work if possible
        if persis_info['next_to_give'] < len(H):

            sim_work(Work, i, sim_specs['in'], [persis_info['next_to_give']], [])
            persis_info['next_to_give'] += 1

        elif gen_count < alloc_specs['user'].get('num_active_gens', gen_count+1):

            # Give gen work
            persis_info['total_gen_calls'] += 1
            gen_count += 1
            gen_in = gen_specs.get('in', [])
            return_rows = range(len(H)) if gen_in else []
            gen_work(Work, i, gen_in, return_rows, persis_info.get(i))

    return Work, persis_info
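One detail of the listing worth isolating is the default in the generator check: when 'num_active_gens' is absent, .get falls back to gen_count+1, so the comparison always succeeds and the number of generators is effectively uncapped. The helper name may_start_gen below is hypothetical; the comparison itself is copied from the listing.

```python
def may_start_gen(gen_count, user):
    """Same comparison as in the listing above; a missing key falls back
    to gen_count + 1, which makes the test pass for any gen_count."""
    return gen_count < user.get("num_active_gens", gen_count + 1)


print(may_start_gen(0, {"num_active_gens": 1}))  # True: no generator yet
print(may_start_gen(1, {"num_active_gens": 1}))  # False: cap already reached
print(may_start_gen(7, {}))                      # True: no cap configured
```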