Allocation Functions

Although the included allocation functions are sufficient for most users, those who want to fine-tune how data or resources are allocated to their generator or simulator can write their own.

Unlike the other user functions, the alloc_f is called by libEnsemble’s manager rather than by a worker.

For allocation functions, as with the other user functions, the level of complexity can vary widely. We encourage experimenting with:

  1. Prioritization of simulations

  2. Sending results immediately or in batch

  3. Assigning varying resources to evaluations

Example
libensemble.alloc_funcs.fast_alloc.give_sim_work_first
from libensemble.tools.alloc_support import AllocSupport, InsufficientFreeResources


def give_sim_work_first(W, H, sim_specs, gen_specs, alloc_specs, persis_info, libE_info):
    """
    This allocation function gives (in order) entries in ``H`` to idle workers
    to evaluate in the simulation function. The fields in ``sim_specs["in"]``
    are given. If all entries in ``H`` have been given to be evaluated, a worker
    is told to call the generator function, provided this wouldn't result in
    more than ``alloc_specs["user"]["num_active_gens"]`` active generators.

    This fast_alloc variation of give_sim_work_first is useful for cases that
    simply iterate through H, issuing evaluations in order and, in particular,
    is likely to be faster if there will be many short simulation evaluations,
    given that this function contains fewer column length operations.

    tags: alloc, simple, fast

    .. seealso::
        `test_fast_alloc.py <https://github.com/Libensemble/libensemble/blob/develop/libensemble/tests/functionality_tests/test_fast_alloc.py>`_ # noqa
    """

    if libE_info["sim_max_given"] or not libE_info["any_idle_workers"]:
        return {}, persis_info

    user = alloc_specs.get("user", {})
    manage_resources = libE_info["use_resource_sets"]

    support = AllocSupport(W, manage_resources, persis_info, libE_info)

    gen_count = support.count_gens()
    Work = {}
    gen_in = gen_specs.get("in", [])

    # Give sim work if possible
    for wid in support.avail_worker_ids(gen_workers=False):
        persis_info = support.skip_canceled_points(H, persis_info)
        if persis_info["next_to_give"] < len(H):
            try:
                Work[wid] = support.sim_work(wid, H, sim_specs["in"], [persis_info["next_to_give"]], [])
            except InsufficientFreeResources:
                break
            persis_info["next_to_give"] += 1

    # Give gen work if possible
    if persis_info["next_to_give"] >= len(H):
        for wid in support.avail_worker_ids(gen_workers=True):
            if wid not in Work and gen_count < user.get("num_active_gens", gen_count + 1):
                return_rows = range(len(H)) if gen_in else []
                try:
                    Work[wid] = support.gen_work(wid, gen_in, return_rows, persis_info.get(wid))
                except InsufficientFreeResources:
                    break
                gen_count += 1
                persis_info["total_gen_calls"] += 1

    return Work, persis_info

Most alloc_f function definitions written by users resemble:

def my_allocator(W, H, sim_specs, gen_specs, alloc_specs, persis_info, libE_info):

where:

  • W is an array containing worker state info

  • H is the trimmed History array, containing rows from the generator

  • libE_info is a set of statistics to determine the progress of work or exit conditions

Most users first check that it is appropriate to allocate work:

if libE_info["sim_max_given"] or not libE_info["any_idle_workers"]:
    return {}, persis_info
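
The same guard can be factored into a small helper so that each condition is named. This is only a sketch: should_allocate is a hypothetical helper name, and the libE_info passed in the usage below is a hand-built dictionary rather than one produced by the manager.

```python
def should_allocate(libE_info):
    """Return True when it makes sense to build new work units."""
    # Stop handing out work once sim_max simulations have been given.
    if libE_info["sim_max_given"]:
        return False
    # Nothing can be assigned if every worker is busy.
    if not libE_info["any_idle_workers"]:
        return False
    return True


def my_allocator(W, H, sim_specs, gen_specs, alloc_specs, persis_info, libE_info):
    if not should_allocate(libE_info):
        # An empty Work dictionary means "no new work this round".
        return {}, persis_info
    Work = {}
    # ... populate Work here ...
    return Work, persis_info
```

Keeping the check in one place makes it easier to extend later, for example to keep returning completed points to a generator after sim_max_given becomes True.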

If the allocation is to continue, a support class is instantiated and a Work dictionary is initialized:

manage_resources = "resource_sets" in H.dtype.names or libE_info["use_resource_sets"]
support = AllocSupport(W, manage_resources, persis_info, libE_info)
Work = {}

This Work dictionary is populated with integer keys wid for each worker and dictionary values to give to those workers:

Example Work
{
    1: {
        "H_fields": ["x"],
        "persis_info": {"rand_stream": RandomState(...) at ..., "worker_num": 1},
        "tag": 1,
        "libE_info": {"H_rows": array([368])}
    },

    2: {
        "H_fields": ["x"],
        "persis_info": {"rand_stream": RandomState(...) at ..., "worker_num": 2},
        "tag": 1,
        "libE_info": {"H_rows": array([369])}
    },

    3: {
        "H_fields": ["x"],
        "persis_info": {"rand_stream": RandomState(...) at ..., "worker_num": 3},
        "tag": 1,
        "libE_info": {"H_rows": array([370])}
    },
    ...

}

This Work dictionary instructs each worker to call the sim_f (tag: 1) with data from the "x" field and the given "H_rows" of the History array. A worker-specific persis_info is also given.
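
To make the layout concrete, the following sketch assembles a dictionary of the same shape by hand. It is an illustration under stated assumptions, not how libEnsemble builds work units: build_sim_work is a hypothetical helper, EVAL_SIM_TAG is a local constant set to the tag value shown in the example, and plain lists stand in for the NumPy arrays stored under "H_rows".

```python
EVAL_SIM_TAG = 1  # local stand-in for the sim_f tag shown in the example


def build_sim_work(worker_ids, first_row, fields, persis_info):
    """Assign one consecutive History row to each worker ID."""
    Work = {}
    for offset, wid in enumerate(worker_ids):
        Work[wid] = {
            "H_fields": list(fields),                      # fields of H to send
            "persis_info": persis_info.get(wid, {}),       # worker-specific info
            "tag": EVAL_SIM_TAG,                           # call the sim_f
            "libE_info": {"H_rows": [first_row + offset]}, # rows of H to send
        }
    return Work


Work = build_sim_work([1, 2, 3], 368, ["x"], {1: {"worker_num": 1}})
```

In practice the sim_work and gen_work methods of AllocSupport produce these entries and also handle resource assignment, so hand-built entries are mainly useful for understanding the structure.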

Constructing these work records and determining which workers are available for receiving data is simplified by the AllocSupport class available within the libensemble.tools.alloc_support module:

AllocSupport
class libensemble.tools.alloc_support.AllocSupport(W, manage_resources=False, persis_info={}, libE_info={}, user_resources=None, user_scheduler=None)

A helper class to assist with writing allocation functions.

This class contains methods for common operations like populating work units, determining which workers are available, evaluating what values need to be distributed to workers, and others.

Note that since the alloc_f is called periodically by the Manager, this class instance (if used) will be recreated/destroyed on each loop.

__init__(W, manage_resources=False, persis_info={}, libE_info={}, user_resources=None, user_scheduler=None)

Instantiate a new AllocSupport instance

W is passed in for convenience on init; it is referenced by the various methods, but never modified.

By default, an AllocSupport instance uses any initiated libEnsemble resource module and the built-in libEnsemble scheduler.

Parameters:
  • W – A Worker array

  • manage_resources – (Optional) Boolean. Whether to assign resource sets when creating work units.

  • persis_info – (Optional) A dictionary of persistent information.

  • libE_info – (Optional) The libE_info dictionary passed to the allocation function. Options for the resource scheduler (scheduler_opts) are read from it.

  • user_resources – (Optional) A user supplied resources object.

  • user_scheduler – (Optional) A user supplied user_scheduler object.

assign_resources(rsets_req, use_gpus=None, user_params=[])

Schedule resource sets to a work record if possible.

With the default scheduler, if more than one group (node) is required, it will try to find an even split; otherwise it allocates whole nodes.

Raises InsufficientFreeResources if the required resources are not currently available, or InsufficientResourcesError if the required resources do not exist.

Parameters:
  • rsets_req – Int. Number of resource sets to request.

  • use_gpus – Bool. Whether to use GPU resource sets.

  • user_params – List of Integers. User parameters num_procs, num_gpus.

Returns:

List of Integers. Resource set indices assigned.

avail_worker_ids(persistent=None, active_recv=False, zero_resource_workers=None, gen_workers=None)

Returns available workers as a list of IDs, filtered by the given options.

Parameters:
  • persistent – (Optional) Int. Only return workers with given persis_state (1=sim, 2=gen).

  • active_recv – (Optional) Boolean. Only return workers with given active_recv state.

  • zero_resource_workers – (Optional) Boolean. Only return workers that require no resources.

  • gen_workers – (Optional) Boolean. If True, return gen-only workers. If False, return all other workers.

Returns:

List of worker IDs.

If there are no zero resource workers defined, then the zero_resource_workers argument will be ignored.

count_gens()

Returns the number of active generators.

test_any_gen()

Returns True if a generator worker is active.

count_persis_gens()

Returns the number of active persistent generators.

sim_work(wid, H, H_fields, H_rows, persis_info, **libE_info)

Creates a sim work record for the given worker, to be added to the Work dictionary.

Includes evaluation of required resources if the worker is not in a persistent state.

Parameters:
  • wid – Int. Worker ID.

  • H – History array. For parsing out requested resource sets.

  • H_fields – Which fields from H to send.

  • H_rows – Which rows of H to send.

  • persis_info – Worker specific persis_info dictionary.

Returns:

A Work entry.

Additional passed parameters are inserted into libE_info in the resulting work record.

If rset_team is passed as an additional parameter, it will be honored, assuming that any resource checking has already been done.

gen_work(wid, H_fields, H_rows, persis_info, **libE_info)

Creates a gen work record for the given worker, to be added to the Work dictionary.

Includes evaluation of required resources if the worker is not in a persistent state.

Parameters:
  • Work – Work dictionary.

  • wid – Worker ID.

  • H_fields – Which fields from H to send.

  • H_rows – Which rows of H to send.

  • persis_info – Worker specific persis_info dictionary.

Returns:

A Work entry.

Additional passed parameters are inserted into libE_info in the resulting work record.

If rset_team is passed as an additional parameter, it will be honored, assuming that any resource checking has already been done. For example, passing rset_team=[] would ensure that no resources are assigned.

all_sim_started(H, pt_filter=None, low_bound=None)

Returns True if all expected points have started their sim.

Excludes cancelled points.

Parameters:
  • pt_filter – (Optional) Boolean array filtering expected returned points in H.

  • low_bound – (Optional) Lower bound for testing all returned.

Returns:

True if all expected points have started their sim.

all_sim_ended(H, pt_filter=None, low_bound=None)

Returns True if all expected points have had their sim_end.

Excludes cancelled points that were not already sim_started.

Parameters:
  • pt_filter – (Optional) Boolean array filtering expected returned points in H.

  • low_bound – (Optional) Lower bound for testing all returned.

Returns:

True if all expected points have had their sim_end.
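
As a reading aid, the exclusion rule for all_sim_ended can be sketched in plain Python. This is not the library implementation: rows are plain dictionaries standing in for History rows, the field names sim_started, sim_ended, and cancel_requested are assumed to mirror the History array, and low_bound is assumed to be the first row index to test.

```python
def all_sim_ended_sketch(rows, pt_filter=None, low_bound=None):
    """Stand-in for the documented behavior, using a list of dict rows."""
    if pt_filter is None:
        pt_filter = [True] * len(rows)
    start = 0 if low_bound is None else low_bound  # assumed meaning of low_bound
    for i in range(start, len(rows)):
        if not pt_filter[i]:
            continue  # point is not among the expected ones
        row = rows[i]
        # A cancelled point that never started is not expected to end.
        if row["cancel_requested"] and not row["sim_started"]:
            continue
        if not row["sim_ended"]:
            return False
    return True


rows = [
    {"sim_started": True, "sim_ended": True, "cancel_requested": False},
    {"sim_started": False, "sim_ended": False, "cancel_requested": True},
    {"sim_started": True, "sim_ended": False, "cancel_requested": True},
]
```

Under these assumptions the third row still counts as outstanding, because it was cancelled only after its simulation had started.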

all_gen_informed(H, pt_filter=None, low_bound=None)

Returns True if gen has been informed of all expected points.

Excludes cancelled points that were not already given out.

Parameters:
  • pt_filter – (Optional) Boolean array filtering expected sim_end points in H.

  • low_bound – (Optional) Lower bound for testing all returned.

Returns:

True if gen has been informed of all expected points.

points_by_priority(H, points_avail, batch=False)

Returns indices of points to give by priority.

Parameters:
  • points_avail – Indices of points that are available to give.

  • batch – (Optional) Boolean. Whether batches of points with the same priority should be given simultaneously.

Returns:

An array of point indices to give.
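
One plausible reading of the batch option is sketched below with plain lists. The sketch assumes that a larger value means a higher priority, that points_avail holds indices as described above, and that priorities come from a per-point priority value; the function name points_by_priority_sketch is hypothetical and the real method operates on the History array.

```python
def points_by_priority_sketch(priorities, points_avail, batch=False):
    """Pick the available point(s) with the highest priority value."""
    if not points_avail:
        return []
    # Assumption: larger numbers are given first.
    best = max(priorities[i] for i in points_avail)
    top = [i for i in points_avail if priorities[i] == best]
    # With batch=True, all points tied for the top priority go out together.
    return top if batch else top[:1]


priorities = [1, 5, 5, 2]
single = points_by_priority_sketch(priorities, [0, 1, 2, 3])
batched = points_by_priority_sketch(priorities, [0, 1, 2, 3], batch=True)
```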

skip_canceled_points(H, persis_info)

Increments the “next_to_give” field in persis_info to skip any cancelled points.

The Work dictionary is returned to the manager alongside persis_info. If 1 is returned as the third value, this instructs the ensemble to stop.
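
A minimal sketch of the two return shapes follows. The STOP_ENSEMBLE name and the "target_reached" key in persis_info are hypothetical and chosen only for illustration; the only detail taken from the text above is that a third return value of 1 asks the ensemble to stop.

```python
STOP_ENSEMBLE = 1  # third return value that instructs the ensemble to stop


def my_allocator(W, H, sim_specs, gen_specs, alloc_specs, persis_info, libE_info):
    Work = {}
    # "target_reached" is a hypothetical flag, e.g. recorded once a
    # generator has found a good enough point.
    if persis_info.get("target_reached", False):
        return Work, persis_info, STOP_ENSEMBLE
    # ... otherwise populate Work as usual ...
    return Work, persis_info
```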

Note

An error occurs when the alloc_f returns nothing while all workers are idle.

Information from the manager describing the progress of the current libEnsemble routine can be found in libE_info:

libE_info =  {"exit_criteria": dict,               # Criteria for ending routine
              "elapsed_time": float,               # Time elapsed since start of routine
              "manager_kill_canceled_sims": bool,  # True if manager is to send kills to cancelled simulations
              "sim_started_count": int,            # Total number of points given for simulation function evaluation
              "sim_ended_count": int,              # Total number of points returned from simulation function evaluations
              "gen_informed_count": int,           # Total number of evaluated points given back to a generator function
              "sim_max_given": bool,               # True if `sim_max` simulations have been given out to workers
              "use_resource_sets": bool}           # True if num_resource_sets has been explicitly set.

Most often, the allocation function will just return once sim_max_given is True, but the user could choose to do something different, such as cancel points or keep returning completed points to the generator.

Generators that construct models based on all evaluated points, for example, may need simulation work units at the end of an ensemble to be returned to the generator anyway.

Alternatively, users can use elapsed_time to track runtime inside their allocation function and detect impending timeouts, then pack up cleanup work requests, or mark points for cancellation.
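
A timeout check along these lines might look like the sketch below. The helper names and the 30-second margin are illustrative, and the exit-criterion key "wallclock_max" is an assumption about how the time limit is named in exit_criteria; adjust it to whatever key the ensemble actually uses.

```python
def time_remaining(libE_info, wallclock_key="wallclock_max"):
    """Seconds left before the time limit, or None if no limit is set."""
    # Assumption: the limit is stored in exit_criteria under wallclock_key.
    limit = libE_info["exit_criteria"].get(wallclock_key)
    if limit is None:
        return None
    return limit - libE_info["elapsed_time"]


def near_timeout(libE_info, margin=30.0):
    """True when less than `margin` seconds remain."""
    remaining = time_remaining(libE_info)
    return remaining is not None and remaining <= margin
```

An allocation function could call near_timeout(libE_info) at the top of each invocation and, when it returns True, stop issuing new simulations and hand out only cleanup work.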

The remaining values above are useful for efficient filtering of H values (e.g., sim_ended_count saves filtering by an entire column of H).
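
For instance, simple differences between the counters answer common questions without touching H at all. The helper names below are hypothetical, and the libE_info used to exercise them is a hand-built dictionary.

```python
def sims_in_flight(libE_info):
    """Points given for simulation that have not yet been returned."""
    return libE_info["sim_started_count"] - libE_info["sim_ended_count"]


def results_pending_for_gen(libE_info):
    """Evaluated points not yet given back to a generator."""
    return libE_info["sim_ended_count"] - libE_info["gen_informed_count"]


counts = {"sim_started_count": 10, "sim_ended_count": 7, "gen_informed_count": 5}
in_flight = sims_in_flight(counts)
pending = results_pending_for_gen(counts)
```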

Descriptions of included allocation functions can be found here. The default allocation function is give_sim_work_first. During its worker ID loop, it checks if there’s unallocated work and assigns simulations for that work. Otherwise, it initializes generators for up to "num_active_gens" instances. Other settings like batch_mode are also supported. See here for more information about give_sim_work_first.
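
The decision order described above can be illustrated with a toy planner that uses plain Python containers. It is a simplified stand-in, not the shipped function: toy_give_sim_work_first and its arguments are hypothetical, and batch_mode, resource checks, and persis_info handling are all omitted.

```python
def toy_give_sim_work_first(idle_workers, unallocated_rows, gen_count, num_active_gens):
    """Plan work for idle workers: simulations first, then generators."""
    plan = {}
    pending = list(unallocated_rows)
    for wid in idle_workers:
        if pending:
            # Unallocated work exists, so this worker runs a simulation.
            plan[wid] = ("sim", pending.pop(0))
        elif gen_count < num_active_gens:
            # No simulation work left: start a generator if allowed.
            plan[wid] = ("gen", None)
            gen_count += 1
        else:
            # Nothing more can be assigned this round.
            break
    return plan


plan = toy_give_sim_work_first([1, 2, 3, 4], [10, 11], gen_count=0, num_active_gens=1)
```

With two unallocated rows and a limit of one active generator, workers 1 and 2 receive simulations, worker 3 starts the generator, and worker 4 stays idle.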