sampling

This module contains multiple generation functions for sampling a domain. All use (and return) a random stream in persis_info, given by the allocation function.

sampling.uniform_random_sample(_, persis_info, gen_specs): Generates gen_specs["user"]["gen_batch_size"] points uniformly over the domain defined by gen_specs["user"]["ub"] and gen_specs["user"]["lb"].

See also

test_uniform_sampling.py
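As a minimal sketch of the inputs this function reads, the following reproduces its core sampling step with NumPy alone instead of importing libEnsemble. The bounds, batch size, and seed are illustrative assumptions, and np.random.default_rng stands in for the stream that would normally arrive in persis_info.

```python
import numpy as np

# Illustrative inputs (assumed values, not taken from the libEnsemble tests)
gen_specs = {
    "out": [("x", float, (2,))],
    "user": {
        "gen_batch_size": 5,
        "lb": np.array([-3.0, -2.0]),
        "ub": np.array([3.0, 2.0]),
    },
}
# Stand-in for the random stream normally supplied in persis_info
persis_info = {"rand_stream": np.random.default_rng(1234)}

lb = gen_specs["user"]["lb"]
ub = gen_specs["user"]["ub"]
b = gen_specs["user"]["gen_batch_size"]
n = len(lb)

# One structured-array row per generated point
H_o = np.zeros(b, dtype=gen_specs["out"])
# lb and ub broadcast across the b rows, so each column has its own bounds
H_o["x"] = persis_info["rand_stream"].uniform(lb, ub, (b, n))
```

Because lb and ub are arrays of length n, each coordinate is drawn from its own interval without an explicit loop.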

sampling.uniform_random_sample_with_variable_resources(_, persis_info, gen_specs)

Generates gen_specs["user"]["gen_batch_size"] points uniformly over the domain defined by gen_specs["user"]["ub"] and gen_specs["user"]["lb"].

Also randomly requests a different number of resource sets to be used in each evaluation.

This generator is used to test/demonstrate setting of resource sets.

See also

test_uniform_sampling_with_variable_resources.py
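The resource-set request can be sketched as follows, using NumPy only. The batch size, dimension, and maximum number of resource sets are assumed values, and the generator is a stand-in for persis_info["rand_stream"].

```python
import numpy as np

rng = np.random.default_rng(42)  # stand-in for persis_info["rand_stream"]
max_rsets = 4                    # assumed gen_specs["user"]["max_resource_sets"]
b, n = 6, 2                      # assumed batch size and dimension
lb, ub = np.zeros(n), np.ones(n)

H_o = np.zeros(b, dtype=[("x", float, (n,)), ("resource_sets", int)])
H_o["x"] = rng.uniform(lb, ub, (b, n))
# integers() excludes the upper bound, hence max_rsets + 1
H_o["resource_sets"] = rng.integers(1, max_rsets + 1, b)
```

Each point therefore requests between 1 and max_rsets resource sets, inclusive.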

sampling.uniform_random_sample_with_var_priorities_and_resources(H, persis_info, gen_specs)

Generates points uniformly over the domain defined by gen_specs["user"]["ub"] and gen_specs["user"]["lb"]. Also, randomly requests a different priority and number of resource sets to be used in the evaluation of the generated points, after the initial batch.

This generator is used to test/demonstrate setting of priorities and resource sets.
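The two phases described above can be sketched in condensed form. The helper name sketch_gen and all numeric values are hypothetical, and the initial batch is drawn in one vectorized call here for brevity.

```python
import numpy as np


def sketch_gen(H_len, rng, lb, ub, max_rsets, initial_batch_size):
    """Hypothetical condensed version of the two branches described above."""
    n = len(lb)
    dtype = [("x", float, (n,)), ("resource_sets", int), ("priority", float)]
    if H_len == 0:
        # Initial batch: every point gets one resource set and priority 1
        H_o = np.zeros(initial_batch_size, dtype=dtype)
        H_o["x"] = rng.uniform(lb, ub, (initial_batch_size, n))
        H_o["resource_sets"] = 1
        H_o["priority"] = 1
    else:
        # Later calls: a single point whose priority scales with its resource sets
        H_o = np.zeros(1, dtype=dtype)
        H_o["x"] = rng.uniform(lb, ub)
        H_o["resource_sets"] = rng.integers(1, max_rsets + 1)
        H_o["priority"] = 10 * H_o["resource_sets"]
    return H_o


rng = np.random.default_rng(7)
lb, ub = np.array([-1.0, -1.0]), np.array([1.0, 1.0])
first = sketch_gen(0, rng, lb, ub, max_rsets=4, initial_batch_size=3)
later = sketch_gen(3, rng, lb, ub, max_rsets=4, initial_batch_size=3)
```

Tying the priority to the resource-set count means larger requests are given a higher priority value in later calls.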

sampling.uniform_random_sample_obj_components(H, persis_info, gen_specs): Generates points uniformly over the domain defined by gen_specs["user"]["ub"] and gen_specs["user"]["lb"] but requests each obj_component be evaluated separately.

See also

test_uniform_sampling_one_residual_at_a_time.py
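To illustrate how each sampled point is expanded into one row per component, here is a small NumPy-only sketch. The number of components, batch size, and bounds are assumed values, and the history is taken to be empty.

```python
import numpy as np

rng = np.random.default_rng(3)  # stand-in for persis_info["rand_stream"]
lb, ub = np.array([0.0, 0.0]), np.array([1.0, 1.0])
n, m, b = len(lb), 3, 2  # m components per point, b points (assumed values)
H_len = 0                # length of the existing history H (empty here)

dtype = [("x", float, (n,)), ("priority", float), ("obj_component", int), ("pt_id", int)]
H_o = np.zeros(b * m, dtype=dtype)
for i in range(b):
    rows = slice(i * m, (i + 1) * m)
    x = rng.uniform(lb, ub, (1, n))
    H_o["x"][rows] = np.tile(x, (m, 1))           # same x repeated for every component
    H_o["priority"][rows] = rng.uniform(0, 1, m)  # random priority per component
    H_o["obj_component"][rows] = np.arange(m)     # component indices 0..m-1
    H_o["pt_id"][rows] = H_len // m + i           # shared id ties the rows together
```

The pt_id field lets the m component evaluations of one point be regrouped later.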

sampling.latin_hypercube_sample(_, persis_info, gen_specs): Generates gen_specs["user"]["gen_batch_size"] points in a Latin hypercube sample over the domain defined by gen_specs["user"]["ub"] and gen_specs["user"]["lb"].

See also

test_1d_sampling.py
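The defining property of a Latin hypercube sample is that, in every dimension, each of the k equal-width strata contains exactly one point. The sketch below is a vectorized variant written for illustration (the helper name lhs_unit and all values are assumptions), followed by the scaling to the user bounds.

```python
import numpy as np


def lhs_unit(n, k, rng):
    """Hypothetical helper: k Latin hypercube points in the n-dimensional unit cube."""
    edges = np.linspace(0, 1, k + 1)
    lo, hi = edges[:-1], edges[1:]
    # One uniform draw inside each stratum, for every dimension
    pts = lo[:, None] + rng.uniform(0, 1, (k, n)) * (hi - lo)[:, None]
    # Shuffle the strata independently in each dimension
    for j in range(n):
        pts[:, j] = pts[rng.permutation(k), j]
    return pts


rng = np.random.default_rng(5)
lb, ub = np.array([-2.0, 0.0, 10.0]), np.array([2.0, 1.0, 20.0])
n, k = len(lb), 4

unit = lhs_unit(n, k, rng)
sample = unit * (ub - lb) + lb  # scale from the unit cube to [lb, ub]
```

Note that the scaling step requires lb and ub to be NumPy arrays (or otherwise support elementwise arithmetic).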

sampling.uniform_random_sample_cancel(_, persis_info, gen_specs): Similar to uniform_random_sample but with immediate cancellation of selected points for testing.
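In the source listing further down, the selected points are those whose index in the batch is a multiple of 10. A short NumPy-only sketch with an assumed batch size of 25:

```python
import numpy as np

rng = np.random.default_rng(0)  # stand-in for persis_info["rand_stream"]
lb, ub = np.array([0.0]), np.array([1.0])
b, n = 25, len(lb)              # assumed batch size

H_o = np.zeros(b, dtype=[("x", float, (n,)), ("cancel_requested", bool)])
# Flag every tenth point (indices 0, 10, 20, ...) for cancellation
H_o["cancel_requested"][::10] = True
H_o["x"] = rng.uniform(lb, ub, (b, n))

cancelled = np.flatnonzero(H_o["cancel_requested"])
```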

sampling.py

"""
This module contains multiple generation functions for sampling a domain. All
use (and return) a random stream in ``persis_info``, given by the allocation
function.
"""
import numpy as np

__all__ = [
    "uniform_random_sample",
    "uniform_random_sample_with_variable_resources",
    "uniform_random_sample_with_var_priorities_and_resources",
    "uniform_random_sample_obj_components",
    "latin_hypercube_sample",
    "uniform_random_sample_cancel",
]


def uniform_random_sample(_, persis_info, gen_specs):
    """
    Generates ``gen_specs["user"]["gen_batch_size"]`` points uniformly over the domain
    defined by ``gen_specs["user"]["ub"]`` and ``gen_specs["user"]["lb"]``.

    .. seealso::
        `test_uniform_sampling.py <https://github.com/Libensemble/libensemble/blob/develop/libensemble/tests/functionality_tests/test_uniform_sampling.py>`_ # noqa
    """
    ub = gen_specs["user"]["ub"]
    lb = gen_specs["user"]["lb"]

    n = len(lb)
    b = gen_specs["user"]["gen_batch_size"]

    H_o = np.zeros(b, dtype=gen_specs["out"])

    H_o["x"] = persis_info["rand_stream"].uniform(lb, ub, (b, n))

    return H_o, persis_info


def uniform_random_sample_with_variable_resources(_, persis_info, gen_specs):
    """
    Generates ``gen_specs["user"]["gen_batch_size"]`` points uniformly over the domain
    defined by ``gen_specs["user"]["ub"]`` and ``gen_specs["user"]["lb"]``.

    Also randomly requests a different number of resource sets to be used in each evaluation.

    This generator is used to test/demonstrate setting of resource sets.

    #.. seealso::
        #`test_uniform_sampling_with_variable_resources.py <https://github.com/Libensemble/libensemble/blob/develop/libensemble/tests/functionality_tests/test_uniform_sampling_with_variable_resources.py>`_ # noqa
    """

    ub = gen_specs["user"]["ub"]
    lb = gen_specs["user"]["lb"]
    max_rsets = gen_specs["user"]["max_resource_sets"]

    n = len(lb)
    b = gen_specs["user"]["gen_batch_size"]

    H_o = np.zeros(b, dtype=gen_specs["out"])

    H_o["x"] = persis_info["rand_stream"].uniform(lb, ub, (b, n))
    H_o["resource_sets"] = persis_info["rand_stream"].integers(1, max_rsets + 1, b)

    print(f'GEN: H rsets requested: {H_o["resource_sets"]}')

    return H_o, persis_info


def uniform_random_sample_with_var_priorities_and_resources(H, persis_info, gen_specs):
    """
    Generates points uniformly over the domain defined by ``gen_specs["user"]["ub"]`` and
    ``gen_specs["user"]["lb"]``. Also, randomly requests a different priority and number of
    resource sets to be used in the evaluation of the generated points, after the initial batch.

    This generator is used to test/demonstrate setting of priorities and resource sets.

    """
    ub = gen_specs["user"]["ub"]
    lb = gen_specs["user"]["lb"]
    max_rsets = gen_specs["user"]["max_resource_sets"]

    n = len(lb)

    if len(H) == 0:
        b = gen_specs["user"]["initial_batch_size"]

        H_o = np.zeros(b, dtype=gen_specs["out"])
        for i in range(0, b):
            # x= i*np.ones(n)
            x = persis_info["rand_stream"].uniform(lb, ub, (1, n))
            H_o["x"][i] = x
            H_o["resource_sets"][i] = 1
            H_o["priority"][i] = 1

    else:
        H_o = np.zeros(1, dtype=gen_specs["out"])
        # H_o["x"] = len(H)*np.ones(n)  # Can use a simple count for testing.
        H_o["x"] = persis_info["rand_stream"].uniform(lb, ub)
        H_o["resource_sets"] = persis_info["rand_stream"].integers(1, max_rsets + 1)
        H_o["priority"] = 10 * H_o["resource_sets"]
        # print("Created sim for {} resource sets".format(H_o["resource_sets"]), flush=True)

    return H_o, persis_info


def uniform_random_sample_obj_components(H, persis_info, gen_specs):
    """
    Generates points uniformly over the domain defined by ``gen_specs["user"]["ub"]``
    and ``gen_specs["user"]["lb"]`` but requests each ``obj_component`` be evaluated
    separately.

    .. seealso::
        `test_uniform_sampling_one_residual_at_a_time.py <https://github.com/Libensemble/libensemble/blob/develop/libensemble/tests/functionality_tests/test_uniform_sampling_one_residual_at_a_time.py>`_ # noqa
    """
    ub = gen_specs["user"]["ub"]
    lb = gen_specs["user"]["lb"]

    n = len(lb)
    m = gen_specs["user"]["components"]
    b = gen_specs["user"]["gen_batch_size"]

    H_o = np.zeros(b * m, dtype=gen_specs["out"])
    for i in range(0, b):
        x = persis_info["rand_stream"].uniform(lb, ub, (1, n))
        H_o["x"][i * m : (i + 1) * m, :] = np.tile(x, (m, 1))
        H_o["priority"][i * m : (i + 1) * m] = persis_info["rand_stream"].uniform(0, 1, m)
        H_o["obj_component"][i * m : (i + 1) * m] = np.arange(0, m)

        H_o["pt_id"][i * m : (i + 1) * m] = len(H) // m + i

    return H_o, persis_info


def uniform_random_sample_cancel(_, persis_info, gen_specs):
    """
    Similar to uniform_random_sample but with immediate cancellation of
    selected points for testing.

    """
    ub = gen_specs["user"]["ub"]
    lb = gen_specs["user"]["lb"]

    n = len(lb)
    b = gen_specs["user"]["gen_batch_size"]

    H_o = np.zeros(b, dtype=gen_specs["out"])
    for i in range(b):
        if i % 10 == 0:
            H_o[i]["cancel_requested"] = True

    H_o["x"] = persis_info["rand_stream"].uniform(lb, ub, (b, n))

    return H_o, persis_info


def latin_hypercube_sample(_, persis_info, gen_specs):
    """
    Generates ``gen_specs["user"]["gen_batch_size"]`` points in a Latin
    hypercube sample over the domain defined by ``gen_specs["user"]["ub"]`` and
    ``gen_specs["user"]["lb"]``.

    .. seealso::
        `test_1d_sampling.py <https://github.com/Libensemble/libensemble/blob/develop/libensemble/tests/regression_tests/test_1d_sampling.py>`_ # noqa
    """

    ub = gen_specs["user"]["ub"]
    lb = gen_specs["user"]["lb"]

    n = len(lb)
    b = gen_specs["user"]["gen_batch_size"]

    H_o = np.zeros(b, dtype=gen_specs["out"])

    A = lhs_sample(n, b, persis_info["rand_stream"])

    H_o["x"] = A * (ub - lb) + lb

    return H_o, persis_info


def lhs_sample(n, k, stream):
    """Return a ``(k, n)`` Latin hypercube sample on the unit cube using ``stream``."""
    # Generate the intervals and random values
    intervals = np.linspace(0, 1, k + 1)
    rand_source = stream.uniform(0, 1, (k, n))
    rand_pts = np.zeros((k, n))
    sample = np.zeros((k, n))

    # Add a point uniformly in each interval
    a = intervals[:k]
    b = intervals[1:]
    for j in range(n):
        rand_pts[:, j] = rand_source[:, j] * (b - a) + a

    # Randomly permute the points within each dimension
    for j in range(n):
        sample[:, j] = rand_pts[stream.permutation(k), j]

    return sample

persistent_sampling

Persistent generator providing points using sampling

persistent_sampling.persistent_uniform(_, persis_info, gen_specs, libE_info): This generation function always enters persistent mode and returns gen_specs["user"]["initial_batch_size"] uniformly sampled points the first time it is called. Afterwards, it returns as many points as it received results for in the previous step. This can be used in either a batch or asynchronous mode by adjusting the allocation function.

See also

test_persistent_uniform_sampling.py

test_persistent_uniform_sampling_async.py
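The send/receive loop, in which the next batch size follows the number of results just received, can be sketched without libEnsemble. ScriptedManager, EVAL_TAG, and the string tags below are hypothetical stand-ins for PersistentSupport and the integer constants in libensemble.message_numbers; the reply sizes are assumed.

```python
import numpy as np

EVAL_TAG = "EVAL"            # placeholder; the real tags are integer constants
PERSIS_STOP = "PERSIS_STOP"  # placeholder for the stop tag


class ScriptedManager:
    """Hypothetical stand-in for PersistentSupport that replays fixed replies."""

    def __init__(self, reply_sizes):
        self.reply_sizes = list(reply_sizes)
        self.sent_sizes = []

    def send_recv(self, H_o):
        self.sent_sizes.append(len(H_o))
        if self.reply_sizes:
            size = self.reply_sizes.pop(0)
            return EVAL_TAG, None, np.zeros(size, dtype=[("f", float)])
        return PERSIS_STOP, None, None


rng = np.random.default_rng(0)
lb, ub = np.array([0.0, 0.0]), np.array([1.0, 1.0])
b, n = 4, len(lb)  # assumed initial batch size
ps = ScriptedManager(reply_sizes=[2, 3])

tag = None
while tag != PERSIS_STOP:
    H_o = np.zeros(b, dtype=[("x", float, (n,))])
    H_o["x"] = rng.uniform(lb, ub, (b, n))
    tag, Work, calc_in = ps.send_recv(H_o)
    if hasattr(calc_in, "__len__"):
        b = len(calc_in)  # next batch matches the number of results returned
```

Here the generator sends 4 points, then 2, then 3, mirroring how an allocation function that returns results one at a time would produce asynchronous behavior.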

persistent_sampling.persistent_uniform_final_update(_, persis_info, gen_specs, libE_info): Assuming the value "f" returned from sim_f is stochastic, this generation function updates an estimated mean "f_est" of the sim_f output at each of the corners of the domain.

See also

test_persistent_uniform_sampling_running_mean.py
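As a sketch of the two ingredients, the snippet below enumerates the corners of a box with a bitmask and computes a per-corner mean from returned values. The bounds, corner ids, and f values are made up for illustration, and np.bincount is used here as a compact alternative to accumulating totals row by row.

```python
import numpy as np


def generate_corners(lb, ub):
    """Corner `index` takes lb[i] where bit i of index is set, else ub[i]."""
    n = len(lb)
    return [[lb[i] if index & (1 << i) else ub[i] for i in range(n)] for index in range(2**n)]


lb, ub = np.array([0.0, 0.0]), np.array([1.0, 2.0])
corners = generate_corners(lb, ub)

# Assumed results: which corner each evaluation used, and the stochastic f it returned
corner_ids = np.array([0, 1, 0, 3, 1, 0])
f_vals = np.array([1.0, 4.0, 3.0, 10.0, 6.0, 2.0])

totals = np.bincount(corner_ids, weights=f_vals, minlength=len(corners))
counts = np.bincount(corner_ids, minlength=len(corners))
with np.errstate(invalid="ignore", divide="ignore"):
    f_est = totals / counts  # NaN for corners that were never sampled
```

Corners with no evaluations keep a NaN estimate, which matches the effect of initializing the running totals to NaN.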

persistent_sampling.persistent_request_shutdown(_, persis_info, gen_specs, libE_info): This generation function is similar in structure to persistent_uniform, but uses a count to test exiting on a threshold value. This principle can be used with a supporting allocation function (e.g. start_only_persistent) to shut down an ensemble when a condition is met.

See also

test_persistent_uniform_gen_decides_stop.py

persistent_sampling.uniform_nonblocking(_, persis_info, gen_specs, libE_info): This generation function is designed to test non-blocking receives.

See also

test_persistent_uniform_sampling.py
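A non-blocking receive returns immediately whether or not a message is available, so the caller polls until something arrives. The sketch below shows that polling pattern with a hypothetical ScriptedReceiver in place of PersistentSupport; the tag string and payload are placeholders.

```python
class ScriptedReceiver:
    """Hypothetical stand-in that reports 'nothing yet' a fixed number of times."""

    def __init__(self, empty_polls, payload):
        self.empty_polls = empty_polls
        self.payload = payload

    def recv(self, blocking=True):
        if not blocking and self.empty_polls > 0:
            self.empty_polls -= 1
            return None, None, None  # no message available yet
        return "EVAL", None, self.payload


ps = ScriptedReceiver(empty_polls=3, payload=[0.1, 0.2])

spin_count = 0
tag = None
while tag is None:
    tag, Work, calc_in = ps.recv(blocking=False)
    if tag is None:
        spin_count += 1  # count polls that came back empty
```

In a real generator the time between polls could be used for other work rather than spinning.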

persistent_sampling.batched_history_matching(_, persis_info, gen_specs, libE_info)

Given

- sim_f with an input of x with len(x)=n
- b, the batch size of points to generate
- q<b, the number of best samples to use in the following iteration

Pseudocode: Let (mu, Sigma) denote a mean and covariance matrix initialized to the origin and the identity, respectively.

While true (batch synchronous for now):

    Draw b samples x_1, … , x_b from MVN(mu, Sigma)
    Evaluate f(x_1), … , f(x_b) and determine the set of q x_i whose f(x_i) values are smallest (breaking ties lexicographically)
    Update (mu, Sigma) based on the sample mean and sample covariance of these q x values.

See also

test_persistent_uniform_sampling.py
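The pseudocode can be exercised on a toy problem. In this sketch the objective (squared distance to a fixed target), the target itself, the values of b and q, and the fixed iteration count are all assumptions made for illustration; the simulation is evaluated inline rather than through a manager.

```python
import numpy as np

target = np.array([2.0, -1.0])  # assumed minimizer of the toy objective


def f(X):
    """Assumed test objective: squared distance of each row of X to the target."""
    return np.sum((X - target) ** 2, axis=1)


rng = np.random.default_rng(11)
n, b, q = 2, 200, 20

mu, Sigma = np.zeros(n), np.eye(n)  # start at the origin with identity covariance
for _ in range(5):
    X = rng.multivariate_normal(mu, Sigma, b)  # draw b samples
    best = np.argsort(f(X))[:q]                # indices of the q smallest f values
    mu = np.mean(X[best], axis=0)              # sample mean of the best points
    Sigma = np.cov(X[best].T)                  # sample covariance of the best points
```

Each iteration recenters the sampling distribution on the best points found so far, so the mean drifts toward regions with small f.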

persistent_sampling.persistent_uniform_with_cancellations(_, persis_info, gen_specs, libE_info)
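Judging from the source listing below, this function cancels a sliding window of sim IDs: it starts at half the initial batch size and, after each batch of results, requests cancellation of as many IDs as results were received. The arithmetic can be sketched in plain Python, with the initial batch size and reply sizes as assumed values:

```python
initial_batch_size = 8                 # assumed value
cancel_from = initial_batch_size // 2  # first sim ID to cancel
reply_sizes = [8, 8, 5]                # assumed number of results received per step

cancel_requests = []
for b in reply_sizes:
    # Cancel as many sim IDs as results came back, continuing where we left off
    cancel_ids = list(range(cancel_from, cancel_from + b))
    cancel_from += b
    cancel_requests.append(cancel_ids)
```

With these values the first request covers sim IDs 4 through 11, so at least the first four points of the initial batch are never cancelled.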

persistent_sampling.py

"""Persistent generator providing points using sampling"""

import numpy as np

from libensemble.message_numbers import EVAL_GEN_TAG, FINISHED_PERSISTENT_GEN_TAG, PERSIS_STOP, STOP_TAG
from libensemble.tools.persistent_support import PersistentSupport

__all__ = [
    "persistent_uniform",
    "persistent_uniform_final_update",
    "persistent_request_shutdown",
    "uniform_nonblocking",
    "batched_history_matching",
    "persistent_uniform_with_cancellations",
]


def _get_user_params(user_specs):
    """Extract user params"""
    b = user_specs["initial_batch_size"]
    ub = user_specs["ub"]
    lb = user_specs["lb"]
    n = len(lb)  # dimension
    assert isinstance(b, int), "Batch size must be an integer"
    assert isinstance(n, int), "Dimension must be an integer"
    assert isinstance(lb, np.ndarray), "lb must be a numpy array"
    assert isinstance(ub, np.ndarray), "ub must be a numpy array"
    return b, n, lb, ub


def persistent_uniform(_, persis_info, gen_specs, libE_info):
    """
    This generation function always enters persistent mode and returns
    ``gen_specs["user"]["initial_batch_size"]`` uniformly sampled points the
    first time it is called. Afterwards, it returns as many points as it
    received results for in the previous step. This can be used in either a
    batch or asynchronous mode by adjusting the allocation function.

    .. seealso::
        `test_persistent_uniform_sampling.py <https://github.com/Libensemble/libensemble/blob/develop/libensemble/tests/functionality_tests/test_persistent_uniform_sampling.py>`_
        `test_persistent_uniform_sampling_async.py <https://github.com/Libensemble/libensemble/blob/develop/libensemble/tests/functionality_tests/test_persistent_uniform_sampling_async.py>`_
    """  # noqa

    b, n, lb, ub = _get_user_params(gen_specs["user"])
    ps = PersistentSupport(libE_info, EVAL_GEN_TAG)

    # Send batches until manager sends stop tag
    tag = None
    while tag not in [STOP_TAG, PERSIS_STOP]:
        H_o = np.zeros(b, dtype=gen_specs["out"])
        H_o["x"] = persis_info["rand_stream"].uniform(lb, ub, (b, n))
        if "obj_component" in H_o.dtype.fields:
            H_o["obj_component"] = persis_info["rand_stream"].integers(
                low=0, high=gen_specs["user"]["num_components"], size=b
            )
        tag, Work, calc_in = ps.send_recv(H_o)
        if hasattr(calc_in, "__len__"):
            b = len(calc_in)

    return H_o, persis_info, FINISHED_PERSISTENT_GEN_TAG


def persistent_uniform_final_update(_, persis_info, gen_specs, libE_info):
    """
    Assuming the value ``"f"`` returned from sim_f is stochastic, this
    generation function updates an estimated mean ``"f_est"`` of the sim_f
    output at each of the corners of the domain.

    .. seealso::
        `test_persistent_uniform_sampling_running_mean.py <https://github.com/Libensemble/libensemble/blob/develop/libensemble/tests/functionality_tests/test_persistent_uniform_sampling_running_mean.py>`_
    """  # noqa

    b, n, lb, ub = _get_user_params(gen_specs["user"])
    ps = PersistentSupport(libE_info, EVAL_GEN_TAG)

    def generate_corners(x, y):
        n = len(x)
        corner_indices = np.arange(2**n)
        corners = []
        for index in corner_indices:
            corner = [x[i] if index & (1 << i) else y[i] for i in range(n)]
            corners.append(corner)
        return corners

    def sample_corners_with_probability(corners, p, b):
        selected_corners = np.random.choice(len(corners), size=b, p=p)
        sampled_corners = [corners[i] for i in selected_corners]
        return sampled_corners, selected_corners

    corners = generate_corners(lb, ub)

    # Start with equal probabilities
    p = np.ones(2**n) / 2**n

    running_total = np.nan * np.ones(2**n)
    number_of_samples = np.zeros(2**n)
    sent = np.array([], dtype=int)

    # Send batches of `b` points until manager sends stop tag
    tag = None
    next_id = 0
    while tag not in [STOP_TAG, PERSIS_STOP]:
        H_o = np.zeros(b, dtype=gen_specs["out"])
        H_o["sim_id"] = range(next_id, next_id + b)
        next_id += b

        sampled_corners, corner_ids = sample_corners_with_probability(corners, p, b)

        H_o["corner_id"] = corner_ids
        H_o["x"] = sampled_corners
        sent = np.append(sent, corner_ids)

        tag, Work, calc_in = ps.send_recv(H_o)
        if hasattr(calc_in, "__len__"):
            b = len(calc_in)
            for row in calc_in:
                number_of_samples[row["corner_id"]] += 1
                if np.isnan(running_total[row["corner_id"]]):
                    running_total[row["corner_id"]] = row["f"]
                else:
                    running_total[row["corner_id"]] += row["f"]

    # Having received a PERSIS_STOP, update f_est field for all points and return
    # For manager to honor final H_o return, must have set libE_specs["use_persis_return_gen"] = True
    f_est = running_total / number_of_samples
    H_o = np.zeros(len(sent), dtype=[("sim_id", int), ("corner_id", int), ("f_est", float)])
    for count, i in enumerate(sent):
        H_o["sim_id"][count] = count
        H_o["corner_id"][count] = i
        H_o["f_est"][count] = f_est[i]

    return H_o, persis_info, FINISHED_PERSISTENT_GEN_TAG


def persistent_request_shutdown(_, persis_info, gen_specs, libE_info):
    """
    This generation function is similar in structure to persistent_uniform,
    but uses a count to test exiting on a threshold value. This principle can
    be used with a supporting allocation function (e.g. start_only_persistent)
    to shut down an ensemble when a condition is met.

    .. seealso::
        `test_persistent_uniform_gen_decides_stop.py <https://github.com/Libensemble/libensemble/blob/develop/libensemble/tests/functionality_tests/test_persistent_uniform_gen_decides_stop.py>`_
    """  # noqa
    b, n, lb, ub = _get_user_params(gen_specs["user"])
    shutdown_limit = gen_specs["user"]["shutdown_limit"]
    f_count = 0
    ps = PersistentSupport(libE_info, EVAL_GEN_TAG)

    # Send batches until manager sends stop tag
    tag = None
    while tag not in [STOP_TAG, PERSIS_STOP]:
        H_o = np.zeros(b, dtype=gen_specs["out"])
        H_o["x"] = persis_info["rand_stream"].uniform(lb, ub, (b, n))
        tag, Work, calc_in = ps.send_recv(H_o)
        if hasattr(calc_in, "__len__"):
            b = len(calc_in)
        f_count += b
        if f_count >= shutdown_limit:
            print("Reached threshold.", f_count, flush=True)
            break  # End the persistent gen

    return H_o, persis_info, FINISHED_PERSISTENT_GEN_TAG


def uniform_nonblocking(_, persis_info, gen_specs, libE_info):
    """
    This generation function is designed to test non-blocking receives.

    .. seealso::
        `test_persistent_uniform_sampling.py <https://github.com/Libensemble/libensemble/blob/develop/libensemble/tests/functionality_tests/test_persistent_uniform_sampling.py>`_
    """  # noqa
    b, n, lb, ub = _get_user_params(gen_specs["user"])
    ps = PersistentSupport(libE_info, EVAL_GEN_TAG)

    # Send batches until manager sends stop tag
    tag = None
    while tag not in [STOP_TAG, PERSIS_STOP]:
        H_o = np.zeros(b, dtype=gen_specs["out"])
        H_o["x"] = persis_info["rand_stream"].uniform(lb, ub, (b, n))
        ps.send(H_o)

        received = False
        spin_count = 0
        while not received:
            tag, Work, calc_in = ps.recv(blocking=False)
            if tag is not None:
                received = True
            else:
                spin_count += 1

        persis_info["spin_count"] = spin_count

        if hasattr(calc_in, "__len__"):
            b = len(calc_in)

    return H_o, persis_info, FINISHED_PERSISTENT_GEN_TAG


def batched_history_matching(_, persis_info, gen_specs, libE_info):
    """
    Given
    - sim_f with an input of x with len(x)=n
    - b, the batch size of points to generate
    - q<b, the number of best samples to use in the following iteration

    Pseudocode:
    Let (mu, Sigma) denote a mean and covariance matrix initialized to the
    origin and the identity, respectively.

    While true (batch synchronous for now):

        Draw b samples x_1, ... , x_b from MVN( mu, Sigma)
        Evaluate f(x_1), ... , f(x_b) and determine the set of q x_i whose f(x_i) values are smallest (breaking ties lexicographically)
        Update (mu, Sigma) based on the sample mean and sample covariance of these q x values.

    .. seealso::
        `test_persistent_uniform_sampling.py <https://github.com/Libensemble/libensemble/blob/develop/libensemble/tests/functionality_tests/test_persistent_uniform_sampling.py>`_
    """  # noqa
    lb = gen_specs["user"]["lb"]

    n = len(lb)
    b = gen_specs["user"]["initial_batch_size"]
    q = gen_specs["user"]["num_best_vals"]
    ps = PersistentSupport(libE_info, EVAL_GEN_TAG)

    mu = np.zeros(n)
    Sigma = np.eye(n)
    tag = None

    while tag not in [STOP_TAG, PERSIS_STOP]:
        H_o = np.zeros(b, dtype=gen_specs["out"])
        H_o["x"] = persis_info["rand_stream"].multivariate_normal(mu, Sigma, b)

        # Send data and get next assignment
        tag, Work, calc_in = ps.send_recv(H_o)
        if calc_in is not None:
            all_inds = np.argsort(calc_in["f"])
            best_inds = all_inds[:q]
            mu = np.mean(H_o["x"][best_inds], axis=0)
            Sigma = np.cov(H_o["x"][best_inds].T)

    return H_o, persis_info, FINISHED_PERSISTENT_GEN_TAG


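def persistent_uniform_with_cancellations(_, persis_info, gen_specs, libE_info):
    """
    Similar to persistent_uniform, but after each batch of results is received,
    requests cancellation of an equal number of sim IDs, starting from half of
    the initial batch size.
    """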
    ub = gen_specs["user"]["ub"]
    lb = gen_specs["user"]["lb"]
    n = len(lb)
    b = gen_specs["user"]["initial_batch_size"]

    # Start cancelling points from half initial batch onward
    cancel_from = b // 2  # Should get at least this many points back

    ps = PersistentSupport(libE_info, EVAL_GEN_TAG)

    # Send batches until manager sends stop tag
    tag = None
    while tag not in [STOP_TAG, PERSIS_STOP]:
        H_o = np.zeros(b, dtype=gen_specs["out"])
        H_o["x"] = persis_info["rand_stream"].uniform(lb, ub, (b, n))
        tag, Work, calc_in = ps.send_recv(H_o)

        if hasattr(calc_in, "__len__"):
            b = len(calc_in)

            # Cancel as many points as got back
            cancel_ids = list(range(cancel_from, cancel_from + b))
            cancel_from += b
            ps.request_cancel_sim_ids(cancel_ids)

    return H_o, persis_info, FINISHED_PERSISTENT_GEN_TAG

persistent_sampling_var_resources

Persistent random sampling using various methods of dynamic resource assignment

Each function generates points uniformly over the domain defined by gen_specs["user"]["ub"] and gen_specs["user"]["lb"].

Most functions use a random request of resources over a range, setting num_procs, num_gpus, or resource sets. The function uniform_sample_with_var_gpus uses the x value to determine the number of GPUs requested.

persistent_sampling_var_resources.uniform_sample(_, persis_info, gen_specs, libE_info): Randomly requests a different number of resource sets to be used in the evaluation of the generated points.

See also

test_uniform_sampling_with_variable_resources.py

persistent_sampling_var_resources.uniform_sample_with_procs_gpus(_, persis_info, gen_specs, libE_info): Randomly requests a different number of processors and GPUs to be used in the evaluation of the generated points.

See also

test_GPU_variable_resources.py

persistent_sampling_var_resources.uniform_sample_with_var_priorities(_, persis_info, gen_specs, libE_info): Initial batch has matching priorities, after which a different number of resource sets and priorities are requested for each point.

persistent_sampling_var_resources.uniform_sample_diff_simulations(_, persis_info, gen_specs, libE_info): Randomly requests a different number of processors for each simulation. One simulation type also uses GPUs.

See also

test_GPU_variable_resources_multi_task.py

persistent_sampling_var_resources.uniform_sample_with_sim_gen_resources(_, persis_info, gen_specs, libE_info): Randomly requests a different number of processors and GPUs to be used in the evaluation of the generated points.

See also

test_GPU_variable_resources.py