General Specs
libEnsemble is primarily customized by setting options within a LibeSpecs
class or dictionary.
from mpi4py import MPI
from libensemble.specs import LibeSpecs

specs = LibeSpecs(
    mpi_comm=MPI.COMM_WORLD,
    comms="mpi",
    save_every_k_gens=1000,
    sim_dirs_make=True,
    ensemble_dir_path="/scratch/ensemble",
)
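Since the options can also be given as a dictionary, a minimal sketch of the dictionary form may help. The keys below are option names listed on this page; the values (local comms with four workers, the scratch path) are illustrative assumptions, not recommendations.

```python
# Dictionary form of the general specs. Key names come from the option list
# on this page; the chosen values are placeholders for illustration only.
libE_specs = {
    "comms": "local",              # manager/worker communications mode
    "nworkers": 4,                 # used with "local" or "tcp" comms
    "save_every_k_gens": 1000,     # save history every 1000 generated points
    "sim_dirs_make": True,         # one calculation directory per sim call
    "ensemble_dir_path": "/scratch/ensemble",
}

# Individual options can be set or changed later by key.
libE_specs["dry_run"] = True
```

If libEnsemble is installed, the same dictionary can presumably be checked by passing it as keyword arguments to the class, as in `LibeSpecs(**libE_specs)`, since `LibeSpecs` is a pydantic model.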
Settings by Category
- “comms” [str] = "mpi": Manager/Worker communications mode: 'mpi', 'local', or 'tcp'.
- “nworkers” [int]: Number of worker processes in "local" or "tcp".
- “mpi_comm” [MPI communicator] = MPI.COMM_WORLD: libEnsemble MPI communicator.
- “dry_run” [bool] = False: Whether libEnsemble should immediately exit after validating all inputs.
- “abort_on_exception” [bool] = True: In MPI mode, whether to call MPI_ABORT on an exception. If False, an exception will be raised by the manager.
- “save_every_k_sims” [int]: Save history array to file after every k simulated points.
- “save_every_k_gens” [int]: Save history array to file after every k generated points.
- “save_H_and_persis_on_abort” [bool] = True: Save states of H and persis_info to file on aborting after an exception.
- “worker_timeout” [int] = 1: On libEnsemble shutdown, number of seconds after which workers are considered timed out and then terminated.
- “kill_canceled_sims” [bool] = False: Try to kill sims with "cancel_requested" set to True. If False, the manager avoids this moderate overhead.
- “disable_log_files” [bool] = False: Disable ensemble.log and libE_stats.txt log files.
- “use_workflow_dir” [bool] = False: Whether to place all log files, dumped arrays, and default ensemble-directories in a separate workflow directory. Each run is suffixed with a hash. If copying back an ensemble directory from another location, the copy is placed here.
- “workflow_dir_path” [str]: Optional path to the workflow directory.
- “ensemble_dir_path” [str] = "./ensemble": Path to main ensemble directory. Can serve as a single working directory for workers, or contain calculation directories.
libE_specs["ensemble_dir_path"] = "/scratch/my_ensemble"
- “ensemble_copy_back” [bool] = False: Whether to copy back contents of ensemble_dir_path to launch location. Useful if ensemble_dir_path is located on node-local storage.
- “reuse_output_dir” [bool] = False: Whether to allow overwrites and access to previous ensemble and workflow directories in subsequent runs. False by default to protect results.
- “calc_dir_id_width” [int] = 4: The width of the numerical ID component of a calculation directory name. The sim/gen ID is padded with leading zeros to this width.
- “use_worker_dirs” [bool] = False: Whether to organize calculation directories under worker-specific directories:
  Default:
  - /ensemble_dir
    - /sim0
    - /gen1
    - /sim1
    ...
  With use_worker_dirs set to True:
  - /ensemble_dir
    - /worker1
      - /sim0
      - /gen1
      - /sim4
      ...
    - /worker2
    ...
- “sim_dirs_make” [bool] = False: Whether to make calculation directories for each simulation function call.
- “sim_dir_copy_files” [list]: Paths to files or directories to copy into each sim directory, or ensemble directory. List of strings or pathlib.Path objects.
- “sim_dir_symlink_files” [list]: Paths to files or directories to symlink into each sim directory, or ensemble directory. List of strings or pathlib.Path objects.
- “sim_input_dir” [str]: Copy this directory’s contents into the working directory upon calling the simulation function.
- “gen_dirs_make” [bool] = False: Whether to make generator-specific calculation directories for each generator function call. Each persistent generator creates a single directory.
- “gen_dir_copy_files” [list]: Paths to copy into the working directory upon calling the generator function. List of strings or pathlib.Path objects.
- “gen_dir_symlink_files” [list]: Paths to files or directories to symlink into each gen directory. List of strings or pathlib.Path objects.
- “gen_input_dir” [str]: Copy this directory’s contents into the working directory upon calling the generator function.
- “profile” [bool] = False: Profile manager and worker logic using cProfile.
- “safe_mode” [bool] = True: Prevents user functions from overwriting internal fields, but requires moderate overhead.
- “stats_fmt” [dict]: A dictionary of options for formatting "libE_stats.txt". See “Formatting Options for libE_stats File”.
- “workers” [list]: TCP Only: A list of worker hostnames.
- “ip” [str]: TCP Only: IP address for Manager’s system.
- “port” [int]: TCP Only: Port number for Manager’s system.
- “authkey” [str]: TCP Only: Authkey for Manager’s system.
- “workerID” [int]: TCP Only: Worker ID number assigned to the new process.
- “worker_cmd” [list]: TCP Only: Split string corresponding to worker/client Python process invocation. Contains a local Python path, calling script, and manager/server format-fields for manager_ip, manager_port, authkey, and workerID. nworkers is specified normally.
- “use_persis_return_gen” [bool] = False: Adds persistent generator output fields to the History array on return.
- “use_persis_return_sim” [bool] = False: Adds persistent simulator output fields to the History array on return.
- “final_gen_send” [bool] = False: Send final simulation results to persistent generators before shutdown. The results will be sent along with the PERSIS_STOP tag.
- “disable_resource_manager” [bool] = False: Disable the built-in resource manager, including automatic resource detection and/or assignment of resources to workers. "resource_info" will be ignored.
- “platform” [str]: Name of a known platform, e.g.,
libE_specs["platform"] = "perlmutter_g"
Alternatively, set the LIBE_PLATFORM environment variable.
- “platform_specs” [Platform|dict]: A Platform object (or dictionary) specifying settings for a platform. Fields not provided will be auto-detected. Can be set to a known platform object.
The total number of resource sets into which resources will be divided. By default resources will be divided by workers (excluding zero_resource_workers).
- “gen_num_procs” [int] = 0: The default number of processors (MPI ranks) required by generators. Unless overridden by the equivalent persis_info settings, generators will be allocated this many processors for applications launched via the MPIExecutor.
- “gen_num_gpus” [int] = 0: The default number of GPUs required by generators. Unless overridden by the equivalent persis_info settings, generators will be allocated this many GPUs.
- “enforce_worker_core_bounds” [bool] = False: If False, permit submission of tasks with a higher processor count than the CPUs available to the worker. Larger node counts are not allowed. Ignored when disable_resource_manager is set.
- “dedicated_mode” [bool] = False: Disallow any resources running libEnsemble processes (manager and workers) from being valid targets for app submissions.
- “zero_resource_workers” [list of ints]: List of workers (by IDs) that require no resources. For when a fixed mapping of workers to resources is required. Otherwise, use "num_resource_sets". For use with supported allocation functions.
- “resource_info” [dict]: Provide resource information that will override automatically detected resources. The allowable fields are given below in “Overriding Resource Auto-Detection”. Ignored if "disable_resource_manager" is set.
- “scheduler_opts” [dict]: Options for the resource scheduler. See “Scheduler Options” for more options.
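The directory options above can be combined in one dictionary. The sketch below is illustrative only: the copied file path is hypothetical, and `padded_dir_name` is a helper invented here to show what zero-padding an ID to `calc_dir_id_width` means. The exact directory names libEnsemble produces may differ.

```python
from pathlib import Path

# Directory-related options in dictionary form. Option names are those
# documented above; the values are placeholders.
libE_specs = {
    "sim_dirs_make": True,                    # make a directory per sim call
    "use_worker_dirs": True,                  # group them under worker dirs
    "calc_dir_id_width": 4,                   # width of the numeric ID
    "ensemble_dir_path": Path("./ensemble"),
    "sim_dir_copy_files": [Path("inputs/params.yaml")],  # hypothetical file
}


def padded_dir_name(prefix: str, calc_id: int, width: int) -> str:
    """Hypothetical helper: join a prefix and an ID zero-padded to `width`."""
    return f"{prefix}{calc_id:0{width}d}"


# Pad sim ID 7 to the configured width.
example = padded_dir_name("sim", 7, libE_specs["calc_dir_id_width"])
print(example)  # sim0007
```

Strings and `pathlib.Path` objects are both accepted for the path-valued options, so either can be used in the lists.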
Complete Class API
- pydantic model libensemble.specs.LibeSpecs
Specifications for configuring libEnsemble’s runtime behavior.
- field abort_on_exception: Optional[bool] = True
In MPI mode, whether to call MPI_ABORT on an exception. If False, an exception will be raised by the manager.
- field authkey: Optional[str] = 'libE_auth_33770'
TCP Only: Authkey for Manager’s system.
- field calc_dir_id_width: Optional[int] = 4
The width of the numerical ID component of a calculation directory name. Leading zeros are padded to the sim/gen ID.
- field comms: Optional[str] = 'mpi'
Manager/Worker communications mode: 'mpi', 'local', or 'tcp'.
- field dedicated_mode: Optional[bool] = False
Instructs libEnsemble to not run applications on resources where libEnsemble processes (manager and workers) are running.
- field disable_log_files: Optional[bool] = False
Disable ensemble.log and libE_stats.txt log files.
- field disable_resource_manager: Optional[bool] = False
Disable the built-in resource manager, including automatic resource detection and/or assignment of resources to workers.
"resource_info"
will be ignored.
- field dry_run: Optional[bool] = False
Whether libEnsemble should immediately exit after validating all inputs.
- field enforce_worker_core_bounds: Optional[bool] = False
If False, the Executor will permit the submission of tasks with a higher processor count than the CPUs available to the worker as detected by the resource manager. Larger node counts are not allowed. When "disable_resource_manager" is True, this argument is ignored.
- field ensemble_copy_back: Optional[bool] = False
Whether to copy back contents of ensemble_dir_path to launch location. Useful if ensemble_dir_path is located on node-local storage.
- field ensemble_dir_path: Optional[Union[str, Path]] = PosixPath('ensemble')
Path to main ensemble directory. Can serve as a single working directory for workers, or contain calculation directories.
- field final_gen_send: Optional[bool] = False
Send final simulation results to persistent generators before shutdown. The results will be sent along with the
PERSIS_STOP
tag.
- field gen_dir_copy_files: Optional[List[Union[str, Path]]] = []
Paths to copy into the working directory upon calling the generator function. List of strings or pathlib.Path objects.
- field gen_dir_symlink_files: Optional[List[Union[str, Path]]] = []
Paths to symlink into the working directory upon calling the generator function. List of strings or
pathlib.Path
objects.
- field gen_dirs_make: Optional[bool] = False
Whether to make generator-specific calculation directories for each generator function call.
- field gen_input_dir: Optional[Union[str, Path]] = None
Copy this directory’s contents into the working directory upon calling the generator function.
- field gen_num_gpus: Optional[int] = None
The default number of GPUs required by generators. Unless overridden by the equivalent persis_info settings, generators will be allocated this many GPUs.
- field gen_num_procs: Optional[int] = None
The default number of processors (MPI ranks) required by generators. Unless overridden by the equivalent persis_info settings, generators will be allocated this many processors for applications launched via the MPIExecutor.
- field ip: Optional[str] = None
TCP Only: IP address for Manager’s system.
- field kill_canceled_sims: Optional[bool] = False
Try to kill sims with "cancel_requested" set to True. If False, the manager avoids this moderate overhead.
- field mpi_comm: Optional[Any] = None
libEnsemble MPI communicator. Default:
MPI.COMM_WORLD
- field num_resource_sets: Optional[int] = None
Total number of resource sets. Resources will be divided into this number. If not set, resources will be divided evenly (excluding zero_resource_workers).
- field nworkers: Optional[int] = None
Number of worker processes in "local" or "tcp".
- field platform: Optional[str] = ''
Name of a known platform defined in the platforms module.
See Known Platforms List.
Example:
libE_specs["platform"] = "perlmutter_g"
Alternatively set the environment variable LIBE_PLATFORM:
export LIBE_PLATFORM="perlmutter_g"
See also option platform_specs.
- field platform_specs: Optional[Union[Platform, dict]] = {}
A Platform object or dictionary specifying settings for a platform.
To use an existing platform:
from libensemble.resources.platforms import PerlmutterGPU
libE_specs["platform_specs"] = PerlmutterGPU()
See Known Platforms List.
Or define a platform:
from libensemble.resources.platforms import Platform
libE_specs["platform_specs"] = Platform(
    mpi_runner="srun",
    cores_per_node=64,
    logical_cores_per_node=128,
    gpus_per_node=8,
    gpu_setting_type="runner_default",
    scheduler_match_slots=False,
)
For a list of Platform fields, see Platform Fields.
Any fields not given will be auto-detected by libEnsemble.
See also option platform.
- field port: Optional[int] = 0
TCP Only: Port number for Manager’s system.
- field profile: Optional[bool] = False
Profile manager and worker logic using
cProfile
.
- field resource_info: Optional[dict] = {}
Resource information to override automatically detected resources. Allowed fields are given below in ‘Overriding Resource Auto-detection’. Note that if
disable_resource_manager
is set then this option is ignored.
- field reuse_output_dir: Optional[bool] = False
Whether to allow overwrites and access to previous ensemble and workflow directories in subsequent runs.
False
by default to protect results.
- field safe_mode: Optional[bool] = False
Prevents user functions from overwriting protected History fields, but requires moderate overhead.
- field save_H_and_persis_on_abort: Optional[bool] = True
Save states of H and persis_info to file on aborting after an exception.
- field save_every_k_gens: Optional[int] = 0
Save history array to file after every k generated points.
- field save_every_k_sims: Optional[int] = 0
Save history array to file after every k evaluated points.
- field scheduler_opts: Optional[dict] = {}
Options for the resource scheduler. See ‘Scheduler Options’ for more info.
- field sim_dir_copy_files: Optional[List[Union[str, Path]]] = []
Paths to copy into the working directory upon calling the simulation function. List of strings or
pathlib.Path
objects.
- field sim_dir_symlink_files: Optional[List[Union[str, Path]]] = []
Paths to symlink into the working directory upon calling the simulation function. List of strings or
pathlib.Path
objects.
- field sim_dirs_make: Optional[bool] = False
Whether to make calculation directories for each simulation function call.
- field sim_input_dir: Optional[Union[str, Path]] = None
Copy this directory’s contents into the working directory upon calling the simulation function.
- field stats_fmt: Optional[dict] = {}
Options for formatting 'libE_stats.txt'. See ‘Formatting Options for libE_stats File’.
- field use_persis_return_gen: Optional[bool] = False
Adds persistent generator output fields to the History array on return.
- field use_persis_return_sim: Optional[bool] = False
Adds persistent simulator output fields to the History array on return.
- field use_worker_dirs: Optional[bool] = False
Whether to organize calculation directories under worker-specific directories.
- field use_workflow_dir: Optional[bool] = False
Whether to place all log files, dumped arrays, and default output directories in a separate workflow directory. Each run will be suffixed with a hash. If copying back an ensemble directory from a scratch space, the copy is placed here.
- field workerID: Optional[int] = None
TCP Only: Worker ID number assigned to the new process.
- field worker_cmd: Optional[List[str]] = None
TCP Only: Split string corresponding to worker/client Python process invocation. Contains a local Python path, calling script, and manager/server format-fields for manager_ip, manager_port, authkey, and workerID. nworkers is specified normally.
- field worker_timeout: Optional[int] = 1
On libEnsemble shutdown, number of seconds after which workers are considered timed out and then terminated.
- field workers: Optional[List[str]] = None
TCP Only: A list of worker hostnames.
- field workflow_dir_path: Optional[Union[str, Path]] = '.'
Optional path to the workflow directory.
- field zero_resource_workers: Optional[List[int]] = []
List of workers that require no resources. For when a fixed mapping of workers to resources is required. Otherwise, use
num_resource_sets
. For use with supported allocation functions.
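The worker_cmd field above is described as a split command containing format-fields for manager_ip, manager_port, authkey, and workerID. The sketch below shows what such a list could look like and how the fields might be filled with `str.format`. The interpreter path, script name, and flag names are hypothetical, and `fill_worker_cmd` is a helper written for this illustration. It is not libEnsemble's own substitution code.

```python
# Hypothetical worker launch command, already split into arguments.
# Only the four brace-delimited field names come from the documentation.
worker_cmd = [
    "/usr/bin/python3",
    "run_worker.py",
    "--manager-ip", "{manager_ip}",
    "--manager-port", "{manager_port}",
    "--authkey", "{authkey}",
    "--worker-id", "{workerID}",
]


def fill_worker_cmd(cmd, manager_ip, manager_port, authkey, workerID):
    """Substitute the four documented format-fields into every argument."""
    return [
        arg.format(
            manager_ip=manager_ip,
            manager_port=manager_port,
            authkey=authkey,
            workerID=workerID,
        )
        for arg in cmd
    ]


# Example values only; the authkey shown is the documented default.
filled = fill_worker_cmd(worker_cmd, "192.0.2.10", 9999, "libE_auth_33770", 1)
print(filled)
```

Keeping the command as a list of separate arguments, rather than one string, avoids shell-quoting problems when the values are substituted.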
Scheduler Options
See options for built-in scheduler.
Overriding Resource Auto-Detection
Note that "cores_on_node"
and "gpus_on_node"
are supported for backward
compatibility, but use of Platform specification is
recommended for these settings.
Resource Info Fields
The allowable libE_specs["resource_info"]
fields are:
"cores_on_node" [tuple (int, int)]:
Tuple (physical cores, logical cores) on nodes.
"gpus_on_node" [int]:
Number of GPUs on each node.
"node_file" [str]:
Name of file containing a node-list. Default is "node_list".
"nodelist_env_slurm" [str]:
The environment variable giving a node list in Slurm format
(Default: Uses ``SLURM_NODELIST``). Queried only if
a ``node_list`` file is not provided and the resource manager is
enabled.
"nodelist_env_cobalt" [str]:
The environment variable giving a node list in Cobalt format
(Default: Uses ``COBALT_PARTNAME``). Queried only
if a ``node_list`` file is not provided and the resource manager
is enabled.
"nodelist_env_lsf" [str]:
The environment variable giving a node list in LSF format
(Default: Uses ``LSB_HOSTS``). Queried only
if a ``node_list`` file is not provided and the resource manager
is enabled.
"nodelist_env_lsf_shortform" [str]:
The environment variable giving a node list in LSF short-form
format (Default: Uses ``LSB_MCPU_HOSTS``). Queried only
if a ``node_list`` file is not provided and the resource manager is
enabled.
For example:
customizer = {"cores_on_node": (16, 64),
              "node_file": "libe_nodes"}
libE_specs["resource_info"] = customizer
Formatting Options for libE_stats File
The allowable libE_specs["stats_fmt"]
fields are:
"task_timing" [bool] = ``False``:
Outputs elapsed time for each task launched by the executor.
"task_datetime" [bool] = ``False``:
Outputs the elapsed time and start and end time for each task launched by the executor.
Can be used with the ``"plot_libe_tasks_util_v_time.py"`` script to give task utilization plots.
"show_resource_sets" [bool] = ``False``:
Shows the resource set IDs assigned to each worker for each call of the user function.
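As with resource_info, these fields are set through a nested dictionary. A minimal sketch, assuming `libE_specs` is a plain dictionary (the empty dictionary below stands in for an existing one):

```python
# Stand-in for an existing libE_specs dictionary.
libE_specs = {}

# Enable start/end times per task and resource set IDs in libE_stats.txt.
# "task_timing" is left at its default of False.
libE_specs["stats_fmt"] = {
    "task_datetime": True,
    "show_resource_sets": True,
}
```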