Legacy Balsam MPI Executor
This module launches and controls the running of tasks with Balsam versions up to 0.5.0. Balsam is especially useful when running libEnsemble on three-tier systems with intermediate launch nodes. On such systems, MPI processes typically cannot submit further MPI tasks to the batch scheduler themselves. Therefore, when libEnsemble’s workers have been launched in a distributed fashion via MPI, they must communicate with an intermediate service such as Balsam running on the launch nodes. The Balsam service then reserves compute resources and launches the tasks submitted by libEnsemble’s workers through the Balsam MPI Executor.
To create a Balsam executor, the calling script should contain:
exctr = LegacyBalsamMPIExecutor()
The Balsam executor inherits from the MPI executor. See the MPIExecutor documentation for the shared API. Any differences are shown below.
- class legacy_balsam_executor.LegacyBalsamMPIExecutor(custom_info={})
Bases: MPIExecutor
Inherits from MPIExecutor and wraps the Balsam task management service
Note
Task kills are not configurable in the Balsam executor.
- serial_setup()
Balsam serial setup includes emptying the database and adding applications
- static del_apps()
Deletes all Balsam apps in the libe_app namespace
- static del_tasks()
Deletes all Balsam tasks
- static add_app(name, exepath, desc)
Add application to Balsam database
- submit(calc_type=None, app_name=None, num_procs=None, num_nodes=None, procs_per_node=None, machinefile=None, app_args=None, stdout=None, stderr=None, stage_inout=None, hyperthreads=False, dry_run=False, wait_on_start=False, extra_args='')
Creates a new task and either executes it or schedules it for execution in the executor
The created task object is returned.
- default_app(calc_type)
Gets the default app for a given calc type
- property gen_default_app
Returns the default generator app
- get_app(app_name)
Gets the app for a given app_name, or raises an exception
- get_task(taskid)
Returns the task object for the supplied task ID
- kill(task)
Kills a task
- manager_poll()
Polls for a manager signal
The executor manager_signal attribute will be updated.
- poll(task)
Polls a task
- polling_loop(task, timeout=None, delay=0.1, poll_manager=False)
Optional, blocking, generic task status polling loop. Operates until the task finishes, times out, or is optionally killed via a manager signal. On completion, returns a presumptive calc_status integer. Potentially useful for running an application via the Executor until it stops without monitoring its intermediate output.
- Parameters
task (object) – a Task object returned by the executor on submission
timeout (int, optional) – Maximum number of seconds for the polling loop to run. Tasks that run longer than this limit are killed. Default: No timeout
delay (int, optional) – Sleep duration between polling loop iterations. Default: 0.1 seconds
poll_manager (bool, optional) – Whether to also poll the manager for ‘finish’ or ‘kill’ signals. If detected, the task is killed. Default: False.
- Returns
calc_status – presumptive integer attribute describing the final status of a launched task
- Return type
int
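The behavior described for polling_loop can be sketched in plain Python. This is only an illustration of the control flow, not libEnsemble's implementation: FakeTask, polling_loop_sketch, the manager_signal callable, and the integer status codes are all hypothetical names introduced here, and the real calc_status values are defined by libEnsemble.

```python
import time

# Hypothetical integer status codes, for illustration only.
FINISHED, FAILED, TIMED_OUT, MANAGER_KILLED = 0, 1, 2, 3


class FakeTask:
    """Hypothetical stand-in for a task: finishes after a fixed number of polls."""

    def __init__(self, polls_until_done, success=True):
        self.polls_left = polls_until_done
        self.success = success
        self.finished = False
        self.killed = False

    def poll(self):
        # Update status attributes; a real task would query the service.
        if self.killed or self.finished:
            return
        self.polls_left -= 1
        if self.polls_left <= 0:
            self.finished = True

    def kill(self):
        self.killed = True
        self.finished = True


def polling_loop_sketch(task, timeout=None, delay=0.1, manager_signal=lambda: None):
    """Poll until the task finishes, the timeout expires, or the manager signals."""
    start = time.monotonic()
    while True:
        task.poll()
        if task.finished:
            return FINISHED if task.success else FAILED
        # Assumed: manager_signal() returns 'finish', 'kill', or None.
        if manager_signal() in ("finish", "kill"):
            task.kill()
            return MANAGER_KILLED
        if timeout is not None and time.monotonic() - start >= timeout:
            task.kill()  # tasks that exceed the limit are killed
            return TIMED_OUT
        time.sleep(delay)
```

In this sketch the timeout is checked once per iteration, so a task can overrun the limit by up to one delay interval before it is killed.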
- register_app(full_path, app_name=None, calc_type=None, desc=None)
Registers a user application to libEnsemble.
The full_path of the application must be supplied. Either app_name or calc_type can be used to identify the application in user scripts (in the submit function). app_name is recommended.
- Parameters
full_path (String) – The full path of the user application to be registered
app_name (String, optional) – Name to identify this application.
calc_type (String, optional) – Calculation type: Set this application as the default ‘sim’ or ‘gen’ function.
desc (String, optional) – Description of this application
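The two ways of identifying a registered application (by app_name, or as the default for a calc_type) can be pictured with a small registry. This is a hedged sketch of the lookup rules described above, not the executor's actual data structures: AppRegistry and its attributes are hypothetical, and the paths are placeholders.

```python
class AppRegistry:
    """Hypothetical registry mirroring the lookup rules described above."""

    def __init__(self):
        self.apps = {}          # app_name -> full_path
        self.default_apps = {}  # calc_type ('sim' or 'gen') -> full_path

    def register_app(self, full_path, app_name=None, calc_type=None, desc=None):
        # Assumed for this sketch: at least one identifier must be given.
        if app_name is None and calc_type is None:
            raise ValueError("supply app_name or calc_type")
        if calc_type is not None and calc_type not in ("sim", "gen"):
            raise ValueError(f"unknown calc_type: {calc_type!r}")
        if app_name is not None:
            self.apps[app_name] = full_path
        if calc_type is not None:
            self.default_apps[calc_type] = full_path

    def get_app(self, app_name):
        # Raises an exception when the name has not been registered.
        if app_name not in self.apps:
            raise KeyError(f"application {app_name!r} not found in registry")
        return self.apps[app_name]

    def default_app(self, calc_type):
        return self.default_apps.get(calc_type)


registry = AppRegistry()
registry.register_app("/path/to/sim.x", app_name="my_sim")
registry.register_app("/path/to/gen.x", calc_type="gen")
```

Registering by app_name keeps the identifier explicit in the later submit call, which is one reason a name-based lookup is easier to follow than relying on a per-calc_type default.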
- set_workerID(workerid)
Sets the worker ID for this executor
- set_worker_info(comm, workerid=None)
Sets info for this executor
- property sim_default_app
Returns the default simulation app
- class legacy_balsam_executor.LegacyBalsamTask(app=None, app_args=None, workdir=None, stdout=None, stderr=None, workerid=None)
Bases: Task
Wraps a Balsam Task from the Balsam service
The same attributes and query routines are implemented.
- read_file_in_workdir(filename)
Opens and reads the named file in the task’s workdir
- read_stdout()
Opens and reads the task’s stdout file in the task’s workdir
- read_stderr()
Opens and reads the task’s stderr file in the task’s workdir
- calc_task_timing()
Calculate timing information for this task
- poll()
Polls the task and updates its status attributes
- wait(timeout=None)
Waits for the task to complete, or raises a TimeoutExpired exception
Status attributes of task are updated on completion.
- Parameters
timeout – Time in seconds after which a TimeoutExpired exception is raised
- kill(wait_time=None)
Kills or cancels the task
- cancel()
Wrapper for task.kill() without waiting
- cancelled()
Returns True if the task was successfully cancelled.
- done()
Returns True if the task is finished.
- exception(timeout=None)
Wrapper for task.wait() that instead returns the task’s error code on completion.
- Parameters
timeout – Time in seconds after which a TimeoutExpired exception is raised
- file_exists_in_workdir(filename)
Returns true if the named file exists in the task’s workdir
- result(timeout=None)
Wrapper for task.wait() that also returns the task’s status on completion.
- Parameters
timeout – Time in seconds after which a TimeoutExpired exception is raised
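The relationship between wait(), result(), and exception() described above can be sketched as follows. This is a minimal stand-in under stated assumptions, not the Balsam task class: TaskSketch, TimeoutExpiredSketch, the string state values, and the time-based completion are all hypothetical and exist only to show how the two wrappers build on wait().

```python
import time


class TimeoutExpiredSketch(Exception):
    """Hypothetical stand-in for the TimeoutExpired exception named above."""


class TaskSketch:
    """Hypothetical task that completes `duration` seconds after creation."""

    def __init__(self, duration, errcode=0):
        self._end = time.monotonic() + duration
        self._errcode = errcode
        self.finished = False
        self.state = "RUNNING"
        self.errcode = None

    def poll(self):
        # Update status attributes once the simulated run time has elapsed.
        if not self.finished and time.monotonic() >= self._end:
            self.finished = True
            self.errcode = self._errcode
            self.state = "FINISHED" if self._errcode == 0 else "FAILED"

    def wait(self, timeout=None):
        # Block until the task completes, or raise once the timeout elapses.
        start = time.monotonic()
        while not self.finished:
            self.poll()
            if self.finished:
                break
            if timeout is not None and time.monotonic() - start >= timeout:
                raise TimeoutExpiredSketch(f"task still running after {timeout} s")
            time.sleep(0.001)

    def result(self, timeout=None):
        # wait(), then return the task's status.
        self.wait(timeout=timeout)
        return self.state

    def exception(self, timeout=None):
        # wait(), then return the task's error code instead.
        self.wait(timeout=timeout)
        return self.errcode
```

In this sketch a timeout in result() or exception() propagates the same exception that wait() raises, so callers handle the timeout in one place.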
- running()
Returns True if the task is currently running.
- stderr_exists()
Returns true if the task’s stderr file exists in the workdir
- stdout_exists()
Returns true if the task’s stdout file exists in the workdir
- workdir_exists()
Returns true if the task’s workdir exists
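The workdir helpers listed above (workdir_exists, file_exists_in_workdir, read_file_in_workdir, and the stdout/stderr variants) can be thought of as path checks relative to the task's working directory. The sketch below assumes exactly that and nothing more: WorkdirSketch and the default file names out.txt and err.txt are hypothetical, and the real task class may resolve paths differently.

```python
import os


class WorkdirSketch:
    """Hypothetical holder for a task's workdir and stdout/stderr file names."""

    def __init__(self, workdir, stdout="out.txt", stderr="err.txt"):
        self.workdir = workdir
        self.stdout = stdout
        self.stderr = stderr

    def workdir_exists(self):
        return bool(self.workdir) and os.path.isdir(self.workdir)

    def file_exists_in_workdir(self, filename):
        return self.workdir_exists() and os.path.isfile(
            os.path.join(self.workdir, filename)
        )

    def read_file_in_workdir(self, filename):
        # Open and read the named file; fail clearly if it is missing.
        path = os.path.join(self.workdir, filename)
        if not os.path.isfile(path):
            raise FileNotFoundError(path)
        with open(path) as f:
            return f.read()

    def stdout_exists(self):
        return self.file_exists_in_workdir(self.stdout)

    def stderr_exists(self):
        return self.file_exists_in_workdir(self.stderr)

    def read_stdout(self):
        return self.read_file_in_workdir(self.stdout)

    def read_stderr(self):
        return self.read_file_in_workdir(self.stderr)
```

Checking stdout_exists() before read_stdout() avoids an exception when a task has not yet produced any output.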