Resource Detection

The resource manager detects system resources and partitions them among workers. When launching tasks, the MPI Executor uses the resources available to the current worker.

Node lists are detected from an environment variable on the following systems:

Scheduler    Nodelist env. variable
SLURM        SLURM_NODELIST
COBALT       COBALT_PARTNAME
LSF          LSB_HOSTS/LSB_MCPU_HOSTS
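As a rough illustration of environment-based detection, the sketch below checks the variables from the table above and reports the first one that is set. It is not libEnsemble's implementation; the mapping NODELIST_ENV_VARS and the function find_nodelist_env are hypothetical names introduced only for this example.

```python
import os

# Scheduler names and variables are taken from the table above.
# The dictionary itself is a hypothetical helper, not part of libEnsemble.
NODELIST_ENV_VARS = {
    "SLURM": ["SLURM_NODELIST"],
    "COBALT": ["COBALT_PARTNAME"],
    "LSF": ["LSB_HOSTS", "LSB_MCPU_HOSTS"],
}


def find_nodelist_env(environ=None):
    """Return (scheduler, variable, value) for the first variable that is set.

    Returns None if none of the known variables is present and non-empty.
    """
    environ = os.environ if environ is None else environ
    for scheduler, names in NODELIST_ENV_VARS.items():
        for name in names:
            value = environ.get(name)
            if value:
                return scheduler, name, value
    return None


if __name__ == "__main__":
    # Prints None when run outside a supported scheduler session.
    print(find_nodelist_env())
```

Passing a dictionary as environ makes it possible to try the function without being inside a scheduler allocation.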

These environment variable names can be modified via the resource_info libE_specs option.
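A minimal sketch of such a setting is shown here. The key names nodelist_env_slurm and nodelist_env_cobalt are assumed from the libEnsemble documentation and should be checked against the installed version; the variable names on the right-hand side are placeholders.

```python
# libE_specs is normally the dictionary passed to libE(); a bare dict is used here.
libE_specs = {}

# Assumed resource_info keys; the values are hypothetical site-specific
# environment variable names that replace the defaults in the table.
libE_specs["resource_info"] = {
    "nodelist_env_slurm": "MY_SLURM_NODELIST",
    "nodelist_env_cobalt": "MY_COBALT_PARTNAME",
}
```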

On other systems, you may have to supply a node list in a file called node_list in your run directory. For example, on the ALCF system Cooley, the session node list can be obtained as follows:

cat $COBALT_NODEFILE > node_list
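If it is more convenient to create the file from the calling script, the same copy can be sketched in Python. The function write_node_list is a hypothetical helper, not a libEnsemble function, and it assumes the named environment variable holds the path to a node file.

```python
import os
import shutil


def write_node_list(run_dir=".", env_var="COBALT_NODEFILE", environ=None):
    """Copy the scheduler's node file to <run_dir>/node_list and return its path."""
    environ = os.environ if environ is None else environ
    source = environ.get(env_var)
    if not source:
        raise RuntimeError(f"{env_var} is not set; cannot create node_list")
    dest = os.path.join(run_dir, "node_list")
    shutil.copyfile(source, dest)
    return dest
```

Calling write_node_list() before starting the ensemble would then be equivalent to the shell command above.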

Resource detection can be disabled by setting libE_specs['disable_resource_manager'] = True. Users can then supply run configuration options directly on the Executor submit line.

This usually works well on systems that have application-level scheduling and queuing (e.g., jsrun on Summit). However, on many cluster and multi-node systems, disabling the built-in resource manager means that runs submitted without a hostlist or machinefile may all be scheduled onto the same nodes, which is usually undesirable.
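The following sketch shows one way to prepare explicit run options when the resource manager is disabled. The helper build_submit_kwargs, the application name "sim", and the machinefile name are hypothetical; the keyword names app_name, num_procs, machinefile, and extra_args are assumed to match the Executor submit function and should be verified against the installed libEnsemble version.

```python
# Disable built-in resource detection, as described above.
libE_specs = {"disable_resource_manager": True}


def build_submit_kwargs(app_name, num_procs, machinefile=None, extra_args=None):
    """Collect explicit run options for a single Executor submit call."""
    kwargs = {"app_name": app_name, "num_procs": num_procs}
    if machinefile is not None:
        # Supplying a machinefile avoids every run landing on the same nodes.
        kwargs["machinefile"] = machinefile
    if extra_args is not None:
        kwargs["extra_args"] = extra_args
    return kwargs


# Hypothetical per-worker machinefile prepared by the user.
run_kwargs = build_submit_kwargs("sim", num_procs=8, machinefile="machinefile_worker1")

# Inside a simulation function, assuming `exctr` is the registered Executor:
# task = exctr.submit(**run_kwargs)
```

Giving each worker its own machinefile is one way to keep concurrent runs on distinct nodes when libEnsemble is no longer partitioning resources.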

System detection for resources can be overridden using the resource_info libE_specs option.