Cori

Cori is a Cray XC40 located at NERSC, featuring both Intel Haswell and Knights Landing compute nodes. It uses the SLURM scheduler to submit jobs from login nodes to run on the compute nodes.

Cori does not allow more than one MPI application per compute node.

Configuring Python and Installation

Begin by loading the Python 3 Anaconda module:

module load python

Create a conda environment

You can create a conda environment in which to install libEnsemble and all dependencies. If using mpi4py, it is recommended that you clone the lazy-mpi4py environment provided by NERSC:

conda create --name my_env --clone lazy-mpi4py

If you wish to build mpi4py, it will need to be done using the specific Python instructions from NERSC.
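A commonly documented pattern on NERSC's Cray systems builds mpi4py from source against the Cray compiler wrapper; the sketch below illustrates this, but the exact flags may change, so defer to NERSC's Python documentation:

# Build mpi4py from source against Cray MPICH via the compiler wrapper (illustrative)
MPICC="cc -shared" pip install --force-reinstall --no-cache-dir --no-binary=mpi4py mpi4py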

Installing libEnsemble

Having loaded the Anaconda Python module, libEnsemble can be installed in one of the following ways.

  1. Install via pip into the environment.

(my_env) user@cori07:~$ pip install libensemble

  2. Install via conda:

(my_env) user@cori07:~$ conda config --add channels conda-forge
(my_env) user@cori07:~$ conda install -c conda-forge libensemble
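
Either way, a quick check that libEnsemble imports from the active environment is:

(my_env) user@cori07:~$ python -c "import libensemble; print(libensemble.__version__)"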

It is preferable to create your conda environment under the /global/common file system, which performs best for imported Python packages. This can be done by modifying your ~/.condarc file. For example, add the lines:

envs_dirs:
  - /path/to/my/conda_envs
env_prompt: ({name})

The env_prompt line ensures that the full directory path is not prepended to your shell prompt (the ({name}) here is literal; do not substitute it).
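
With the ~/.condarc in place, the clone command shown earlier will create the environment under the directory you configured; this can be confirmed with conda env list. For example:

conda create --name my_env --clone lazy-mpi4py
conda env list   # my_env should appear under the envs_dirs path set above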

See the advanced installation options in the libEnsemble documentation for more information on installing libEnsemble.

Job Submission

Cori uses Slurm for job submission and management. The two commands you will likely use most to initiate jobs are salloc and sbatch, for interactive and batch runs respectively. libEnsemble runs on the compute nodes on Cori using either multiprocessing or mpi4py. We recommend reading the Python instructions from NERSC for specific guidance on using both multiprocessing (used by libEnsemble's local comms mode) and mpi4py.

Note

While it is possible to submit jobs from the user $HOME file system, this is likely to perform very poorly, especially for large ensembles. Users should preferably submit their calling script from the $SCRATCH directory (/global/cscratch1/sd/<YourUserName>), which is fastest but regularly purged, or from the project directory (/project/projectdirs/<project_name>/). You cannot run and create output under the /global/common/ file system, as it is read-only from the compute nodes, but imported code (including libEnsemble and gen/sim functions) is best placed there, especially when running at scale.
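
For example, a typical workflow creates a run directory under $SCRATCH and submits from there (the directory name below is illustrative; submission scripts are covered under Batch Runs):

cd $SCRATCH/libe_runs    # illustrative run directory on the scratch file system
sbatch myscript.sh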

Interactive Runs

You can allocate four Knights Landing nodes for thirty minutes through the following:

salloc -N 4 -C knl -q interactive -t 00:30:00

Ensure that the Python 3 Anaconda module is loaded. If you have installed libEnsemble under the common file system, ensure PYTHONPATH includes the installation location.
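
For instance, if libEnsemble was pip-installed to a separate location under /global/common rather than into the conda environment, PYTHONPATH can be extended as below (the path shown is purely illustrative):

module load python
source activate my_env
export PYTHONPATH=/global/common/software/myproject/packages:$PYTHONPATH   # illustrative path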

With your nodes allocated, launch libEnsemble with four MPI ranks:

srun --ntasks 4 --nodes=1 python calling.py

This line launches libEnsemble with a manager and three workers on one allocated compute node, leaving three nodes available for the workers to launch user applications (via the Executor or a direct run command such as mpiexec).

This is an example of running in centralized mode; if using the Executor, it should be initiated with central_mode=True. libEnsemble must be run in central mode on Cori because jobs cannot share nodes.

Batch Runs

Batch scripts specify run settings using #SBATCH statements. A simple example for a libEnsemble use case running in centralized MPI mode on KNL nodes resembles the following (add PYTHONPATH lines if necessary):

#!/bin/bash
#SBATCH -J myjob
#SBATCH -N 5
#SBATCH -q debug
#SBATCH -A myproject
#SBATCH -o myjob.out
#SBATCH -e myjob.error
#SBATCH -t 00:15:00
#SBATCH -C knl

module load python/3.7-anaconda-2019.07
export I_MPI_FABRICS=shm:ofi  # Recommend OFI

# Run libEnsemble (manager and 4 workers) on one node
# leaving 4 nodes for worker launched applications.
srun --ntasks 5 --nodes=1 python calling_script.py

With this saved as myscript.sh, allocating, configuring, and running libEnsemble on Cori is achieved by running

sbatch myscript.sh
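
Job status can then be monitored with standard Slurm commands; NERSC also provides the sqs wrapper:

squeue -u $USER   # standard Slurm listing of your queued and running jobs
sqs               # NERSC convenience command with similar information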

If you wish to run in multiprocessing (local) mode instead of using mpi4py and if your calling script uses the parse_args() function, then the run line in the above script would be:

python calling_script.py --comms local --nworkers 4
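
For completeness, a local-mode version of the batch script above might look like the following sketch (resource settings mirror the earlier example; only the run line changes):

#!/bin/bash
#SBATCH -J myjob
#SBATCH -N 5
#SBATCH -q debug
#SBATCH -A myproject
#SBATCH -o myjob.out
#SBATCH -e myjob.error
#SBATCH -t 00:15:00
#SBATCH -C knl

module load python/3.7-anaconda-2019.07

# Run libEnsemble (manager and 4 workers) in local (multiprocessing) mode,
# with the allocated nodes available for worker-launched applications.
python calling_script.py --comms local --nworkers 4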

As a larger example, the following script would launch libEnsemble in MPI mode with one manager and 128 workers, where each worker will have two nodes for the user application. libEnsemble could be run across multiple nodes, but here the --overcommit option to srun is used to place all 129 ranks on one node.

#!/bin/bash
#SBATCH -J my_bigjob
#SBATCH -N 257
#SBATCH -q regular
#SBATCH -A myproject
#SBATCH -o myjob.out
#SBATCH -e myjob.error
#SBATCH -t 01:00:00
#SBATCH -C knl

module load python/3.7-anaconda-2019.07
export I_MPI_FABRICS=shm:ofi  # Recommend OFI

# Run libEnsemble (manager and 128 workers) on one node
# leaving 256 nodes for worker launched applications.
srun --overcommit --ntasks 129 --nodes=1 python calling_script.py

Example submission scripts are also given in the examples.

Cori FAQ

Error in `<PATH>/bin/python': break adjusted to free malloc space: 0x0000010000000000

This error has been encountered on Cori when running with an incorrect installation of mpi4py. See instructions above.

Additional Information

See the NERSC Cori documentation for more information about Cori.