Perlmutter

Perlmutter is an HPE Cray “Shasta” system located at NERSC. Its GPU compute nodes are each equipped with four NVIDIA A100 GPUs.

It uses the Slurm scheduler to submit jobs from login nodes to run on the compute nodes.

Configuring Python and Installation

Begin by loading the Python module:

module load python

Create a conda environment

You can create a conda environment in which to install libEnsemble and all dependencies. For example:

conda create -n libe-pm python=3.10 -y

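Because Perlmutter shares its HOME filesystem with other clusters, adding a -pm suffix (for Perlmutter) to the environment name is good practice.

Activate the conda environment as follows. Setting PYTHONNOUSERSITE=1 stops Python from picking up packages installed in your user site-packages directory: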

export PYTHONNOUSERSITE=1
conda activate libe-pm
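
One way to confirm that the user site-packages directory is being ignored is to query Python's standard site module. This is a minimal sketch, assuming python3 is on your PATH (as it should be once the python module is loaded); the variable name USER_SITE_ENABLED is only illustrative:

```shell
# Keep user-level packages out of the environment.
export PYTHONNOUSERSITE=1

# site.ENABLE_USER_SITE reports whether the user site directory is in use.
USER_SITE_ENABLED=$(python3 -c "import site; print(site.ENABLE_USER_SITE)")
echo "User site-packages enabled: ${USER_SITE_ENABLED}"
```

A value of False indicates that only packages from the active environment will be imported.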

Installing libEnsemble and dependencies

With the Python module loaded and the environment activated, libEnsemble can be installed in one of the following ways.

Install via pip into the environment:

pip install libensemble

Install via conda:

conda config --add channels conda-forge
conda install -c conda-forge libensemble

See advanced installation for other installation options.

Job Submission

Perlmutter uses Slurm for job submission and management. The two most common commands for initiating jobs are salloc and sbatch for running in interactive and batch modes, respectively. libEnsemble runs on the compute nodes on Perlmutter using either multi-processing (recommended) or mpi4py.

While libEnsemble should detect Perlmutter settings, you can ensure this by setting one of the following environment variables in the submission script or interactive session for either the CPU or GPU partitions of Perlmutter:

export LIBE_PLATFORM="perlmutter_c"  # For CPU partition
export LIBE_PLATFORM="perlmutter_g"  # For GPU partition
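
To guard against typos in the platform name, the choice can be wrapped in a small shell function. The function set_libe_platform below is a hypothetical helper, not part of libEnsemble; it only maps the words cpu and gpu to the two values shown above:

```shell
# Hypothetical helper (not part of libEnsemble):
# map a partition name to the matching LIBE_PLATFORM value.
set_libe_platform() {
    case "$1" in
        cpu) export LIBE_PLATFORM="perlmutter_c" ;;
        gpu) export LIBE_PLATFORM="perlmutter_g" ;;
        *)   echo "usage: set_libe_platform cpu|gpu" >&2
             return 1 ;;
    esac
}

# Example: select the GPU partition.
set_libe_platform gpu
echo "LIBE_PLATFORM=${LIBE_PLATFORM}"
```

Any other argument leaves LIBE_PLATFORM unchanged and returns a non-zero status.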

Example

This example runs the forces_gpu tutorial on Perlmutter.

To obtain the example, clone the libEnsemble repository (only the forces subdirectory is needed):

git clone https://github.com/Libensemble/libensemble
cd libensemble/libensemble/tests/scaling_tests/forces/forces_app

To compile forces:

module load PrgEnv-nvidia cudatoolkit craype-accel-nvidia80
cc -DGPU -O3 -fopenmp -mp=gpu -target-accel=nvidia80 -o forces.x forces.c

Now go to the forces_gpu directory:

cd ../forces_gpu

Now request an interactive session on one GPU node:

salloc -N 1 -t 20 -C gpu -q interactive -A <project_id>

Then in the session run:

export LIBE_PLATFORM="perlmutter_g"
python run_libe_forces.py -n 5

With five workers, this places the generator on the first worker and runs simulations on the other four (each simulation using one GPU).

To see GPU usage, ssh into the node you are on in another window and run:

watch -n 0.1 nvidia-smi

Video demonstration

There is a video demonstration of the forces example on Perlmutter.

Note

The video uses libEnsemble version 0.9.3, which required some adjustments to the scripts to run on Perlmutter. These adjustments are no longer necessary: libEnsemble now correctly detects the MPI runner and GPU settings on Perlmutter. The GPU code also runs with many more particles than the CPU version (forces_simple).

Example submission scripts are also given in the examples.
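
For batch mode, the interactive steps from this example could be collected into an sbatch script along the following lines. This is a sketch, not one of the official example scripts: the file name submit_forces.sh, the job name, and the regular QOS are assumptions, <project_id> must be replaced with your own project, and the script assumes the libe-pm environment created earlier and submission from the forces_gpu directory.

```shell
# Write a batch script (file name is illustrative) with a quoted heredoc,
# so nothing inside it is expanded by the current shell.
cat > submit_forces.sh <<'EOF'
#!/bin/bash
#SBATCH -J libE_forces
#SBATCH -N 1
#SBATCH -t 20
#SBATCH -C gpu
#SBATCH -q regular
#SBATCH -A <project_id>

# Same environment setup as in the interactive session.
module load python
export PYTHONNOUSERSITE=1
conda activate libe-pm

# Tell libEnsemble it is on the Perlmutter GPU partition.
export LIBE_PLATFORM="perlmutter_g"

# Five workers: one for the generator, four for simulations.
python run_libe_forces.py -n 5
EOF

echo "Wrote submit_forces.sh"
```

The script would then be submitted with sbatch submit_forces.sh.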

Running libEnsemble with mpi4py

Running libEnsemble with local comms is usually sufficient on Perlmutter. However, if you need to use mpi4py, you should install and run as follows:

module load PrgEnv-gnu cudatoolkit
MPICC="cc -target-accel=nvidia80 -shared" pip install --force-reinstall --no-cache-dir --no-binary=mpi4py mpi4py

This command builds mpi4py from source against the CUDA-aware Cray MPICH.

To run with five workers and one manager (six MPI ranks in total):

export SLURM_EXACT=1
srun -n 6 python my_script.py
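
Because the manager takes one MPI rank of its own, the rank count passed to srun is the number of workers plus one. A minimal sketch of that arithmetic follows; the variable names NWORKERS and NRANKS are illustrative, and the command is only printed here rather than launched:

```shell
# Number of libEnsemble workers wanted.
NWORKERS=5

# One additional MPI rank is needed for the manager.
NRANKS=$((NWORKERS + 1))

# Print the resulting launch line (drop the echo to actually run it).
echo "srun -n ${NRANKS} python my_script.py"
```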

More information on using Python and mpi4py on Perlmutter can be found in the Python on Perlmutter documentation.

Perlmutter FAQ

Some FAQs specific to Perlmutter. See more on the FAQ page.

Additional Information

See the NERSC Perlmutter docs for more information about Perlmutter.