Reply To: Multi-GPU usage issue
I am starting to suspect that the case was not actually compiled with MPI support. At the start of the output OpenLB prints the number of MPI processes; is this number correct in your SLURM log?
Replying in more detail to your previous question:
How exactly did you launch the application and how did you assign each process a single GPU?
The steps are the following:
1. Copy the example config config/gpu_openmpi.mk to config.mk, e.g. via cp config/gpu_openmpi.mk config.mk
2. Edit config.mk to set the correct CUDA_ARCH for your target GPU (see the config.mk sketch after this list)
3. Ensure that a CUDA-aware MPI module and CUDA 11.4 or later (for nvcc) are loaded in your build environment
4. Edit config.mk to use the CXXFLAGS and LDFLAGS provided by mpic++, following the hint in the config file: "CXXFLAGS and LDFLAGS may need to be adjusted depending on the specific MPI installation. Compare to mpicxx --showme:compile and mpicxx --showme:link when in doubt."
5. Compile the example using make
6. Update the SLURM script to launch one process per GPU and assign each process a GPU via the CUDA_VISIBLE_DEVICES environment variable. This is what mpirun bash -c 'export CUDA_VISIBLE_DEVICES=${OMPI_COMM_WORLD_LOCAL_RANK}; ./program' does (a sketch of such a batch script follows below).
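For illustration, the GPU-relevant lines of config.mk could end up looking roughly like the sketch below. The variable names follow the example configs shipped with recent OpenLB releases (check them against your copy of config/gpu_openmpi.mk), and the include/library paths and the CUDA_ARCH value are placeholders that you need to replace with what mpicxx --showme:compile / --showme:link and your GPU actually give you:

    CXX             := nvcc
    CC              := nvcc
    # append what mpicxx --showme:compile reports for your MPI installation
    CXXFLAGS        := -O3 -std=c++17 -I/path/to/openmpi/include
    # append what mpicxx --showme:link reports for your MPI installation
    LDFLAGS         := -L/path/to/openmpi/lib -lmpi
    PARALLEL_MODE   := MPI
    PLATFORMS       := CPU_SISD GPU_CUDA
    CUDA_ARCH       := 80   # e.g. 70 for V100, 80 for A100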
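A minimal SLURM batch script along the following lines should then give each rank its own GPU. Node count, tasks per node, GPU count and module names are placeholders for your cluster, and ./program stands for the compiled example binary:

    #!/bin/bash
    #SBATCH --nodes=1
    #SBATCH --ntasks-per-node=4   # one MPI rank per GPU
    #SBATCH --gres=gpu:4
    #SBATCH --time=00:30:00

    # module names are cluster-specific placeholders
    module load cuda openmpi

    # each rank only sees the GPU matching its node-local rank
    mpirun -np 4 bash -c 'export CUDA_VISIBLE_DEVICES=${OMPI_COMM_WORLD_LOCAL_RANK}; ./program'

If MPI support was compiled in correctly, the number of MPI processes reported at the start of the output should then match the number of launched ranks (4 in this sketch).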
For further investigation of where the problem lies, it would help if you could share your exact config.mk, SLURM script and job output, in addition to more information about your system setup.
Other approaches to assigning GPUs are possible depending on the exact environment.