
Reply To: Multi-GPU usage issue

#6960
achodankar
Participant

Hello Adrian,
I did use the config/gpu_openmpi.mk example config.

I used this SLURM script:

#!/bin/bash
#SBATCH --job-name=run1
#SBATCH --output=run1.out
#SBATCH --mail-type=ALL
#SBATCH --partition=gpu
#SBATCH --nodes=1
#SBATCH --gpus-per-node=a100:8
#SBATCH --mem=50gb
#SBATCH --time=5-00:00:00
#SBATCH --get-user-env

CUDA_VISIBLE_DEVICES_SETTING=("0" "0" "0,1" "0,1,2" "0,1,2,3" "0,1,2,3,4" "0,1,2,3,4,5" "0,1,2,3,4,5,6" "0,1,2,3,4,5,6,7" "0,1,2,3,4,5,6,7,8" "0" )

srun --mpi=pmix_v3 bash -c 'export CUDA_VISIBLE_DEVICES=${OMPI_COMM_WORLD_LOCAL_RANK}; ./poiseuille3d'

----------------------------------------

The mpirun command does not work for me on the cluster. I also tried setting the number of cuboids to 8 and tried the CUDA_VISIBLE_DEVICES settings shown above. In addition, I replaced the srun line with

srun --mpi=pmix_v3 export env CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES_SETTING[$gpus-per-node]}; ./poiseuille3d

but none of these attempts worked for me.

I was advised to call cudaSetDevice with the rank. How should I implement that? Is it the right approach to fix this issue?
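
For reference, here is a minimal sketch of how I understand that suggestion (assuming a plain MPI + CUDA program; the names localRank and device are only illustrative, and I do not know whether OpenLB already selects the device internally):

// Minimal sketch: bind each MPI rank to one GPU before any CUDA work starts.
// Assumes OpenMPI, which exports OMPI_COMM_WORLD_LOCAL_RANK for each process.
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>
#include <mpi.h>

int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);

  int rank = 0;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  // Local rank on this node; fall back to 0 if the variable is not set.
  const char* env = std::getenv("OMPI_COMM_WORLD_LOCAL_RANK");
  int localRank = env ? std::atoi(env) : 0;

  // Pick one of the GPUs visible to this process.
  int deviceCount = 0;
  cudaGetDeviceCount(&deviceCount);
  int device = 0;
  if (deviceCount > 0) {
    device = localRank % deviceCount;
    cudaSetDevice(device);
  }
  std::printf("rank %d -> GPU %d (of %d visible)\n", rank, device, deviceCount);

  // ... the actual simulation (e.g. the poiseuille3d case) would run here ...

  MPI_Finalize();
  return 0;
}

If this is the right direction, I assume the device has to be selected before OpenLB allocates any GPU data; please correct me if that is wrong.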

I would really appreciate your help in resolving this issue.

Thank you.

Yours sincerely,

Abhijeet C.