
Reply To: Multi GPUs Calculation


I noticed that when I tried
$ mpirun -np 1 bash -c 'export CUDA_VISIBLE_DEVICES=${OMPI_COMM_WORLD_LOCAL_RANK}; ./cavity3d'
it ran successfully. The output is:
100, 100, 1, 1, 1, 1515.15

However, I still cannot use 2 GPUs. Running with two processes fails with the following error message:
$ mpirun -np 2 bash -c 'export CUDA_VISIBLE_DEVICES=${OMPI_COMM_WORLD_LOCAL_RANK}; ./cavity3d'
The call to cuIpcOpenMemHandle failed. This is an unrecoverable error
and will cause the program to abort.
Hostname: YujiShimojima
cuIpcOpenMemHandle return value: 217
address: 0x1310200000
Check the cuda.h file for what the return value means. A possible cause
for this is not enough free device memory. Try to reduce the device
memory footprint of your application.
[YujiShimojima:01335] Failed to register remote memory, rc=-1
[YujiShimojima:01334] Failed to register remote memory, rc=-1
corrupted size vs. prev_size while consolidating
[YujiShimojima:01334] *** Process received signal ***
[YujiShimojima:01334] Signal: Aborted (6)
[YujiShimojima:01334] Signal code: (-6)
[YujiShimojima:01330] 1 more process has sent help message help-mpi-common-cuda.txt / cuIpcOpenMemHandle failed
[YujiShimojima:01330] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
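
For reference, the per-rank GPU selection used in both commands above can also be written as a small wrapper script, which makes it easier to see which device each rank was given. This is only a sketch of the same binding: the function name select_gpu and the file name bind_gpu.sh are made up for illustration, and it is not expected to fix the cuIpcOpenMemHandle error by itself.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (illustrative names): bind each MPI process to one GPU.
# OMPI_COMM_WORLD_LOCAL_RANK is set by Open MPI's mpirun for every process
# it launches; it falls back to 0 here when the script runs outside mpirun.
select_gpu() {
  local rank="${OMPI_COMM_WORLD_LOCAL_RANK:-0}"
  # Expose only the GPU whose index equals the local rank.
  export CUDA_VISIBLE_DEVICES="$rank"
  echo "local rank ${rank} -> CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES}"
}

select_gpu

# If a program was passed on the command line, replace this shell with it.
if [ "$#" -gt 0 ]; then
  exec "$@"
fi
```

Saved as bind_gpu.sh and made executable, it could be launched as: mpirun -np 2 ./bind_gpu.sh ./cavity3d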