Reply To: Multi-GPU MPI library is not CUDA-aware

#10768
alex.ws
Participant

Hi Adrian,

No, it fails with a segmentation fault. The full output is below:

ubuntu@hpc:~/OpenLB_GPU/examples/turbulence/aorta3d$ mpirun -np 2 bash -c 'export CUDA_VISIBLE_DEVICES=${OMPI_COMM_WORLD_LOCAL_RANK}; ./aorta3d'
[MpiManager] Sucessfully initialized, numThreads=2
[ThreadPool] Sucessfully initialized, numThreads=1
[GPU_CUDA] The used MPI Library is not CUDA-aware. Multi-GPU execution will fail.
[UnitConverter] ----------------- UnitConverter information -----------------
[UnitConverter] -- Parameters:
[UnitConverter] Resolution:                       N=              40
[UnitConverter] Lattice velocity:                 latticeU=       0.0225
[UnitConverter] Lattice relaxation frequency:     omega=          1.99697
[UnitConverter] Lattice relaxation time:          tau=            0.50076
[UnitConverter] Characteristical length(m):       charL=          0.02246
[UnitConverter] Characteristical speed(m/s):      charU=          0.45
[UnitConverter] Phys. kinematic viscosity(m^2/s): charNu=         2.8436e-06
[UnitConverter] Phys. density(kg/m^d):            charRho=        1055
[UnitConverter] Characteristical pressure(N/m^2): charPressure=   0
[UnitConverter] Mach number:                      machNumber=     0.0389711
[UnitConverter] Reynolds number:                  reynoldsNumber= 3554.29
[UnitConverter] Knudsen number:                   knudsenNumber=  1.09645e-05
[UnitConverter] Characteristical CFL number:      charCFLnumber=  0.0225
[UnitConverter] 
[UnitConverter] -- Conversion factors:
[UnitConverter] Voxel length(m):                  physDeltaX=     0.0005615
[UnitConverter] Time step(s):                     physDeltaT=     2.8075e-05
[UnitConverter] Velocity factor(m/s):             physVelocity=   20
[UnitConverter] Density factor(kg/m^3):           physDensity=    1055
[UnitConverter] Mass factor(kg):                  physMass=       1.86768e-07
[UnitConverter] Viscosity factor(m^2/s):          physViscosity=  0.01123
[UnitConverter] Force factor(N):                  physForce=      0.133049
[UnitConverter] Pressure factor(N/m^2):           physPressure=   422000
[UnitConverter] -------------------------------------------------------------
[UnitConverter] WARNING:
[UnitConverter] Potentially UNSTABLE combination of relaxation time (tau=0.50076)
[UnitConverter] and characteristical CFL number (lattice velocity) charCFLnumber=0.0225!
[UnitConverter] Potentially maximum characteristical CFL number (maxCharCFLnumber=0.00607729)
[UnitConverter] Actual characteristical CFL number (charCFLnumber=0.0225) > 0.00607729
[UnitConverter] Please reduce the the cell size or the time step size!
[UnitConverter] We recommend to use the cell size of 0.000151659 m and the time step size of 7.58293e-06 s.
[UnitConverter] -------------------------------------------------------------
[STLreader] Voxelizing ...
[STLmesh] nTriangles=2654; maxDist2=0.000610779
[STLmesh] minPhysR(StlMesh)=(0.199901,0.0900099,0.0117236); maxPhysR(StlMesh)=(0.243584,0.249987,0.0398131)
[Octree] radius=0.143744; center=(0.221602,0.169858,0.025628)
[STLreader] voxelSize=0.0005615; stlSize=0.001
[STLreader] minPhysR(VoxelMesh)=(0.199984,0.0904055,0.0118712); maxPhysR(VoxelMesh)=(0.24322,0.249873,0.0393848)
[STLreader] Voxelizing ... OK
[prepareGeometry] Prepare Geometry ...
[SuperGeometry3D] cleaned 0 outer boundary voxel(s)
[SuperGeometry3D] cleaned 0 outer boundary voxel(s)
[SuperGeometry3D] cleaned 0 inner boundary voxel(s) of Type 3
[SuperGeometryStatistics3D] updated
[SuperGeometry3D] the model is correct!
[CuboidDecomposition] ---Cuboid Structure Statistics---
[CuboidDecomposition]  Number of Cuboids: 	16
[CuboidDecomposition]  Delta       : 		0.0005615
[CuboidDecomposition]  Ratio  (min): 		0.529412
[CuboidDecomposition]         (max): 		1.77778
[CuboidDecomposition]  Nodes  (min): 		16704
[CuboidDecomposition]         (max): 		35640
[CuboidDecomposition]  Weight (min): 		10726
[CuboidDecomposition]         (max): 		20749
[CuboidDecomposition] --------------------------------
[SuperGeometryStatistics3D] materialNumber=0; count=160731; minPhysR=(0.199984,0.089844,0.0113097); maxPhysR=(0.243781,0.250433,0.0399462)
[SuperGeometryStatistics3D] materialNumber=1; count=171226; minPhysR=(0.200546,0.0904055,0.0118712); maxPhysR=(0.24322,0.249872,0.0393847)
[SuperGeometryStatistics3D] materialNumber=2; count=41080; minPhysR=(0.199984,0.089844,0.0113097); maxPhysR=(0.243781,0.250433,0.0399462)
[SuperGeometryStatistics3D] materialNumber=3; count=1059; minPhysR=(0.208407,0.250433,0.0124327); maxPhysR=(0.228059,0.250433,0.0332082)
[SuperGeometryStatistics3D] materialNumber=4; count=245; minPhysR=(0.200546,0.089844,0.0298392); maxPhysR=(0.210653,0.089844,0.0388232)
[SuperGeometryStatistics3D] materialNumber=5; count=239; minPhysR=(0.234236,0.089844,0.0287162); maxPhysR=(0.24322,0.089844,0.0388232)
[SuperGeometryStatistics3D] countTotal[1e6]=0.37458
[prepareGeometry] Prepare Geometry ... OK
[prepareLattice] Prepare Lattice ...
[prepareLattice] Prepare Lattice ... OK
[Timer] 
[Timer] ----------------Summary:Timer----------------
[Timer] measured time (rt) : 0.295s
[Timer] measured time (cpu): 0.295s
[Timer] ---------------------------------------------
[main] starting simulation...
[hpc:08008] *** Process received signal ***
[hpc:08008] Signal: Segmentation fault (11)
[hpc:08008] Signal code: Invalid permissions (2)
[hpc:08008] Failing at address: 0x318d27e00
[hpc:08009] *** Process received signal ***
[hpc:08009] Signal: Segmentation fault (11)
[hpc:08009] Signal code: Invalid permissions (2)
[hpc:08009] Failing at address: 0x318d49200
[hpc:08009] [ 0] [hpc:08008] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x45330)[0x71e29ba45330]
[hpc:08008] [ 1] /lib/x86_64-linux-gnu/libc.so.6(+0x45330)[0x74362a445330]
[hpc:08009] [ 1] /lib/x86_64-linux-gnu/libc.so.6(+0x1a440d)[0x71e29bba440d]
[hpc:08008] [ 2] /lib/x86_64-linux-gnu/libc.so.6(+0x1a440d)[0x74362a5a440d]
[hpc:08009] [ 2] /opt/openmpi-5.0.8/lib/libopen-pal.so.80(+0xcf985)[0x71e2a23b9985]
[hpc:08008] [ 3] /opt/openmpi-5.0.8/lib/libopen-pal.so.80(+0xcf985)[0x74362afb9985]
[hpc:08009] [ 3] /opt/openmpi-5.0.8/lib/libmpi.so.40(mca_pml_ob1_send_request_schedule_once+0x24a)[0x74363105601a]
[hpc:08009] [ 4] /opt/openmpi-5.0.8/lib/libmpi.so.40(mca_pml_ob1_send_request_schedule_once+0x24a)[0x71e2a265601a]
[hpc:08008] [ 4] /opt/openmpi-5.0.8/lib/libmpi.so.40(mca_pml_ob1_recv_frag_callback_ack+0x151)[0x71e2a264d681]
[hpc:08008] [ 5] /opt/openmpi-5.0.8/lib/libopen-pal.so.80(mca_btl_sm_poll_handle_frag+0x9b)[0x71e2a23bacab]
[hpc:08008] [ 6] /opt/openmpi-5.0.8/lib/libopen-pal.so.80(+0xd118b)[0x71e2a23bb18b]
[hpc:08008] [ 7] /opt/openmpi-5.0.8/lib/libopen-pal.so.80(opal_progress+0x34)[0x71e2a230ec84]
[hpc:08008] [ 8] /opt/openmpi-5.0.8/lib/libmpi.so.40(mca_pml_ob1_recv_frag_callback_ack+0x151)[0x74363104d681]
[hpc:08009] [ 5] /opt/openmpi-5.0.8/lib/libopen-pal.so.80(mca_btl_sm_poll_handle_frag+0x9b)[0x74362afbacab]
[hpc:08009] [ 6] /opt/openmpi-5.0.8/lib/libopen-pal.so.80(+0xd118b)[0x74362afbb18b]
[hpc:08009] [ 7] /opt/openmpi-5.0.8/lib/libopen-pal.so.80(opal_progress+0x34)[0x74362af0ec84]
[hpc:08009] [ 8] /opt/openmpi-5.0.8/lib/libmpi.so.40(ompi_request_default_test+0x51)[0x71e2a2490ae1]
[hpc:08008] [ 9] /opt/openmpi-5.0.8/lib/libmpi.so.40(PMPI_Test+0x4a)[0x71e2a24d72aa]
[hpc:08008] [10] /opt/openmpi-5.0.8/lib/libmpi.so.40(ompi_request_default_test+0x51)[0x743630e90ae1]
[hpc:08009] [ 9] ./aorta3d(+0x1338ca)[0x5abd5e10d8ca]
[hpc:08008] [11] ./aorta3d(+0xc31c2)[0x5abd5e09d1c2]
[hpc:08008] [12] ./aorta3d(+0x1adb5e)[0x5abd5e187b5e]
[hpc:08008] [13] ./aorta3d(+0x33866)[0x5abd5e00d866]
[hpc:08008] [14] /lib/x86_64-linux-gnu/libc.so.6(+0x2a1ca)[0x71e29ba2a1ca]
[hpc:08008] [15] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x8b)[0x71e29ba2a28b]
[hpc:08008] [16] ./aorta3d(+0x35905)[0x5abd5e00f905]
[hpc:08008] *** End of error message ***
/opt/openmpi-5.0.8/lib/libmpi.so.40(PMPI_Test+0x4a)[0x743630ed72aa]
[hpc:08009] [10] ./aorta3d(+0x1338ca)[0x625743dc38ca]
[hpc:08009] [11] ./aorta3d(+0xc31c2)[0x625743d531c2]
[hpc:08009] [12] ./aorta3d(+0x1adb5e)[0x625743e3db5e]
[hpc:08009] [13] ./aorta3d(+0x33866)[0x625743cc3866]
[hpc:08009] [14] /lib/x86_64-linux-gnu/libc.so.6(+0x2a1ca)[0x74362a42a1ca]
[hpc:08009] [15] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x8b)[0x74362a42a28b]
[hpc:08009] [16] ./aorta3d(+0x35905)[0x625743cc5905]
[hpc:08009] *** End of error message ***
--------------------------------------------------------------------------
prterun noticed that process rank 1 with PID 8009 on node hpc exited on
signal 11 (Segmentation fault).
--------------------------------------------------------------------------
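
As a side check on the [GPU_CUDA] warning in the log above, here is a minimal sketch for asking the Open MPI installation itself whether it was built with CUDA support. It assumes the ompi_info binary belonging to the MPI in use (here the one under /opt/openmpi-5.0.8) is the first one found on PATH. The function name check_cuda_aware is made up for illustration and is not part of OpenLB or Open MPI.

```shell
#!/usr/bin/env bash
# Sketch: report whether the Open MPI found on PATH was built with CUDA support.
# Prints "yes", "no", or "unknown" (when ompi_info is not available).
check_cuda_aware() {
  if ! command -v ompi_info >/dev/null 2>&1; then
    echo "unknown"
    return 0
  fi
  # Open MPI exposes this as a parsable parameter line ending in
  # "mpi_built_with_cuda_support:value:true" or "...:value:false".
  if ompi_info --parsable --all 2>/dev/null \
      | grep -q 'mpi_built_with_cuda_support:value:true'; then
    echo "yes"
  else
    echo "no"
  fi
}

check_cuda_aware
```

If this prints no, the crash inside mca_pml_ob1_send_request_schedule_once would be consistent with MPI being handed a GPU buffer it cannot read from the host, although the backtrace alone does not prove that.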