Multiple GPU compiler error: cannot find -lmpi_cxx: No such file or directory
June 3, 2024 at 1:01 pm #8756
aseidler (Participant)
Hi,
I am trying to implement a multi-GPU simulation on the HPC in Dresden. I know that I somehow have to modify my flags, but I have no clue how to do this. I implemented the gpu_mixed_only config like this:

######################### GPU Only Calculation #######################
CXX := nvcc
CC := nvcc

CXXFLAGS := -O3
CXXFLAGS += -std=c++17
# --forward-unknown-to-host-compiler
CXXFLAGS += -Xcompiler -I/software/rome/r23.10/OpenMPI/4.1.4-GCC-11.3.0/include

# Single GPU
#PARALLEL_MODE := NONE

# Parallel GPU
PARALLEL_MODE := MPI

MPIFLAGS := -lmpi_cxx -lmpi

PLATFORMS := CPU_SISD GPU_CUDA

# for e.g. RTX 30* (Ampere), see table in rules.mk for other options
CUDA_ARCH := 80

FLOATING_POINT_TYPE := float

USE_EMBEDDED_DEPENDENCIES := ON
–alex
June 3, 2024 at 1:19 pm #8757
Adrian (Keymaster)
If you want to use nvcc directly for the entire compilation, you will quite likely need to manually prescribe all MPI flags. Those are usually set by the mpicxx wrapper; you can get them for your environment using mpicxx --showme:compile resp. --showme:link. This is what is suggested in the config/gpu_openmpi.mk config (which I assume you started with?)
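For illustration only, one untested way to pull those wrapper flags into config.mk automatically (this assumes config.mk is read by GNU Make, and it is not what the shipped config does verbatim) would be:

# query the Open MPI compiler wrapper at build time
CXXFLAGS += $(shell mpicxx --showme:compile)
MPIFLAGS := $(shell mpicxx --showme:link)

Alternatively, run the two --showme commands once by hand and paste their output into CXXFLAGS and MPIFLAGS.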
I would recommend using config/gpu_openmpi_mixed.mk as a starting point. You can compare config/gpu_horeka_nvidiahpc_mixed.mk to see how this looks for a real-world cluster.

I assume you have already selected the CUDA and MPI modules (with CUDA-awareness compiled in) for your cluster? If so, I should be able to guide you to a working config given the output of mpicxx --showme and (optionally) the path to the CUDA module.

You may also be lucky and be able to use config/gpu_openmpi_mixed.mk basically unmodified if the cluster modules are set up well. This recently happened to me on the Karolina cluster at IT4I (unfortunately this is also the only cluster I have encountered where all required modules were set up fully, removing the need for any manual flag twiddling).

June 3, 2024 at 1:28 pm #8758
aseidler (Participant)
Hi Adrian,
I guess I managed to solve the problem on my own.
I had some problems with “nvcc fatal : Unknown option ‘-Wl,-rpath’”, but just deleting that option and changing the MPIFLAGS to

MPIFLAGS := -L/software/rome/r23.10/OpenMPI/4.1.4-GCC-11.3.0/lib -L/software/rome/r23.10/hwloc/2.7.1-GCCcore-11.3.0/lib -L/software/rome/r23.10/libevent/2.1.12-GCCcore-11.3.0/lib -lmpi

solved the issue for now.
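As a rough, untested alternative to hard-coding the -L paths (again assuming GNU Make processes config.mk), the -Wl,-rpath options that nvcc rejects could also be stripped from the wrapper output automatically:

# drop the -Wl,... words that nvcc does not understand; other
# host-only flags in the wrapper output may need removing too
MPIFLAGS := $(filter-out -Wl%,$(shell mpicxx --showme:link))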
Thanks a lot
-Alex
June 25, 2024 at 3:43 pm #8855
thanhphatvt (Participant)
Hi aseidler,
I have the same problem as you. Could I ask how you got the MPIFLAGS for your simulation?
Thanks

June 25, 2024 at 5:27 pm #8856
aseidler (Participant)
Hi thanhphatvt,
the following helped for me:
I changed the “lib” target in the Makefile in olb.1.7.r0/externals to this:

all: lib zlib tinyxml

.PHONY: zlib tinyxml

lib:
	mkdir -p lib

zlib:
	make -C zlib
	cp zlib/build/libz.a lib/

clean_zlib:
	make -C zlib clean

tinyxml:
	make -C tinyxml
	cp tinyxml/build/libtinyxml.a lib/

clean_tinyxml:
	make -C tinyxml clean

clean: clean_zlib clean_tinyxml
	rm -f lib/libz.a lib/libtinyxml.a

and then this is my final config.mk for olb.1.7r0:
CXX := nvcc -ccbin=mpicxx
CC := nvcc -ccbin=mpicc

CXXFLAGS := -O3
CXXFLAGS += -std=c++17

PARALLEL_MODE := MPI
PLATFORMS := CPU_SISD GPU_CUDA
CUDA_ARCH := 80
FLOATING_POINT_TYPE := float
USE_EMBEDDED_DEPENDENCIES := ON
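As an untested sketch of the workflow with this config (the example path, process count and executable name below are placeholders, not anything specific to our cluster), building and running a multi-GPU case then looks roughly like:

make clean && make                    # rebuild the OpenLB library with the new config.mk
cd examples/laminar/cavity3d          # placeholder: any GPU-enabled example
make
# one MPI rank per GPU; Open MPI sets OMPI_COMM_WORLD_LOCAL_RANK for each process
mpirun -np 4 bash -c 'export CUDA_VISIBLE_DEVICES=$OMPI_COMM_WORLD_LOCAL_RANK; ./cavity3d'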
Furthermore, I had the problem on our HPC that OpenMPI does not work when just loading the module, so here are the modules I am using on the HPC in Dresden:
ml release/23.04 GCC/11.3.0 OpenMPI/4.1.4 CUDA/11.7 UCX-CUDA
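If the MPI module misbehaves again, one quick sanity check (standard Open MPI tooling, nothing OpenLB-specific) is whether the loaded Open MPI was actually built with CUDA support:

ompi_info --parsable --all | grep mpi_built_with_cuda_support:value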
I hope that this answer is helpful for you.
-Alex
June 26, 2024 at 6:28 am #8860
thanhphatvt (Participant)
Hi Alex,
Thank you so much!
Let me try it.