Issues to run code examples with Nvidia A100 GPU
This topic has 2 replies, 2 voices, and was last updated 11 months ago by Adrian.

September 13, 2022 at 4:21 pm — #6798 — jflorezgi (Participant)
Hi everyone,

I have run several example codes and my own applications on different Nvidia cards without problems, using the config.mk you suggest in the config file. Now I have access to two Nvidia A100 cards (Ampere architecture) and want to start running my applications on them.

According to the rules.mk file, there are three versions for this architecture, and I am not sure which one applies to my card. I have tried all three (following the steps indicated in gpu_only.mk: make clean, make, etc.). Version 80 is the only one that even seems to start compiling, but in all three cases an error appears and the compilation does not finish. Below are the errors in the three cases.

Versions 86 and 87:
make -C ../../../external
make[1]: Entering directory ‘/home/jflorez/proyecto/olb-1.5r0/external’
make -C zlib
make[2]: Entering directory ‘/home/jflorez/proyecto/olb-1.5r0/external/zlib’
make[2]: Nothing to be done for ‘all’.
make[2]: Leaving directory ‘/home/jflorez/proyecto/olb-1.5r0/external/zlib’
cp zlib/build/libz.a lib/
make -C tinyxml
make[2]: Entering directory ‘/home/jflorez/proyecto/olb-1.5r0/external/tinyxml’
make[2]: Nothing to be done for ‘all’.
make[2]: Leaving directory ‘/home/jflorez/proyecto/olb-1.5r0/external/tinyxml’
cp tinyxml/build/libtinyxml.a lib/
make[1]: Leaving directory ‘/home/jflorez/proyecto/olb-1.5r0/external’
nvcc -O3 -std=c++17 --generate-code=arch=compute_86,code=[compute_86,sm_86] --extended-lambda --expt-relaxed-constexpr -x cu -Xcudafe "--diag_suppress=implicit_return_from_non_void_function --display_error_number --diag_suppress=20014 --diag_suppress=20011" -DPLATFORM_CPU_SISD -DPLATFORM_GPU_CUDA -I../../../src -I../../../external/zlib -I../../../external/tinyxml -c -o cavity3d.o cavity3d.cpp
nvcc fatal : Unsupported gpu architecture 'compute_86'
make: *** [../../../default.mk:31: cavity3d.o] Error 1

Version 80:
make -C ../../../external
make[1]: Entering directory ‘/home/jflorez/proyecto/olb-1.5r0/external’
make -C zlib
make[2]: Entering directory ‘/home/jflorez/proyecto/olb-1.5r0/external/zlib’
make[2]: Nothing to be done for ‘all’.
make[2]: Leaving directory ‘/home/jflorez/proyecto/olb-1.5r0/external/zlib’
cp zlib/build/libz.a lib/
make -C tinyxml
make[2]: Entering directory ‘/home/jflorez/proyecto/olb-1.5r0/external/tinyxml’
make[2]: Nothing to be done for ‘all’.
make[2]: Leaving directory ‘/home/jflorez/proyecto/olb-1.5r0/external/tinyxml’
cp tinyxml/build/libtinyxml.a lib/
make[1]: Leaving directory ‘/home/jflorez/proyecto/olb-1.5r0/external’
nvcc -O3 -std=c++17 --generate-code=arch=compute_80,code=[compute_80,sm_80] --extended-lambda --expt-relaxed-constexpr -x cu -Xcudafe "--diag_suppress=implicit_return_from_non_void_function --display_error_number --diag_suppress=20014 --diag_suppress=20011" -DPLATFORM_CPU_SISD -DPLATFORM_GPU_CUDA -I../../../src -I../../../external/zlib -I../../../external/tinyxml -c -o cavity3d.o cavity3d.cpp
Command-line error #614: invalid error number in diagnostic control option: 20014
1 catastrophic error detected in this compilation.
Compilation terminated.
make: *** [../../../default.mk:31: cavity3d.o] Error 1

The idea is ultimately to run the programs on both cards using the gpu_openmpi.mk file. I appreciate any help you can give me.
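For what it may be worth, the "Unsupported gpu architecture" error usually means the installed CUDA toolkit predates support for that compute capability. The helper below is only an illustrative sketch (the function name is made up); the version numbers reflect, to the best of my knowledge, when each architecture was first supported by nvcc. On the machine itself, `nvcc --version` shows the installed toolkit release.

```shell
#!/bin/sh
# Hypothetical helper: minimum CUDA toolkit release that knows a given
# -arch value. sm_80 = A100, sm_86 = RTX 30* series, sm_87 = Jetson Orin.
min_cuda_for_arch() {
  case "$1" in
    80) echo "11.0" ;;   # compute_80 added in CUDA 11.0
    86) echo "11.1" ;;   # compute_86 added in CUDA 11.1
    87) echo "11.4" ;;   # compute_87 added in CUDA 11.4
    *)  echo "unknown" ;;
  esac
}

min_cuda_for_arch 86   # prints the minimum toolkit for CUDA_ARCH := 86
```

So if `nvcc --version` reports something older than the value printed for your chosen CUDA_ARCH, nvcc will reject the `compute_*` flag exactly as in the logs above.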
September 13, 2022 at 5:26 pm — #6799 — jflorezgi (Participant)

I have been checking a little more and found that I need to update some packages, so for now I cannot confirm that the compilation problems are related to the graphics card. Sorry; I will write again once I have updated, if the problem persists.
October 27, 2022 at 11:30 am — #6915 — Adrian (Keymaster)

In case this is still an issue: for our A100s we use
CUDA_ARCH := 80
86 is used for e.g. RTX 30* series GPUs and is available starting with CUDA 11.1, which indicates that the error source in your log was an older CUDA version. For CUDA-aware MPI usage I have had the best experiences with the Nvidia HPC SDK, which bundles CUDA and a matching MPI library.
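Putting that suggestion into a config fragment: below is a minimal sketch of the GPU-related lines for an A100 build. Only CUDA_ARCH := 80 comes directly from this thread; the other variable names follow the pattern visible in the olb-1.5r0 build logs above and may differ in other releases, so treat them as assumptions to be checked against your gpu_only.mk / gpu_openmpi.mk.

```make
# Sketch of GPU-related settings in config.mk for an Nvidia A100
# (variable names other than CUDA_ARCH are assumed from olb-1.5r0)
CXX       := nvcc
CXXFLAGS  := -O3 -std=c++17
PLATFORMS := CPU_SISD GPU_CUDA
# A100 is compute capability 8.0:
CUDA_ARCH := 80
```

After editing config.mk, a `make clean` followed by `make` in the example directory rebuilds with the new architecture flag, as in the steps described in the first post.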