Issues to run code examples with Nvidia A100 GPU

  • #6798
    jflorezgi
    Participant

    Hi everyone,

    I have run several of the example codes and my own applications on different Nvidia cards without problems, using the config.mk settings suggested in the config files. I now have access to two Nvidia A100 cards (Ampere architecture) and want to start running my applications on them.
    According to rules.mk there are three versions for this architecture (80, 86 and 87), and I am not sure which one is right for my card. I tried all three, following the steps indicated in gpu_only.mk (make clean, make, etc.). Version 80 is the only one that appears to start compiling, but in all three cases an error occurs and the build does not finish. The errors are shown below:

    Versions 86 and 87:
    make -C ../../../external
    make[1]: Entering directory ‘/home/jflorez/proyecto/olb-1.5r0/external’
    make -C zlib
    make[2]: Entering directory ‘/home/jflorez/proyecto/olb-1.5r0/external/zlib’
    make[2]: Nothing to be done for ‘all’.
    make[2]: Leaving directory ‘/home/jflorez/proyecto/olb-1.5r0/external/zlib’
    cp zlib/build/libz.a lib/
    make -C tinyxml
    make[2]: Entering directory ‘/home/jflorez/proyecto/olb-1.5r0/external/tinyxml’
    make[2]: Nothing to be done for ‘all’.
    make[2]: Leaving directory ‘/home/jflorez/proyecto/olb-1.5r0/external/tinyxml’
    cp tinyxml/build/libtinyxml.a lib/
    make[1]: Leaving directory ‘/home/jflorez/proyecto/olb-1.5r0/external’
    nvcc -O3 -std=c++17 --generate-code=arch=compute_86,code=[compute_86,sm_86] --extended-lambda --expt-relaxed-constexpr -x cu -Xcudafe "--diag_suppress=implicit_return_from_non_void_function --display_error_number --diag_suppress=20014 --diag_suppress=20011" -DPLATFORM_CPU_SISD -DPLATFORM_GPU_CUDA -I../../../src -I../../../external/zlib -I../../../external/tinyxml -c -o cavity3d.o cavity3d.cpp
    nvcc fatal : Unsupported gpu architecture 'compute_86'
    make: *** [../../../default.mk:31: cavity3d.o] Error 1.

    Version 80:
    make -C ../../../external
    make[1]: Entering directory ‘/home/jflorez/proyecto/olb-1.5r0/external’
    make -C zlib
    make[2]: Entering directory ‘/home/jflorez/proyecto/olb-1.5r0/external/zlib’
    make[2]: Nothing to be done for ‘all’.
    make[2]: Leaving directory ‘/home/jflorez/proyecto/olb-1.5r0/external/zlib’
    cp zlib/build/libz.a lib/
    make -C tinyxml
    make[2]: Entering directory ‘/home/jflorez/proyecto/olb-1.5r0/external/tinyxml’
    make[2]: Nothing to be done for ‘all’.
    make[2]: Leaving directory ‘/home/jflorez/proyecto/olb-1.5r0/external/tinyxml’
    cp tinyxml/build/libtinyxml.a lib/
    make[1]: Leaving directory ‘/home/jflorez/proyecto/olb-1.5r0/external’
    nvcc -O3 -std=c++17 --generate-code=arch=compute_80,code=[compute_80,sm_80] --extended-lambda --expt-relaxed-constexpr -x cu -Xcudafe "--diag_suppress=implicit_return_from_non_void_function --display_error_number --diag_suppress=20014 --diag_suppress=20011" -DPLATFORM_CPU_SISD -DPLATFORM_GPU_CUDA -I../../../src -I../../../external/zlib -I../../../external/tinyxml -c -o cavity3d.o cavity3d.cpp
    Command-line error #614: invalid error number in diagnostic control option: 20014

    1 catastrophic error detected in this compilation.
    Compilation terminated.
    make: *** [../../../default.mk:31: cavity3d.o] Error 1.

    The final goal is to run the programs on both cards using the gpu_openmpi.mk file. I appreciate any help you can give me.

    #6799
    jflorezgi
    Participant

    I have looked into this a bit more and found that I need to update some packages, so for now I cannot be sure that the compilation problems are related to the graphics card. Sorry about that. I will write again once the update is done if the problem persists.

    #6915
    Adrian
    Keymaster

    In case this is still an issue: for our A100s we use CUDA_ARCH := 80. The value 86 is used for, e.g., RTX 30* series GPUs and is only available starting with CUDA 11.1, which indicates that the errors in your log were caused by an older CUDA version. For using CUDA together with MPI, I have had the best experience with the Nvidia HPC SDK, which bundles CUDA and a matching MPI library.
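
    The version check described above can be sketched as a small script. This is only an illustration, not part of OpenLB: the function names and the lookup table are hypothetical, the 86 entry follows the reply above, the 80 entry assumes A100 support arrived with CUDA 11.0, and the sample string merely imitates the "release X.Y" line that `nvcc --version` prints.

```python
import re

# Minimum CUDA toolkit release able to target each CUDA_ARCH value.
# Assumption: 80 (A100) needs CUDA 11.0; 86 needs CUDA 11.1 (per the reply above).
MIN_CUDA_FOR_ARCH = {80: (11, 0), 86: (11, 1)}


def parse_nvcc_release(version_text):
    """Extract (major, minor) from text in the style of `nvcc --version`."""
    match = re.search(r"release\s+(\d+)\.(\d+)", version_text)
    if match is None:
        raise ValueError("no 'release X.Y' found in nvcc output")
    return int(match.group(1)), int(match.group(2))


def supports_arch(version_text, cuda_arch):
    """Return True if this toolkit release can compile for cuda_arch."""
    return parse_nvcc_release(version_text) >= MIN_CUDA_FOR_ARCH[cuda_arch]


# Illustrative input only; paste the real output of `nvcc --version` instead.
sample = "Cuda compilation tools, release 11.0, V11.0.194"
print(supports_arch(sample, 80))  # True
print(supports_arch(sample, 86))  # False
```

    If the check fails for the CUDA_ARCH set in config.mk, updating the CUDA toolkit is the likely fix rather than changing the architecture value.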
