Mosemb

Forum Replies Created

Viewing 3 posts - 1 through 3 (of 3 total)
  • in reply to: GPU Examples #6596
    Mosemb
    Participant

    Hey Adrian, thanks. I was able to change the config.mk file as shown below:
    CXX := nvcc
    CC := nvcc
    CXXFLAGS := -O3
    CXXFLAGS += -std=c++17
    PARALLEL_MODE := MPI
    MPIFLAGS := -lmpi_cxx -lmpi
    PLATFORMS := CPU_SISD GPU_CUDA
    # for e.g. RTX 30* (Ampere), see table in rules.mk for other options
    CUDA_ARCH := 80
    USE_EMBEDDED_DEPENDENCIES := ON

    The idea is to run the application with CUDA in parallel. I run it with
    mpirun -np 2 bash -c 'export CUDA_VISIBLE_DEVICES=${OMPI_COMM_WORLD_LOCAL_RANK}; ./bstep2d'
    but the output does not seem to be parallelized. I am using 2 nodes at this point, and the output is below:

    [Timer]
    [Timer] ----------------Summary:Timer----------------
    [Timer] measured time (rt) : 649.823s
    [Timer] measured time (cpu): 634.863s
    [Timer] average MLUPs : 181.855
    [Timer] average MLUPps: 181.855
    [Timer] ---------------------------------------------
    [SuperPlaneIntegralFluxVelocity2D] regionSize[m]=0.00468; flowRate[m^2/s]=0.00485398; meanVelocity[m/s]=1.03718
    [SuperPlaneIntegralFluxPressure2D] regionSize[m]=0.00468; force[N]=0.0182643; meanPressure[Pa]=3.90263
    [Timer] step=576920; percent=99.9995; passedTime=653.267; remTime=0.00339701; MLUPs=186.57
    [LatticeStatistics] step=576920; t=1.99999; uMax=0.0301884; avEnergy=9.06721e-05; avRho=1.00147
    [Timer]
    [Timer] ----------------Summary:Timer----------------
    [Timer] measured time (rt) : 653.461s
    [Timer] measured time (cpu): 639.569s
    [Timer] average MLUPs : 180.842
    [Timer] average MLUPps: 180.842
    [Timer] ---------------------------------------------

    I expected a single value each for MLUPs and MLUPps. Instead I get two sets of values, one for each node. How can I fix this?
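
For reference, the per-rank GPU binding done inline in the mpirun command above can be sketched as a small shell helper. This is only a sketch under assumptions: the function names `pick_gpu` and `run_on_gpu` and the `GPUS_PER_NODE` variable are hypothetical names made up for illustration, and the default of 4 GPUs per node is an assumption. `OMPI_COMM_WORLD_LOCAL_RANK` is the variable the original command already relies on.

```shell
# pick_gpu: print the GPU index a process should use.
#   $1 = node-local rank, $2 = number of GPUs on the node
# (hypothetical helper, not part of any library)
pick_gpu() {
  local local_rank="$1" gpus_per_node="$2"
  # Wrap around so that more ranks than GPUs still get a valid index
  echo $(( local_rank % gpus_per_node ))
}

# run_on_gpu: run the given command with CUDA_VISIBLE_DEVICES set
# from the node-local rank exported by Open MPI.
run_on_gpu() {
  # Default to rank 0 so the helper also works outside mpirun
  local local_rank="${OMPI_COMM_WORLD_LOCAL_RANK:-0}"
  # Assumption: 4 GPUs per node unless GPUS_PER_NODE says otherwise
  local gpus_per_node="${GPUS_PER_NODE:-4}"
  CUDA_VISIBLE_DEVICES="$(pick_gpu "$local_rank" "$gpus_per_node")" "$@"
}
```

If these functions were saved in a file (say a hypothetical bind_gpu.sh), each rank could source it and call `run_on_gpu ./bstep2d`. Running `run_on_gpu printenv CUDA_VISIBLE_DEVICES` under mpirun is a quick way to see which device index each rank receives. This only controls device visibility; it does not by itself show whether the two processes share one simulation.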

    in reply to: GPU Examples #6592
    Mosemb
    Participant

    I used the same configuration as stated in the file, but I still get the same error:

    terminate called after throwing an instance of 'std::runtime_error'
    what(): invalid device symbol
    Aborted (core dumped)
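
One thing that may be worth checking with an error like this is whether the `CUDA_ARCH` value in config.mk matches the compute capability of the installed GPU. Whether that is the cause here is not confirmed. The sketch below assumes that the installed nvidia-smi supports the `compute_cap` query field (older drivers may not), and `cap_to_arch` is a hypothetical helper name used only for illustration.

```shell
# cap_to_arch: turn a compute capability such as "8.0" into the
# two-digit form used for CUDA_ARCH ("80"). Hypothetical helper.
cap_to_arch() {
  printf '%s\n' "$1" | tr -d '.'
}

# If nvidia-smi is available, print a CUDA_ARCH candidate per GPU.
# Assumption: the driver's nvidia-smi accepts the compute_cap field.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=name,compute_cap --format=csv,noheader |
    while IFS=, read -r name cap; do
      # The CSV value carries a leading space, strip it before converting
      echo "${name}: CUDA_ARCH candidate $(cap_to_arch "${cap# }")"
    done
fi
```

The printed value can then be compared against the `CUDA_ARCH` line in config.mk and the table in rules.mk that the config comment refers to.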

    in reply to: GPU Examples #6591
    Mosemb
    Participant

    Thanks Adrian. Right now I am using 4 NVIDIA A100-SXM4-40GB GPUs. I will eventually run everything on a cluster, so I am testing on one node first before moving to the full cluster.
