Reply To: Multi GPUs Calculation
gpu::cuda::device::synchronize is not an MPI function (or a wrapper around one). It only makes the CPU wait until the work queued on the default CUDA execution stream has finished.
You may still be correct that this call is no longer needed in the current version of the code, but that is a separate question from any MPI concerns. In any case, execution is synchronized implicitly on every processing context switch.
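
To make the distinction concrete, here is a minimal plain-CUDA sketch of host/device synchronization. It is only an illustration under assumptions: the kernel name fill and the buffer are hypothetical, and cudaDeviceSynchronize stands in for the wrapper discussed above (I have not checked here which runtime call the wrapper actually uses). Note that cudaDeviceSynchronize waits for all streams of the current device, which includes the default stream.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: writes `value` into every element.
__global__ void fill(int* data, int n, int value) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) {
    data[i] = value;
  }
}

int main() {
  const int n = 1024;
  int* data = nullptr;

  // Managed memory is accessible from both host and device.
  if (cudaMallocManaged(&data, n * sizeof(int)) != cudaSuccess) {
    std::fprintf(stderr, "allocation failed\n");
    return 1;
  }

  // The launch goes to the default stream and returns to the host
  // immediately; the kernel may still be running after this line.
  fill<<<(n + 255) / 256, 256>>>(data, n, 42);

  // Block the calling CPU thread until the device has finished.
  // This involves only this process and its GPU. No other MPI rank
  // is contacted, so this is not a replacement for e.g. MPI_Barrier.
  cudaError_t err = cudaDeviceSynchronize();
  if (err != cudaSuccess) {
    std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
    cudaFree(data);
    return 1;
  }

  // Only now is it safe to read the result on the host.
  std::printf("data[0] = %d\n", data[0]);

  cudaFree(data);
  return 0;
}
```

In a multi-GPU run each MPI rank would perform this kind of synchronization for its own device independently. Coordination between ranks is a separate step that goes through MPI calls.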