The developer team is very happy to announce the release of OpenLB 1.5. The updated open-source Lattice Boltzmann (LB) code is now available for download.
Major new features include support for GPUs using CUDA, vectorized collision steps on SIMD CPUs, a new implementation of our resolved particle system, and the ability to simulate free surface flows and reactions.
Core changes and features:
- Support for GPUs using CUDA
- Support for SIMD collision steps (AVX2 / AVX-512)
- New Dynamics concept (including Momenta)
- New PostProcessor concept
- New resolved particle system implementation
- New automatic code generation of CSE-optimized functions
New physical models:
- Free surface flows
- Solver class for structuring simulations
- Forward Algorithmic Differentiation
- Finite difference methods (FDM) and LBM for advection diffusion (reaction) equations (ADRE)
- Free surface flows:
- New free energy examples:
- New ADRE examples based on FDM and LBM
Compatibility tested on:
- Various Linux distributions
- NixOS 21.11
- Ubuntu 20.04.4 LTS
- Red Hat Enterprise Linux 8.2
- Windows WSL 1 and 2
- Mac OS 11.6
- GCC 9, 10, 11
- Clang 13
- Intel C++ 19, 2021.4
- Nvidia CUDA 11.4
- Nvidia HPC SDK 21.3
- OpenMPI 3.1, 4.1
- Intel MPI 2021.3.0
Early benchmarks confirm good GPU utilization for the established lid-driven cavity benchmark with local velocity boundaries, including non-local edge and corner treatment. For example, a single-precision 1000^3 cavity is simulated at a cell throughput of 42.2 GLUPs on two HoreKa GPU nodes featuring four Nvidia A100 accelerators each, compared to 24.8 GLUPs on a single node. This yields a strong parallel efficiency of 85%.
The same benchmark on two CPU-only nodes utilizing AVX-512 and hybrid parallelization yields a performance of 2.7 GLUPs, leading to a speedup of 15.6 for the GPU code.
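The efficiency and speedup quoted above follow directly from the reported throughputs. As a minimal sketch of that arithmetic (the function names are illustrative and not part of OpenLB), efficiency is taken here as the two-node throughput divided by twice the single-node throughput, and speedup as the ratio of GPU to CPU throughput at equal node count:

```cpp
// Illustrative helpers only; these names are assumptions, not OpenLB API.

// Parallel efficiency relative to a single node:
// measured throughput on `nodes` nodes over ideal linear scaling.
constexpr double parallelEfficiency(double throughputOnNodes,
                                    double throughputOnOneNode,
                                    int nodes) {
  return throughputOnNodes / (nodes * throughputOnOneNode);
}

// Speedup of one configuration over another for the same problem.
constexpr double speedup(double fasterThroughput, double slowerThroughput) {
  return fasterThroughput / slowerThroughput;
}

// Throughputs in GLUPs as reported for the 1000^3 cavity benchmark.
constexpr double gpuTwoNodes = 42.2;
constexpr double gpuOneNode  = 24.8;
constexpr double cpuTwoNodes = 2.7;

// 42.2 / (2 * 24.8) is roughly 0.85, i.e. the quoted 85%.
static_assert(parallelEfficiency(gpuTwoNodes, gpuOneNode, 2) > 0.85,
              "efficiency should round to 85%");
static_assert(parallelEfficiency(gpuTwoNodes, gpuOneNode, 2) < 0.86,
              "efficiency should round to 85%");

// 42.2 / 2.7 is roughly 15.6, i.e. the quoted GPU speedup.
static_assert(speedup(gpuTwoNodes, cpuTwoNodes) > 15.6,
              "speedup should round to 15.6");
static_assert(speedup(gpuTwoNodes, cpuTwoNodes) < 15.7,
              "speedup should round to 15.6");
```

The checks are evaluated at compile time, so the snippet compiles only if the quoted figures are consistent with the reported throughputs.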
Other cases, such as a turbulent nozzle flow with non-local interpolated boundaries and Smagorinsky BGK LES, also perform well. For example, a nozzle flow resolved by 360 million cells and distributed across two GPU nodes reaches ~36 GLUPs.
As this is the first public release of GPU support in OpenLB, not all features are currently supported. However, due to extensive refactoring of our Dynamics and PostProcessor concepts, the vast majority of local collision steps and a core set of non-local boundaries work transparently on GPUs. This includes support for large-scale simulations on multi-GPU clusters.
Existing legacy post processors are straightforward to adapt to the new approach. Its core idea is to implement both local and non-local cell operations as abstract templates that accept the concept of a cell instead of a specific implementation thereof. For further details, see the user guide, specifically the sections on Dynamics and Post Processors.
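To illustrate the idea of writing a cell operation against the concept of a cell rather than a concrete type, here is a minimal sketch. It is not OpenLB code: `ArrayCell`, `StridedCellView` and `RelaxTowardsMean` are hypothetical names, and the operation is a toy relaxation rather than a physical collision step. It only shows how one templated operation can run unchanged on two different cell implementations, which is the property that allows the same operator to be instantiated for different platforms.

```cpp
#include <array>
#include <cstddef>

// Hypothetical cell that owns its Q population values.
template <typename T, std::size_t Q>
struct ArrayCell {
  std::array<T, Q> f{};
  T& operator[](std::size_t i) { return f[i]; }
  const T& operator[](std::size_t i) const { return f[i]; }
  static constexpr std::size_t size() { return Q; }
};

// Hypothetical cell that is only a view into externally stored,
// strided population data (e.g. one slot of a larger field).
template <typename T>
struct StridedCellView {
  T* base;
  std::size_t stride;
  std::size_t q;
  T& operator[](std::size_t i) { return base[i * stride]; }
  std::size_t size() const { return q; }
};

// Toy cell operation written against the cell concept only:
// it requires nothing beyond operator[] and size().
struct RelaxTowardsMean {
  template <typename CELL, typename T>
  void apply(CELL& cell, T omega) const {
    T mean{};
    for (std::size_t i = 0; i < cell.size(); ++i) {
      mean += cell[i];
    }
    mean /= static_cast<T>(cell.size());
    // Relax every population towards the mean by the factor omega.
    for (std::size_t i = 0; i < cell.size(); ++i) {
      cell[i] += omega * (mean - cell[i]);
    }
  }
};
```

Calling `RelaxTowardsMean{}.apply(cell, omega)` works for either cell type without changing the operator, because the template is resolved against whatever type is passed in.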
Examples that work on GPUs without changing a single line of code include:
On CPUs, all existing post processors, dynamics and all other features continue to be supported.
We are hard at work expanding the list of GPU-aware features and examples, specifically full boundary condition coverage as well as particle and multi-lattice coupling. Feel free to contact us if you are interested in joining our efforts to further develop this or any other aspect of the open-source framework OpenLB.
A. Kummerländer, S. Avis, H. Kusumaatmaja, F. Bukreev, D. Dapelo, S. Großmann, N. Hafen, C. Holeksa, A. Husfeldt, J. Jeßberger, L. Kronberg, J. Marquardt, J. Mödl, J. Nguyen, T. Pertzel, S. Simonis, L. Springmann, N. Suntoyo, D. Teutscher, M. Zhong and M.J. Krause.
OpenLB Release 1.5: Open Source Lattice Boltzmann Code. Version 1.5. Apr. 2022.
doi: 10.5281/zenodo.6469606. url: https://doi.org/10.5281/zenodo.6469606
PS: Please consider joining the developer team by contributing your code. Together we can strengthen the LB community by sharing our research in an open and reproducible way! Feel free to contact us.