Dear Mathias,

The MPI parallel implementation works very well and shows the desired speedup.

However, when I run OpenLB in the OMP parallel mode, I do not get the expected speedup (in most cases, a linear relationship between the number of threads and speed).

To run the LBM code with OMP, I simply modified Makefile.inc in the home directory:

PARALLEL_MODE := OMP

and set the environment variable for the thread number to 1, 2, 3, 4, 6, and 8 in turn. My computer definitely has 8 cores.

I monitored CPU performance while the code was running. There is a good speedup from 1 thread to 2 threads, and again to 3 threads, but no further improvement when the number of threads is increased beyond 3.

I ran the same test on the cluster (48 cores per node) with the thread number set to 48, and there seems to be no improvement over the speed with 4 threads.

Could you explain what is happening here, and what else I should do to improve the OMP parallel performance?

Looking forward to your reply!

Best regards,

Jepson