SuperLatticePhysHeatFlux3D is a functor, which means that it operates only on CPU-side data. Any difference in results between CPU and GPU would therefore have to come from the model side (whether a bug or FPT implementation differences).
Just to make sure: are you synchronizing the relevant data from GPU to CPU (setProcessingContext) before evaluating the functor?
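
To illustrate why the synchronization matters, here is a minimal toy sketch. It is not OpenLB code: ToyLattice, hostAverage and the ProcessingContext enum below are hypothetical stand-ins that only mimic the idea of a device copy and a CPU-side mirror. In an actual OpenLB application the corresponding call is, as far as I know, usually written as sLattice.setProcessingContext(ProcessingContext::Evaluation) before the functor is evaluated.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in, NOT the OpenLB enum of the same name.
enum class ProcessingContext { Evaluation, Simulation };

// Toy lattice whose field data lives in two places.
struct ToyLattice {
  std::vector<double> device; // stands in for GPU-resident data
  std::vector<double> host;   // CPU-side mirror, the only data functors read

  explicit ToyLattice(std::size_t n) : device(n, 0.0), host(n, 0.0) {}

  // Stands in for a simulation step that only touches device data.
  void step() {
    for (double& v : device) v += 1.0;
  }

  // Toy analogue of the synchronization:
  // Evaluation copies device -> host, Simulation copies host -> device.
  void setProcessingContext(ProcessingContext ctx) {
    if (ctx == ProcessingContext::Evaluation) {
      host = device;
    } else {
      device = host;
    }
  }
};

// Toy "functor": reads only CPU-side data.
double hostAverage(const ToyLattice& lattice) {
  double sum = 0.0;
  for (double v : lattice.host) sum += v;
  return sum / static_cast<double>(lattice.host.size());
}
```

In this sketch, calling hostAverage right after step() still sees the old host values. Only after setProcessingContext(ProcessingContext::Evaluation) does the functor see the result of the step. A missing synchronization in the real code would show up the same way: the functor output on GPU would lag behind or differ from the CPU run even though the model itself is fine.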