The LoadBalancer::size method returns the rank-local number of cuboids, i.e. the number of blocks that each MPI process individually needs to process. The sum of the individual sizes is the total number of cuboids in the block decomposition.
In most situations it is sufficient to divide the problem into one block per process, which is what you observe here: with one cuboid per rank, each process reports a size of 1.
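The relationship between the rank-local sizes and the total can be illustrated with a small standalone sketch. This is not OpenLB code: `localSizes` and `globalSize` are hypothetical helpers, and the even distribution of cuboids over ranks is an assumption made for illustration only. The actual load balancer may assign blocks differently.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical stand-in for a load balancer (not the OpenLB class):
// returns the number of cuboids assigned to each rank when
// totalCuboids blocks are spread as evenly as possible over numRanks.
std::vector<int> localSizes(int totalCuboids, int numRanks) {
  assert(numRanks > 0 && totalCuboids >= 0);
  std::vector<int> sizes(numRanks, totalCuboids / numRanks);
  // The first (totalCuboids % numRanks) ranks take one extra cuboid.
  for (int rank = 0; rank < totalCuboids % numRanks; ++rank) {
    ++sizes[rank];
  }
  return sizes;
}

// Sum of the rank-local sizes. In an MPI program this would be a
// reduction (e.g. MPI_Allreduce with MPI_SUM) over each rank's size.
int globalSize(const std::vector<int>& sizes) {
  return std::accumulate(sizes.begin(), sizes.end(), 0);
}
```

Under these assumptions, `localSizes(4, 4)` gives one cuboid on every rank, matching the one-block-per-process case, while `localSizes(7, 4)` gives some ranks two cuboids. In both cases `globalSize` recovers the total number of cuboids in the decomposition.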