Previously the substep summary reports were cumulative, misleading the user.
Also, made output a little more compact and readable, ensuring numbers line up
unless unusually many digits are needed for times and iteration counts.
That version does not provide a default constructor for
CollectiveCommunication, Therefore we now use
MPIHelper::getCollectiveCommunication() for the default
constructor argument.