the stencil can now be updated only for the primary degrees of freedom
and output modules can specify that they do not need extensive
quantities (which allows to speed up the writing code if none require
them.)
Also, all loops over the grid are now threaded (or rather, are
supposed to be), so openMP should be better utilized during the
linearization stage.