There is no need to restrict PRT file flow reports to those wells
which are active on the current rank since the SummaryState object
holds information for all wells active at the current report step.
This commit splits the production, injection, and cumulative
*Report_() functions into three logically distinct parts, one for
outputting the report header (begin*Report_()), one for outputting a
single report record (output*ReportRecord_()), and one for ending
the report (end*Report_()). This simplifies the logic of the
*Record_() member functions since they no longer need to infer the
context which is already available in the caller and can perform a
more narrow task than before.
With this separation we're also able to remove the dashed lines
which would previously separate each report record, thereby creating
PRT file report sheets which have a more expected layout.
Moreover, as an aid to future maintenance, we also factor out common
code for the well and group cases in each *Record_() function.
Finally, fix a unit conversion problem in the report values for
cumulative gas quantities. The sheet header states that we should
be outputting values in 'MM' prefixed units, but we were only
scaling the gas values by a factor of 1000 instead of 1000*1000. In
other words, the injection and production gas values in the
cumulative sheet were off by a factor of 1000.
The StreamLog::addMessageUnconditionally() member function will end
each message with a newline (std::endl) so we should not add such
newlines ourselves. The extra newline characters produce spurious
blank lines in the report sheets, e.g., for the "PRODUCTION REPORT".
This commit removes the last newline character from each report
request, thus deferring that responsibility to OpmLog::note()
instead. Doing so, however, means we have take a little more care
with the first line of each report lest we create report sheets
which are smushed together.
Implement graphcoloring to expose rows in level sets that that can be
executed in parallel during the sparse triangular solves.
Add copy of A matrix that is reordered to ensure continuous memory reads
when traversing the matrix in level set order.
TODO: add number of threads available as constructor argument in DILU
Implement calls to cuBlas, cuSparse and implement necessary
CUDA kernels to perform a single iteration of the jacobi preconditioner.
Add tests that verify new kernels and the preconditioner in its totality.
The preconditioner is verified on 2x2 and 3x3 blocks, which as of now
are the only supported sizes. 1x1 are not supported because cuSparse
does not support it.
This commit adds a parallel calculation object derived from the serial
PAvgCalculator class. This parallel version is aware of MPI
communicators and knows how to aggregate contributions from wells that
might be distributed across ranks.
We also add a wrapper class, ParallelWBPCalculation, which knows how to
exchange information from PAvgCalculatorCollection objects on different
ranks and, especially, how to properly prune inactive cells/connections.