This commit adds a new public member function
SatfuncConsistencyChecks<>::collectFailures(root, comm)
which aggregates consistency check violations from all ranks in the
MPI communication object 'comm' onto rank 'root' of 'comm'. This
amounts to summing the total number of violations from all ranks and
potentially resampling the failure points for reporting purposes.
To this end, extract the body of function processViolation() into a
general helper which performs reservoir sampling and records point
IDs and which uses a callback function to populate the check values
associated with a single failed check. Re-implement the original
function in terms of this helper by wrapping exportCheckValues() in
a lambda function. Extract similar helpers for numPoints() and
anyFailedChecks(), and add a new helper function
SatfuncConsistencyChecks<>::incorporateRankViolations()
which brings sampled points from an MPI rank into the 'root'
rank's internal data structures.
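To illustrate the shape of that refactoring, here is a minimal,
self-contained toy; ViolationSample, recordViolation() and
numCheckValues are made-up names for the example, and the real helper
additionally performs the reservoir sampling mentioned above.

    #include <cstddef>
    #include <vector>

    struct ViolationSample
    {
        std::vector<std::size_t> pointID;
        std::vector<double> checkValues;
    };

    // Generic helper: record the failing point's ID and defer filling
    // in the associated check values to a caller-supplied callback.
    template <typename PopulateCheckValues>
    void recordViolation(ViolationSample& sample,
                         const std::size_t pointID,
                         const std::size_t numCheckValues,
                         PopulateCheckValues&& populateCheckValues)
    {
        sample.pointID.push_back(pointID);

        const auto start = sample.checkValues.size();
        sample.checkValues.resize(start + numCheckValues);

        // E.g., a lambda wrapping an exportCheckValues()-style call.
        populateCheckValues(sample.checkValues.data() + start);
    }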
One caveat applies here. Our current approach to collecting check
failures implies that calling member function reportFailures() is
safe only on the 'root' process in a parallel run. On the other
hand, the functions anyFailedChecks() and anyFailedCriticalChecks() are
safe, and guaranteed to return the same answer, on all MPI ranks.
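As a hedged sketch of what that caveat means for callers, assuming a
communicator with a rank() member and leaving the reportFailures()
arguments as placeholders:

    checks.collectFailures(root, comm);

    if (checks.anyFailedChecks() && (comm.rank() == root)) {
        // Reporting is safe on the 'root' process only.
        checks.reportFailures(/* severity level, output sink */);
    }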
On a final note, the internal helper functions are at present mostly
implemented in terms of non-owning pointers. I intend to switch to
using 'std::span<>' once we enable C++20 mode.
The outputs will be generated when needed by the custom_command.
Since these outputs are not intended to be edited, there is no reason
to hipify them up front. In particular, this removes the long
sequential process at configure time, allowing all hipification to
run in parallel (when using multiple build jobs, i.e. ninja -jx or
make -jx).
The intention is that this will ultimately replace the existing
RelpermDiagnostics component which does not really work in parallel
and which does not report enough context to help diagnose underlying
issues. For now, though, we just add the shell of a new set of
checks and hook that up to the build.
Class SatfuncConsistencyChecks<Scalar> manages a configurable set of
consistency checks, the implementations of which must publicly
derive from SatfuncConsistencyChecks<Scalar>::Check. Client code
will configure a set of checks by first calling
SatfuncConsistencyChecks<Scalar>::resetCheckSet()
then register individual checks by calling
SatfuncConsistencyChecks<Scalar>::addCheck()
and finally build requisite internal structures by calling
SatfuncConsistencyChecks<Scalar>::finaliseCheckSet()
Client code will then run the checks by calling
SatfuncConsistencyChecks<Scalar>::checkEndpoints()
typically in a loop. Class SatfuncConsistencyChecks<Scalar> will
count consistency check failures and attribute these to each
individual check as needed. We also maintain separate counts for
"Standard" and "Critical" failures. The former will typically
generate warnings while the latter will typically cause the
simulation run to stop. Each individual check decides whether it is
"Critical", and client code decides how to respond to "Critical"
failures.
Member function SatfuncConsistencyChecks<Scalar>::reportFailures()
will generate a textual report of the known set of consistency check
failures at a given severity level.
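Taken together, a hedged sketch of the intended calling sequence,
with 'double' standing in for Scalar; all argument lists are
placeholders, since the actual signatures are defined by the class
and its checks:

    SatfuncConsistencyChecks<double> checks{/* constructor arguments */};

    checks.resetCheckSet();
    checks.addCheck(/* object derived from SatfuncConsistencyChecks<double>::Check */);
    checks.finaliseCheckSet();

    for (/* each end-point record of interest */) {
        checks.checkEndpoints(/* point ID, end-point values */);
    }

    if (checks.anyFailedChecks()) {
        checks.reportFailures(/* severity level */);
    }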
As an internal implementation detail, SatfuncConsistencyChecks uses
"reservoir sampling"
(https://en.wikipedia.org/wiki/Reservoir_sampling) to track details
about individual failed checks. We retain at most a fixed number
of individual failure points (a constructor argument).
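For reference, a minimal stand-alone sketch of the reservoir sampling
idea ("Algorithm R"); it is not the class's actual implementation,
and the function and parameter names are made up for the example:

    #include <cstddef>
    #include <random>
    #include <vector>

    // Keep at most 'maxPoints' IDs from a stream of failing point IDs,
    // each retained with equal probability.
    std::vector<std::size_t>
    sampleFailurePoints(const std::vector<std::size_t>& failingPoints,
                        const std::size_t maxPoints,
                        std::mt19937& rng)
    {
        std::vector<std::size_t> reservoir;
        reservoir.reserve(maxPoints);

        std::size_t seen = 0;
        for (const auto& point : failingPoints) {
            ++seen;
            if (reservoir.size() < maxPoints) {
                reservoir.push_back(point);   // fill the reservoir first
            }
            else {
                // Replace an existing sample with probability maxPoints/seen.
                std::uniform_int_distribution<std::size_t> pick(0, seen - 1);
                if (const auto slot = pick(rng); slot < maxPoints) {
                    reservoir[slot] = point;
                }
            }
        }

        return reservoir;
    }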
This commit adds CMake functionality that can
hipify the cuistl framework to support AMD GPUs.
Some tests have been written as HIP does not mirror
CUDA exactly.
CONVERT_CUDA_TO_HIP is the new CMake argument.
The minimum CMake version is increased to 3.21,
which is required to enable HIP as a language.
A macro is added to create a layer of indirection
so that only cuistl files that have changed are
re-hipified.
Some BDA code is extracted to make sure CUDA
is not accidentally included.
and add a dedicated header.
This way we don't need to pull in FlowMain.hpp for the prototypes,
which avoids dragging the entire simulator machinery into the build
of some simple utility functions.
The initial use case is calculating the phase-filled pore-volume
weighted average of the fluid mass densities per PVT region. This
value goes into calculating depth-corrected per-cell phase pressure
values such as the BPPO and BPPG summary vectors.
This class manages a single linear array which separately tracks the
averages' numerators and denominators as running sums per region and
region set. We pick this data structure to simplify the cross-rank
reduction needed in MPI parallel runs. Client code is expected to
add individual per-cell and per-phase contributions using the
addCell() member function and then call the accumulateParallel()
member to effect the cross-rank reduction. The averages will then
be available through the fieldValue() and value() member functions.
With a further view towards the initial use case, we track two
different types of average per phase: one for the phase-filled
volume and one for the pore-volume filled volume. The latter is the
average we would get for the case of the phase saturation being one
throughout the region. This alternative value is the fallback
option for the case of the phase saturation being identically zero
throughout the region.
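To make the storage idea concrete, here is a small self-contained
toy, not the class itself; the names are invented, only a single
region set is modelled, and the zero-denominator fallback is
simplified to returning zero rather than the pore-volume weighted
alternative described above.

    #include <cstddef>
    #include <vector>

    // One flat array of running numerator/denominator sums per region,
    // so a single element-wise sum across ranks completes the reduction.
    struct RunningAverages
    {
        explicit RunningAverages(const std::size_t numRegions)
            : sums_(2 * numRegions, 0.0)
        {}

        // 'weight' would be the phase-filled pore volume and 'value'
        // the fluid mass density in the use case described above.
        void addCell(const std::size_t region, const double weight, const double value)
        {
            sums_[2*region + 0] += weight * value;   // numerator
            sums_[2*region + 1] += weight;           // denominator
        }

        double value(const std::size_t region) const
        {
            const auto num = sums_[2*region + 0];
            const auto den = sums_[2*region + 1];
            return (den > 0.0) ? num / den : 0.0;
        }

        // In an MPI run, summing 'sums_' element-wise across all ranks
        // (e.g., comm.sum(sums_.data(), sums_.size())) finishes the job.
        std::vector<double> sums_;
    };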
preconditioner. Uses graph coloring to exploit
parallelism in the upper and lower triangular solves when
computing a diagonal approximate inverse of a
sparse matrix. Supports block sizes up to 3.
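To illustrate the coloring step only, a generic greedy scheme, not
necessarily the exact algorithm used here: rows that do not depend on
each other receive the same color and can then be processed
concurrently within each triangular solve.

    #include <cstddef>
    #include <vector>

    // Greedy coloring of the row-dependency graph: each row gets the
    // smallest color not already used by any of its neighbours.
    std::vector<int>
    greedyColoring(const std::vector<std::vector<std::size_t>>& rowNeighbours)
    {
        const auto n = rowNeighbours.size();
        std::vector<int> color(n, -1);

        for (std::size_t row = 0; row < n; ++row) {
            std::vector<bool> used(n, false);
            for (const auto nb : rowNeighbours[row]) {
                if (color[nb] >= 0) { used[color[nb]] = true; }
            }

            int c = 0;
            while (used[c]) { ++c; }
            color[row] = c;
        }

        return color;
    }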
This makes them available for use in other places. The function
std::string to_string(const ConvergenceReport::WellFailure& wf) is new,
but uses the format already established.
Invokes the Zoltan library and requires MPI. Client code constructs an
abstract connectivity graph by defining connections/edges through
the 'registerConnection()' member function. May also impose a
restriction that certain cells/vertices be placed in the same
domain/block in the resulting partition. Client code must supply a
callback function that defines globally unique cell/vertex/object
IDs, across all MPI ranks, for each vertex in the connectivity
graph.
Member function 'partitionElement()' forms the resulting partition
vector, the size of which is the total number of objects visible to
the local rank (typically the number of cells owned by the rank plus
the number of overlap cells, i.e., the size of the local grid view).
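A hedged sketch of the intended client-side flow; the construction of
the partitioner, the callback plumbing, and names such as
'connections' and 'localToGlobal' are assumptions made for the
illustration.

    // Define connectivity: one registered connection per graph edge.
    for (const auto& [cell1, cell2] : connections) {
        partitioner.registerConnection(cell1, cell2);
    }

    // Callback providing globally unique IDs across all MPI ranks.
    const auto globalCellID = [&localToGlobal](const int localCell)
    {
        return localToGlobal[localCell];
    };

    // Partition vector with one entry per locally visible object
    // (owned cells plus overlap cells).
    const auto partition =
        partitioner.partitionElement(/* number of blocks, globalCellID, ... */);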
Implement calls to cuBLAS and cuSPARSE, and implement the necessary
CUDA kernels, to perform a single iteration of the Jacobi preconditioner.
Add tests that verify the new kernels and the preconditioner in its entirety.
The preconditioner is verified on 2x2 and 3x3 blocks, which as of now
are the only supported sizes. 1x1 blocks are not supported because
cuSPARSE does not support them.
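For reference, a small CPU-side sketch of what a single application
of the block-Jacobi preconditioner computes, x = D^-1 r block by
block, here for 3x3 blocks with pre-inverted diagonal blocks; the
actual implementation performs this on the GPU through cuBLAS/cuSPARSE
calls and custom CUDA kernels.

    #include <array>
    #include <cstddef>
    #include <vector>

    using Block3 = std::array<std::array<double, 3>, 3>;

    // Apply x = D^-1 r, where 'invDiagBlocks' holds the already
    // inverted 3x3 diagonal blocks of the matrix.
    void applyBlockJacobi(const std::vector<Block3>& invDiagBlocks,
                          const std::vector<double>& r,
                          std::vector<double>& x)
    {
        const auto nBlocks = invDiagBlocks.size();
        x.assign(3 * nBlocks, 0.0);

        for (std::size_t b = 0; b < nBlocks; ++b) {
            for (int i = 0; i < 3; ++i) {
                double sum = 0.0;
                for (int j = 0; j < 3; ++j) {
                    sum += invDiagBlocks[b][i][j] * r[3*b + j];
                }
                x[3*b + i] = sum;
            }
        }
    }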
Added access to DUNE mesh geometry and passed the data through to Damaris;
Updated the command line so users can specify Python or ParaView script names and other parameters that control Damaris:
- Simulation name
- Number of dedicated cores or dedicated nodes
- Shared memory region size
- Switch to turn off HDF5 output.
- Damaris logging level
Step one of moving the Damaris calls out of the EclWriter class and into their own DamarisWriter class;
EclProblem now calls both writeOutput methods and passes in the data::Solution object;
Added a fix for the first writeOutput() call not having PRESSURE data available;
data::Solution is now passed by rvalue reference into eclWriter::writeOutput();
Guard added to prevent inclusion of damariswriter.hh.