preconditioner. Uses graph coloring to exploit
parallelism in upper and triangular solves when
computing a diagonal approximate inverse of a
sparse matrix. Supports blocksizes up to 3.
The function 'setUseFixedOrder' is called if it is defined and the matrices in the AMG hierarchy are constructed with deterministic indices.
If it is not defined yet, it is not called and the matrices in the AMG hierarchy are constructed with non-deterministic indices.
Implement graphcoloring to expose rows in level sets that that can be
executed in parallel during the sparse triangular solves.
Add copy of A matrix that is reordered to ensure continuous memory reads
when traversing the matrix in level set order.
TODO: add number of threads available as constructor argument in DILU
Implement calls to cuBlas, cuSparse and implement necessary
CUDA kernels to perform a single iteration of the jacobi preconditioner.
Add tests that verify new kernels and the preconditioner in its totality.
The preconditioner is verified on 2x2 and 3x3 blocks, which as of now
are the only supported sizes. 1x1 are not supported because cuSparse
does not support it.
i chose to split in a separate _impl file because this code is so
generic that there may be downstream users who want to use on other
matrix types than what we use in opm-simulators.