Commit Graph

10 Commits

Author SHA1 Message Date
Kjetil Olsen Lye
c4f686227b Merge pull request #5451 from multitalentloes/generalize_thread_block_tuner
Generalize thread block tuner
2024-08-22 12:56:14 +02:00
jakobtorben
5d54c50ba0 Add method for defining if preconditioners should be recreated 2024-08-20 17:57:38 +02:00
Tobias Meyer Andersen
ae4e6a65fc make autotuner use lambda that only depends on blocksize 2024-08-20 15:06:59 +02:00
Tobias Meyer Andersen
14ea44246a add autotuner 2024-08-20 13:35:33 +02:00
Tobias Meyer Andersen
7a30aaa46e Add an OPM implementation of ILU0
improve file structure in cuistl
run clang-format
2024-08-09 15:52:42 +02:00
Tobias Meyer Andersen
3cb8298e3a Pick blocksize automatically for CUDA cards.
Calibrate the best size for AMD cards.
This will be improved in a following PR
2024-06-28 14:36:00 +02:00
Tobias Meyer Andersen
605e32c54b use camelCase, remove commented code 2024-06-26 15:34:47 +02:00
Tobias Meyer Andersen
d6f8678617 use unique_ptr consistently for delayed instantiation 2024-06-26 15:31:52 +02:00
Tobias Meyer Andersen
9b2f41ad96 Add option to split the matrix into diagonal,
strictly lower and strictly upper part.
Add tests checking that the result matches
the CPU dilu implementation.
2024-06-05 13:35:54 +02:00
Tobias Meyer Andersen
4b0dd54f15 Add CUDA implementation of the DILU
preconditioner. Uses graph coloring to exploit
parallelism in upper and lower triangular solves when
computing a diagonal approximate inverse of a
sparse matrix. Supports blocksizes up to 3.
2024-01-25 14:26:38 +01:00