Kjetil Olsen Lye
9bc7155cf3
Merge pull request #5552 from multitalentloes/add_mixed_precision_ilu0_and_dilu_on_gpu
Add mixed precision ilu0 on gpu
2024-10-10 13:03:14 +02:00
Vegard Kippe
099dabd8a9
Actually update the criterion to use fixed ordering.
2024-10-04 00:09:04 +02:00
Tobias Meyer Andersen
55c20dbddd
Implement mixed precision GPU ILU0
2024-09-30 16:24:49 +02:00
Tobias Meyer Andersen
0bab02f809
update name of opm cuilu0 to match gpuistl
2024-09-13 09:15:19 +02:00
Kjetil Olsen Lye
f97389d1b5
Merge pull request #5554 from multitalentloes/refactor_cuistl
refactor cuistl to gpuistl
2024-08-26 09:55:13 +02:00
Tobias Meyer Andersen
d14fed904a
fix typo
2024-08-23 14:42:37 +02:00
Tobias Meyer Andersen
fba1858f42
refactor cuvector
2024-08-22 15:20:20 +02:00
Tobias Meyer Andersen
0c1ea3ee4d
refactor cuseqilu0
2024-08-22 15:07:53 +02:00
Tobias Meyer Andersen
158619083e
refactor cujac
2024-08-22 14:40:23 +02:00
Tobias Meyer Andersen
d17ee3315b
refactor CuDILU
2024-08-22 14:28:33 +02:00
Tobias Meyer Andersen
67bc9e8f34
refactor CuBlockPreconditioner
2024-08-22 13:58:35 +02:00
Tobias Meyer Andersen
3f4ae4ddf4
refactor cuistl namespace
2024-08-22 13:52:50 +02:00
Arne Morten Kvarving
1cc27754d8
FlexibleSolver: optionally instantiate for float
PreconditionerFactory: optionally instantiate for float
these need to go in the same commit due to circular dependencies
2024-08-21 09:34:28 +02:00
Tobias Meyer Andersen
7a30aaa46e
Add an OPM implementation of ILU0
improve file structure in cuistl
run clang-format
2024-08-09 15:52:42 +02:00
Tobias Meyer Andersen
3cb8298e3a
Pick blocksize automatically for CUDA cards.
Calibrate the best size for AMD cards.
This will be improved in a following PR
2024-06-28 14:36:00 +02:00
Kjetil Olsen Lye
9b414419e7
Merge pull request #5404 from multitalentloes/add_dilu_LU_splitting
Add cudilu lu splitting
2024-06-27 14:30:45 +02:00
andrthu
6c62753803
Ghost entries skipped for ilu apply and GL operator in AMG/CPR hierarchy.
This works since the ghost entries are the last entries
2024-06-07 14:40:53 +02:00
Tobias Meyer Andersen
9b2f41ad96
Add option to split the matrix into diagonal,
strictly lower and strictly upper part.
Add tests checking that the result matches
the CPU dilu implementation.
2024-06-05 13:35:54 +02:00
Arne Morten Kvarving
b9ee637d78
PreconditionerFactory: use Scalar type from operator
2024-05-24 14:03:28 +02:00
Arne Morten Kvarving
b7bc7b7bf5
Pressure(Bhp)TransferPolicy: template Scalar type
2024-05-24 14:03:28 +02:00
Tobias Meyer Andersen
e9d6b326cc
Add HIP support for AMD GPUs
This commit adds CMake functionality that can
hipify the cuistl framework to support AMD GPUs.
Some tests have been written as HIP does not mirror
CUDA exactly.
CONVERT_CUDA_TO_HIP is the new CMake argument.
The minimum CMake version is raised to 3.21,
which is required to use HIP as a language.
A macro is added to create a layer of indirection
so that only the cuistl files that have
changed are re-hipified.
Some BDA stuff is extracted to make sure CUDA
is not accidentally included.
2024-05-06 15:56:53 +02:00
Tobias Meyer Andersen
296f41ecc0
Make function that infers templates, avoid use of new
2024-04-16 15:39:17 +02:00
Tobias Meyer Andersen
9ab15e3ff9
bugfix: make famg reconstruct on update
2024-04-15 16:24:43 +02:00
Tobias Meyer Andersen
0079a17889
re-enable DuneILU for multiprocess
2024-04-15 16:06:37 +02:00
Tobias Meyer Andersen
e275c637f5
Proof Of Concept generic Preconditioner with update
2024-04-15 15:27:37 +02:00
Tobias Meyer Andersen
2a7251efc5
change pointers to const references
2024-04-12 15:39:35 +02:00
Tobias Meyer Andersen
8177400602
clang-format PreconditionerFactory_impl
2024-04-11 15:09:09 +02:00
Tobias Meyer Andersen
1685f928f7
add RebuildOnUpdate for single process preconditioners. Also refactor wrapPreconditioner to match type of wrapper
2024-04-11 15:07:17 +02:00
Tobias Meyer Andersen
6cfe647c81
clean up and fix multiprocess RebuildOnUpdate wrapper
2024-04-11 15:07:17 +02:00
Tobias Meyer Andersen
71d58afc0e
Add a valid wrapper through OwningBlockPreconditioner that rebuilds the preconditioner on updates. Still has an excessive matrix copy and is missing a wrapper function
2024-04-11 15:07:17 +02:00
Tobias Meyer Andersen
6b73856fd9
update comments
2024-04-11 15:07:17 +02:00
Tobias Meyer Andersen
df401e52b8
add jac smoother
2024-04-11 15:07:17 +02:00
Tobias Meyer Andersen
6c0ee61d6f
add ILUn smoother
2024-04-11 15:07:17 +02:00
Tobias Meyer Andersen
fd6319fe38
add SSOR smoother
2024-04-11 15:07:17 +02:00
Tobias Meyer Andersen
f6c539f819
add SOR smoother
2024-04-11 15:07:17 +02:00
Tobias Meyer Andersen
b02e001ae3
add gs preconditioner for amg and kamg
2024-04-11 15:07:17 +02:00
Tobias Meyer Andersen
030720f855
add dune ILU0 for multiprocess simulations
2024-04-11 15:07:17 +02:00
Tobias Meyer Andersen
8b5ab973e2
Add dune ILU when using only one process
2024-04-11 15:07:17 +02:00
Bård Skaflestad
859db850c0
Merge pull request #5111 from lisajulia/feature/deterministic-indicesSyncer
Construct the matrices in the AMG hierarchy with deterministic indices.
2024-01-29 13:37:54 +01:00
Lisa Julia Nebel
74608147c0
Add comment on trailing return type with decltype that is used for detecting the existence of setUseFixedOrder member function
2024-01-26 14:29:15 +01:00
Tobias Meyer Andersen
4b0dd54f15
Add CUDA implementation of the DILU
preconditioner. Uses graph coloring to exploit
parallelism in upper and lower triangular solves when
computing a diagonal approximate inverse of a
sparse matrix. Supports blocksizes up to 3.
2024-01-25 14:26:38 +01:00
Lisa Julia Nebel
60b0a33bd4
Construct the matrices in the AMG hierarchy (created in the PreconditionerFactory) with deterministic indices if possible.
The function 'setUseFixedOrder' is called if it is defined and the matrices in the AMG hierarchy are constructed with deterministic indices.
If it is not defined yet, it is not called and the matrices in the AMG hierarchy are constructed with non-deterministic indices.
2024-01-25 11:32:09 +01:00
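
The "call it only if it is defined" behaviour described in this commit can be illustrated with a trailing return type that uses decltype. The following is a minimal sketch of that general C++ idiom, not the code from the commit: the two criterion structs and the helper names trySetUseFixedOrder and setUseFixedOrderImpl are hypothetical, and only the member name setUseFixedOrder comes from the commit message.

```cpp
#include <cassert>

// Hypothetical stand-ins for a criterion type with and without the member.
struct CriterionWithFixedOrder {
    bool fixedOrder = false;
    void setUseFixedOrder(bool value) { fixedOrder = value; }
};

struct CriterionWithoutFixedOrder {
};

// Preferred overload. The trailing return type is only well-formed when
// c.setUseFixedOrder(true) compiles; otherwise SFINAE discards this overload.
template <class Criterion>
auto setUseFixedOrderImpl(Criterion& c, int)
    -> decltype(c.setUseFixedOrder(true), bool())
{
    c.setUseFixedOrder(true);
    return true;
}

// Fallback overload, selected when the overload above is discarded.
// The int -> long conversion makes it the worse match when both are viable.
template <class Criterion>
bool setUseFixedOrderImpl(Criterion&, long)
{
    return false;
}

// Returns true if the member exists and was called, false otherwise.
template <class Criterion>
bool trySetUseFixedOrder(Criterion& c)
{
    return setUseFixedOrderImpl(c, 0);
}

// Small usage example covering both cases.
inline void usageExample()
{
    CriterionWithFixedOrder withMember;
    assert(trySetUseFixedOrder(withMember) && withMember.fixedOrder);

    CriterionWithoutFixedOrder withoutMember;
    assert(!trySetUseFixedOrder(withoutMember));
}
```

The literal 0 passed by trySetUseFixedOrder is an int, so the first overload wins whenever it survives substitution; the long overload is only reached for types that lack the member.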
Tobias Meyer Andersen
5f6c97ff3b
add OpenMP parallelized version of DILU.
Implement graph coloring to expose rows in level sets that can be
executed in parallel during the sparse triangular solves.
Add copy of A matrix that is reordered to ensure contiguous memory reads
when traversing the matrix in level set order.
TODO: add number of threads available as constructor argument in DILU
2023-11-21 15:41:53 +01:00
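
The idea of grouping rows into level sets for a sparse lower triangular solve can be sketched as follows. This is a generic level-scheduling illustration under stated assumptions, not the implementation added in this commit: the function name computeLevelSets and the adjacency-list input (for each row, the column indices of its strictly lower nonzeros) are hypothetical choices made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// lowerCols[i] holds the column indices j < i where row i has a nonzero,
// i.e. the rows that row i depends on in a lower triangular solve.
// Returns the rows grouped by level: rows in the same level do not depend
// on each other and can therefore be processed in parallel.
std::vector<std::vector<std::size_t>>
computeLevelSets(const std::vector<std::vector<std::size_t>>& lowerCols)
{
    const std::size_t n = lowerCols.size();
    std::vector<std::size_t> level(n, 0);
    std::size_t numLevels = 0;

    for (std::size_t i = 0; i < n; ++i) {
        for (const std::size_t j : lowerCols[i]) {
            assert(j < i); // only strictly lower entries are allowed
            // A row sits one level above its deepest dependency.
            level[i] = std::max(level[i], level[j] + 1);
        }
        numLevels = std::max(numLevels, level[i] + 1);
    }

    std::vector<std::vector<std::size_t>> levelSets(numLevels);
    for (std::size_t i = 0; i < n; ++i) {
        levelSets[level[i]].push_back(i);
    }
    return levelSets;
}
```

For a 4x4 pattern where row 1 depends on row 0 and row 3 depends on rows 1 and 2, the rows fall into three levels: {0, 2}, {1} and {3}. Storing the matrix rows in this level order is what makes the reads during the solve contiguous.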
jakobtorben
ab0ca76194
Add DILU preconditioner
2023-10-18 14:30:17 +02:00
Tobias Meyer Andersen
1e4b0e97ee
Add Jacobi preconditioner that runs on the GPU.
Implement calls to cuBlas, cuSparse and implement necessary
CUDA kernels to perform a single iteration of the Jacobi preconditioner.
Add tests that verify the new kernels and the preconditioner as a whole.
The preconditioner is verified on 2x2 and 3x3 blocks, which as of now
are the only supported sizes. 1x1 blocks are not supported because cuSparse
does not support them.
2023-10-13 10:31:17 +02:00
Arne Morten Kvarving
92fa9577da
consistently use std::size_t
2023-08-15 09:32:10 +02:00
hnil
c065d34d0e
-- added more timing to get better coverage of amg solver
-- added includes needed
2023-07-24 12:28:08 +02:00
Kjetil Olsen Lye
e35318b6bb
Removed unused block_size
2023-05-31 21:36:15 +02:00
Kjetil Olsen Lye
042172588d
Added CuSeqILU0 as a parallel preconditioner as well.
2023-05-31 16:28:51 +02:00
Kjetil Olsen Lye
dea49a5406
Added CUILU0 to the PreconditionerFactory.
2023-05-30 11:50:02 +02:00