322 Commits

Author SHA1 Message Date
James McClure
e7b9407586 Merge branch 'master' into SYCL 2023-04-03 15:49:34 -04:00
James McClure
2989e890fa fixing merge 2023-04-03 09:05:33 -04:00
James McClure
68e475fdf8 make sure permeability always includes square of porosity 2023-04-03 08:49:19 -04:00
James McClure
ae9885266b set fractional flow increment for seed water 2023-03-30 15:31:17 -04:00
amitkumarhyd
b09731a5ef Migrated SYCL port with Makefile config 2023-03-24 05:34:19 -07:00
James McClure
bf076ca633 Merge branch 'master' of github.com:JamesEMcClure/LBPM-WIA into merge 2023-01-21 08:48:25 -05:00
James McClure
2a8724f42d update sample script 2023-01-21 08:48:21 -05:00
James McClure
b2994deeb8 Merge branch 'master' of github.com:OPM/LBPM 2023-01-21 08:43:42 -05:00
James McClure
161da74bbb update cell docs 2023-01-18 07:22:33 -05:00
James McClure
b0d4f54efc update cell do s 2023-01-17 09:02:55 -05:00
James McClure
4a90275a14 fix bullet list in cell doc 2023-01-16 15:43:48 -05:00
James McClure
b6617f7730 add membrane to cell model doc 2023-01-16 15:41:53 -05:00
James McClure
e803cdfa9d add membrane to cell model doc 2023-01-16 15:38:47 -05:00
James McClure
7e286f16f1 update docs for cell model 2023-01-16 13:29:58 -05:00
James McClure
57736ced88 create cell model 2023-01-16 07:19:20 -05:00
James McClure
b2ac66739f update nernst-planck docs 2023-01-16 07:18:58 -05:00
James McClure
720e7fa195 add potential BC for poisson 2022-11-02 16:25:14 -04:00
James McClure
ce7714071f clena up 2022-11-02 15:55:58 -04:00
James McClure
17698b7dd5 fix for hip poisson 2022-10-29 13:57:54 -04:00
James E McClure
9749600540 fix poisson cuda 2022-10-29 13:55:30 -04:00
James McClure
14c1037d7d add GPU voltage bc 2022-10-28 07:20:45 -04:00
James McClure
111529ff5e add voltage BC for d3q19 poisson solver 2022-10-27 08:15:21 -04:00
James McClure
95233d61a7 update configure script 2022-10-26 20:24:56 -04:00
James McClure
79484ab8fa update poisson solver 2022-10-26 20:24:45 -04:00
James McClure
613a43c309 vis capabilities for poisson, default d3q19 2022-10-26 16:17:23 -04:00
James McClure
f1ceb930ee working on read/write for membrane coefficients 2022-10-26 16:16:59 -04:00
James E. McClure
fac681dffe Rc 08.15.2022 (#70)
* summit configure

* add spock scripts to FOM

* get new models to build with hip

* add hip slipping bc

* testing communication on spock

* update spock build based on olcf docs

* update configure & test scripts for spock

* Fixing potential bugs with communication

* Adding simple test of GPU aware MPI

* some changes to configure for spock

* Modifying GPU aware MPI test to send multiple messages

* playing with spock Gpu test

* added gpu wrapper test

* Cleaning up some compiler warnings

* add barrier between pack / MPI send

* Updating build to support HIP as a language

* fixing gpu mpi sync

* Adding script

* local spock changes

* add membrane class

* update membrane structure

* membrane communications

* working on new comm data structures

* add membrane unit test

* membrane compiles

* membrane test

* Updating hip port to match cuda

* update summit config

* update summit config

* add configure script for crusher

* update membrane test

* update membrane test

* convention for inside / outside membrane link direction

* working on membrane comm

* try to fix time conversion factor for Poisson solver; to be built and tested

* fix dumb typo

* update summit config

* tune launch for crusher

* tune launch for mrt on crusher

* update color

* summit script with specific module versions

* update crusher config

* add crusher examples

* add dense case for crusher

* Fixing some quick annoying compile warnings

* fix binding in example

* working on fix

* Adding simple crusher test

* Adding new crusher MPI test

* disable MPI thread multiple for crusher

* updates to crusher configure

* cpu test for crusher

* Working on standalone reproducer for MPI bug

* More work on creating standalone test

* More work on creating standalone test

* More work on creating standalone test

* Reverting TestCrusher2, standalone version passes (TestCrusher3.cpp), need to figure out why

* Working on standalone MPI test on crusher

* Working on standalone MPI test on crusher

* Getting closer to stand alone test

* Still trying to create standalone reproducer

* hang fix / workaround

* Created standalone MPI failure test

* Removing TestCrusher tests, the bug deals with the StackTrace which we disable the multistack trace for now.  Moving the test out of LBPM

* fix sendcount / recvcount

* Testing persistent communication

* Updating calculation of bandwidth

* crusher hackathon final version

* working on membrane communication structures

* add cell simulator

* added cell simulator

* make sure halo is filled when measuring object

* add membrane transport function for d3q7

* add membrane unpack function

* poking at MF issue

* update crusher build

* membrane data structures compiling

* update to membrane capability

* update comments in ion model

* fix dumb print bug

* clean up relabel

* adding membrane functions

* move membrane to common folder

* membrane structure in IonModel

* membrane structure in IonModel

* try at membrane simulator

* add python script to generate bubble

* add python script to generate bubble

* cell simulator runs

* read input files

* add single cell example

* refining cell example

* start on cuda function

* werkin

* start on cuda function

* start on hip function

* updates and fix for user input reader

* update cell example

* add sigmoid to ion equilibrium dist

* cuda build succeeds

* update crusher script

* getting ready to merge gpu

* refactor compact AA routines for testing

* add testing functions to ScaLBL

* testing membrane ion transport

* membrane transport test passing

* membrane starts working ok...

* original wang poisson solver (broken)

* rex d3q19 (broken)

* tau from wang paper

* still broken wang

* d319 poisson works good

* Poisson working pretty good now

* initialize nernst-planck simulator; to be built subject to debugging

* fix a few syntax bugs and build passed

* Poisson solver; enable specifying initial values

* update cell example

* add GPU functions for d3q19 poisson

* fix dumb bugs

* fix bugs in initializing electric potential; the Psi on solid was accidentally overwritten before.

* small change

* fix bugs in importing ion model's dummy velocity

* add membrane concentration init

* remove bad warnings

* remove print staetements

* add barriers to poisson solver

* update print

* print membrane input concentration

* read Membrane ion concentration list

* fix bad ref to D3Q7

* update error analysis for Poisson solver

* fix typo

* update hip poisson solver

* deprecate old error methods

* a bunch of summit debug things to roll back later

* fix poisson typo

* update hip

* debug crusher

* debug charge density problems on crusher

* fix charge density (i think)

* remove Stokes solver from cell simulator; need to test build

* update cpu ion valence

* added membrane properties to input db

* update cell db

* update executable list and NP_cell simulator

* correct use_membrane functionality

* add functionality for user to choose either D3Q7 or D3Q19 lattice for Poisson;to be built and tested

* build passed

* make further corrections

* correct D3Q7 Poisson LB algorithm

* correct ion LB collison

* udpate output precision

* add more tweaks for cell simulator

* update print-out

* this makes mpi hang error explicit;to be debugged

* cleanup with help from valgrind

* update to cell vis routine

* add hip for ion update

* fix missing bracket

* add new ion code for cuda

* add barrier to membrane transport

* debug gpu launch issue for ion

* debug gpu

* add functions to copy send / recv list from ScaLBL

* updating membrane communication structure

* membrane test works with new comm

* communication seems to work

* add sample files for plane membrane

* update gpu routines first try

* update hip

* multiple nvidia gpu working with membrane

* added membrane analysis capability

* added support for swc file

* support for SWC input format

* swc reader works with MPI

* shift swc data

* SWC reader update

* SWC reader update 2

* add offset to Domain for swc

* add input files for simple bacteria

* add performance counters to ion / poisson solvers

* fix bug with SWC

* add BC to poisson solver

* fix compiler warnings

* fix memory leaks

* fix zlib download path

* Fixing memory leak

* Fixing memory leaks

* restart for Poisson model

* fix bug in ion model restart

* trying to fix yaml

* fix workflow indentation

* porosity factor in effperm

* porosity factor in effperm

* porosity factor in effperm

* porosity factor in effperm

Co-authored-by: James E McClure <mcclurej@vt.edu>
Co-authored-by: Mark Berrill <berrillma@ornl.gov>
Co-authored-by: Zhe Rex Li <zhe.rex.li@gmail.com>
Co-authored-by: Zhe Li <zzl109@gadi-login-01.gadi.nci.org.au>
Co-authored-by: Zhe Li <zzl109@gadi-login-04.gadi.nci.org.au>
Co-authored-by: Zhe Li <zzl109@gadi-login-02.gadi.nci.org.au>
Co-authored-by: Zhe Li <zzl109@gadi-login-06.gadi.nci.org.au>
Co-authored-by: Zhe Li <zzl109@gadi-login-05.gadi.nci.org.au>
2022-09-07 21:44:16 +02:00
James McClure
e9096dbfc3 update docs 2022-08-31 14:08:59 -04:00
James McClure
ffe4f794de fix swc stuff try 3 2022-08-27 16:30:03 -04:00
James McClure
9229de2875 fix swc stuff try 2 2022-08-27 16:06:30 -04:00
James McClure
fc4af2d712 fix swc thing 2022-08-27 15:40:32 -04:00
James McClure
372ebd3af2 add flat field to accelerate Poisson 2022-08-25 15:46:47 -04:00
James E McClure
045a955626 fix cuda thing 2022-08-22 22:00:19 -04:00
James McClure
b423bcb42b minor cleanup of gpu poisson 2022-08-21 08:53:05 -04:00
James McClure
2b2bdee447 fix error in gpu poisson 2022-08-16 11:23:30 -04:00
James McClure
d69bff263c fix minor merge conflict 2022-08-15 05:17:47 -04:00
James McClure
4a016eee6c porosity factor in effperm 2022-08-12 18:34:50 -04:00
James McClure
d50c98c71a porosity factor in effperm 2022-08-12 18:33:49 -04:00
James McClure
6ad23248fd porosity factor in effperm 2022-08-12 18:31:53 -04:00
James McClure
189408f769 porosity factor in effperm 2022-08-12 18:29:45 -04:00
James McClure
f233ec616b fix workflow indentation 2022-08-07 09:11:41 -04:00
James McClure
2a8c88496c trying to fix yaml 2022-08-05 21:20:19 -04:00
James McClure
58ae63fe87 fix bug in ion model restart 2022-08-02 15:36:35 -04:00
James McClure
777e216b35 restart for Poisson model 2022-07-29 18:11:32 -04:00
Mark Berrill
3cc12ac36e Fixing memory leaks 2022-07-13 11:20:53 -04:00
Mark Berrill
ed8f5684fd Fixing memory leak 2022-07-13 10:53:19 -04:00
James McClure
c148f907eb fix zlib download path 2022-07-13 07:55:37 -04:00
James McClure
5204f48d45 fix memory leaks 2022-07-12 18:44:36 -04:00
James McClure
b7910dbefe fix compiler warnings 2022-07-06 21:33:59 -04:00
James McClure
5a37f7865f add BC to poisson solver 2022-07-06 19:35:08 -04:00
James McClure
a9ed4a3b97 fix bug with SWC 2022-07-04 19:46:14 -04:00
James McClure
1c903a3380 add performance counters to ion / poisson solvers 2022-06-11 20:49:01 -04:00
James McClure
be8f508b64 add input files for simple bacteria 2022-05-26 16:43:34 -04:00
James McClure
dc78491a9c add offset to Domain for swc 2022-05-25 17:59:08 -04:00
James McClure
df27167212 SWC reader update 2 2022-05-25 16:28:58 -04:00
James McClure
57fc4cc8e1 SWC reader update 2022-05-25 16:21:34 -04:00
James McClure
e818ade293 shift swc data 2022-05-16 21:43:53 -04:00
James McClure
175e7bd00b swc reader works with MPI 2022-05-16 11:22:38 -04:00
James McClure
c4a97c0589 support for SWC input format 2022-05-16 11:17:47 -04:00
James McClure
0e65364954 added support for swc file 2022-05-15 23:00:23 -04:00
James McClure
2acaa335aa added membrane analysis capability 2022-05-13 20:44:33 -04:00
James E McClure
b6227dd823 multiple nvidia gpu working with membrane 2022-05-12 20:50:05 -04:00
James McClure
cf3bc417ce Merge branch 'test_poisson' of github.com:JamesEMcClure/LBPM-WIA into test_poisson 2022-05-12 07:09:39 -04:00
James McClure
ffe4bdd917 update hip 2022-05-12 06:58:10 -04:00
James McClure
0e769186a5 update gpu routines first try 2022-05-12 06:54:55 -04:00
Zhe Rex Li
cb995c7d00 add sample files for plane membrane 2022-05-12 15:39:50 +10:00
James McClure
ad8c5f6e26 communication seems to work 2022-05-12 00:52:34 -04:00
James McClure
29e4c76561 membrane test works with new comm 2022-05-12 00:31:22 -04:00
James McClure
7c790e8802 updating membrane communication structure 2022-05-11 23:37:18 -04:00
James McClure
2894b740d0 add functions to copy send / recv list from ScaLBL 2022-05-11 17:12:23 -04:00
James McClure
4661cbdce4 debug gpu 2022-05-11 14:05:16 -04:00
James McClure
50c7429995 debug gpu launch issue for ion 2022-05-11 10:32:53 -04:00
James McClure
e677b0395f add barrier to membrane transport 2022-05-09 06:32:58 -04:00
James McClure
9e74f49812 add new ion code for cuda 2022-05-08 05:49:23 -04:00
James McClure
cf2b69e7ee fix missing bracket 2022-05-07 19:36:14 -04:00
James McClure
6f7740fc3e Merge branch 'tmp' into test_poisson 2022-05-07 19:32:54 -04:00
James McClure
9037f51c22 add hip for ion update 2022-05-07 19:32:51 -04:00
James McClure
459d6064c1 merge to membrane 2022-05-06 20:41:03 -04:00
James McClure
9f76e7b1e8 update to cell vis routine 2022-05-06 19:03:19 -04:00
James McClure
c424e1d984 cleanup with help from valgrind 2022-05-06 16:21:37 -04:00
Zhe Li
ba6d438630 this makes mpi hang error explicit;to be debugged 2022-05-04 14:34:31 +10:00
Zhe Rex Li
01ffcce379 update print-out 2022-05-03 17:10:26 +10:00
Zhe Rex Li
de52cf53b9 add more tweaks for cell simulator 2022-05-03 16:32:42 +10:00
Zhe Li
9c2c216af8 udpate output precision 2022-05-03 15:55:40 +10:00
Zhe Rex Li
6959a02f7b correct ion LB collison 2022-05-03 15:00:00 +10:00
Zhe Rex Li
47db000ba6 correct D3Q7 Poisson LB algorithm 2022-04-29 20:36:38 +10:00
Zhe Li
025987e53b make further corrections 2022-04-28 23:59:13 +10:00
Zhe Li
bdd0efd36e build passed 2022-04-28 16:55:58 +10:00
Zhe Rex Li
a00a3606f7 add functionality for user to choose either D3Q7 or D3Q19 lattice for Poisson;to be built and tested 2022-04-28 16:21:04 +10:00
Zhe Li
678925ec15 correct use_membrane functionality 2022-04-28 15:22:23 +10:00
Zhe Li
754b1ad9d9 update executable list and NP_cell simulator 2022-04-27 14:04:11 +10:00
Zhe Rex Li
86be316977 merge membrane into test_poisson 2022-04-27 12:04:53 +10:00
James McClure
8ea560ca66 update cell db 2022-04-26 19:01:30 -04:00
James McClure
e3518e3482 added membrane properties to input db 2022-04-26 18:07:26 -04:00
James McClure
3e82370d6c update cpu ion valence 2022-04-26 06:37:08 -04:00
Zhe Rex Li
614b002725 remove Stokes solver from cell simulator; need to test build 2022-04-26 11:41:14 +10:00
Zhe Rex Li
17f80b4637 merge the latest membrane into test_poisson 2022-04-26 11:16:44 +10:00
James E McClure
429413ce3b fix charge density (i think) 2022-04-24 16:48:12 -04:00
James E McClure
3642a6ae9b debug charge density problems on crusher 2022-04-24 15:55:23 -04:00
James E McClure
9043751281 debug crusher 2022-04-24 14:55:59 -04:00
James E McClure
eadb420d06 update hip 2022-04-24 14:54:49 -04:00
James E McClure
91d8fbe751 Merge branch 'membrane' of github.com:JamesEMcClure/LBPM-WIA into membrane 2022-04-24 11:04:47 -04:00
James McClure
418bc82953 fix poisson typo 2022-04-24 11:04:33 -04:00
James E McClure
36a5204882 a bunch of summit debug things to roll back later 2022-04-24 11:04:11 -04:00
James McClure
dd5dbbd51c deprecate old error methods 2022-04-24 11:00:17 -04:00
James McClure
d964eee29c update hip poisson solver 2022-04-24 11:00:03 -04:00
James E McClure
0224ecdcff fix typo 2022-04-22 20:41:49 -04:00
James McClure
cf4f2b63da update error analysis for Poisson solver 2022-04-22 15:17:52 -04:00
James McClure
d3dc35c018 fix bad ref to D3Q7 2022-04-22 14:30:03 -04:00
James McClure
481a258fbd read Membrane ion concentration list 2022-04-19 10:53:16 -04:00
James McClure
b51968c677 print membrane input concentration 2022-04-19 07:46:53 -04:00
James McClure
2cbb4ce8cd update print 2022-04-19 07:33:21 -04:00
James McClure
7eaaced8d9 add barriers to poisson solver 2022-04-18 08:45:49 -04:00
James McClure
e613cba376 remove print staetements 2022-04-16 17:14:22 -04:00
James McClure
b183252947 remove bad warnings 2022-04-15 22:50:16 -04:00
James McClure
85e99df3f3 add membrane concentration init 2022-04-14 22:57:23 -04:00
Zhe Li
58e246c941 fix bugs in importing ion model's dummy velocity 2022-04-14 11:53:26 +10:00
James McClure
aa8783141d small change 2022-04-13 20:39:59 -04:00
Zhe Li
e7c227bd9f fix bugs in initializing electric potential; the Psi on solid was accidentally overwritten before. 2022-04-13 16:56:26 +10:00
Zhe Li
c281c0cfda fix dumb bugs 2022-04-13 14:53:23 +10:00
James McClure
adeecbd122 add GPU functions for d3q19 poisson 2022-04-12 23:29:46 -04:00
James McClure
8fd8799ff9 update cell example 2022-04-12 20:51:13 -04:00
Zhe Rex Li
0f68c118de Poisson solver; enable specifying initial values 2022-04-12 16:12:12 +10:00
Zhe Li
cc693c93f5 fix a few syntax bugs and build passed 2022-04-12 12:49:28 +10:00
Zhe Rex Li
e1ebbce812 initialize nernst-planck simulator; to be built subject to debugging 2022-04-12 11:40:39 +10:00
James McClure
032a21a872 Poisson working pretty good now 2022-04-10 22:21:34 -04:00
James McClure
d0e41bf834 d319 poisson works good 2022-04-10 17:25:29 -04:00
James McClure
eb1c5da99c still broken wang 2022-04-10 16:08:35 -04:00
James McClure
cd970d64a3 Merge branch 'membrane-rex-d3q19' into membrane-wang 2022-04-10 12:34:31 -04:00
James McClure
b182d04833 tau from wang paper 2022-04-10 11:32:53 -04:00
James McClure
b3dfc014d8 rex d3q19 (broken) 2022-04-10 11:05:53 -04:00
James McClure
9c63613373 original wang poisson solver (broken) 2022-04-10 09:51:02 -04:00
James McClure
2bb2be845a membrane starts working ok... 2022-04-08 16:44:37 -04:00
James McClure
25b17f996c membrane transport test passing 2022-04-06 22:43:42 -04:00
James McClure
9c013e6169 testing membrane ion transport 2022-04-06 22:25:05 -04:00
James McClure
cca06f7964 add testing functions to ScaLBL 2022-04-06 20:44:14 -04:00
James McClure
4758bdffa9 refactor compact AA routines for testing 2022-04-06 20:34:12 -04:00
James McClure
3715ed98b0 Merge branch 'membrane' of github.com:JamesEMcClure/LBPM-WIA into membrane 2022-04-01 06:58:49 -04:00
James McClure
39a04e38ab getting ready to merge gpu 2022-04-01 06:58:43 -04:00
James E McClure
b6c2e7de3a update crusher script 2022-03-31 07:23:04 -04:00
James McClure
75240ef7cd Merge branch 'master' of github.com:OPM/LBPM 2022-03-29 15:02:01 -04:00
James McClure
f5fdba8458 fix broken zlib download path 2022-03-29 15:01:43 -04:00
JamesEMcClure
68dd158689 Merge pull request #64 from thomaram/adjust_perm
Add perm converter
2022-03-29 04:49:27 -04:00
James E McClure
cd34f11c38 cuda build succeeds 2022-03-28 18:59:30 -04:00
James E McClure
f99af45c9b Merge branch 'membrane' into tmp 2022-03-27 16:03:26 -04:00
James E McClure
6911e97c29 merge master with crusher 2022-03-27 16:02:56 -04:00
James E McClure
de2fe8b3a9 Merge branch 'master' of github.com:JamesEMcClure/LBPM-WIA 2022-03-27 16:01:20 -04:00
Thomas Ramstad
24f069c43f Add perm converter
Convert from micron2 to mDarcy

 Changes to be committed:
	modified:   models/ColorModel.cpp
	modified:   models/ColorModel.h
	modified:   models/MRTModel.cpp
2022-03-24 23:39:46 +01:00
James McClure
8a0937e111 add sigmoid to ion equilibrium dist 2022-03-24 07:38:06 -04:00
James McClure
7172364660 update cell example 2022-03-22 20:03:34 -04:00
James McClure
2c3272e423 updates and fix for user input reader 2022-03-22 17:31:12 -04:00
James McClure
8ae69e4c1e start on hip function 2022-03-21 19:53:05 -04:00
James McClure
18602c7516 start on cuda function 2022-03-21 19:51:09 -04:00
James McClure
16275ce1b9 werkin 2022-03-21 19:44:21 -04:00
James McClure
0a1057926d start on cuda function 2022-03-21 19:44:04 -04:00
JamesEMcClure
70b12830cb Merge pull request #63 from thomaram/adjust_perm
Add adjusted perms
2022-03-21 14:16:13 -04:00
James McClure
286e779459 refining cell example 2022-03-20 18:01:19 -04:00
James McClure
4927e54707 add single cell example 2022-03-20 15:13:53 -04:00
James McClure
3522d35de1 read input files 2022-03-20 13:32:24 -04:00
James McClure
51c88f0055 cell simulator runs 2022-03-20 11:22:46 -04:00
James McClure
9e3a07d419 add python script to generate bubble 2022-03-20 09:30:46 -04:00
James McClure
a50d9e9aa6 add python script to generate bubble 2022-03-20 09:29:42 -04:00
James McClure
702eaae1c1 try at membrane simulator 2022-03-18 18:08:44 -04:00
Thomas Ramstad
766dfc299a Correct typo
Mask->Porosity()

	modified:   models/ColorModel.cpp
	modified:   models/MRTModel.cpp
2022-03-18 18:51:21 +01:00
Thomas Ramstad
2a5df51bb2 Add adjusted perms
Updated the SCAL.csv with eff-perm values weighted with porosity to make them closer
to measure data.

Added column in Permeability.csv with the adjusted values.

	modified:   models/ColorModel.cpp
	modified:   models/MRTModel.cpp
2022-03-18 17:04:18 +01:00
James McClure
c94d4e5194 membrane structure in IonModel 2022-03-18 11:23:01 -04:00
James McClure
bfe1de6be2 membrane structure in IonModel 2022-03-18 11:20:09 -04:00
James McClure
bfa6f5e5b8 move membrane to common folder 2022-03-18 11:09:34 -04:00
James McClure
e8d0b0b48a adding membrane functions 2022-03-18 11:06:16 -04:00
James McClure
1abb9adea6 clean up relabel 2022-03-18 11:05:50 -04:00
trams@equinor.com
fe2496ebdd Add film corrected eff perms
modified:   models/ColorModel.cpp
2022-03-18 14:54:47 +01:00
zherexli
ece735e0e7 fix dumb print bug 2022-03-18 14:29:29 +11:00
zherexli
0105155d2f update comments in ion model 2022-03-17 11:35:48 +11:00
James McClure
12026f54d4 update to membrane capability 2022-03-14 16:16:47 -04:00
James McClure
abf5823de6 membrane data structures compiling 2022-03-11 05:39:34 -05:00
James E McClure
d1bbc2171f Merge branch 'master' of github.com:JamesEMcClure/LBPM-WIA 2022-03-08 22:14:55 -05:00
James E McClure
ecb882bc75 update crusher build 2022-03-08 22:14:42 -05:00
James McClure
91cf379e86 poking at MF issue 2022-03-08 22:13:48 -05:00
James E McClure
39feb20ec9 add membrane unpack function 2022-03-07 16:47:19 -05:00
James E McClure
f270604e6b add membrane transport function for d3q7 2022-03-07 15:36:16 -05:00
James E McClure
3065ce1f28 Merge branch 'master' of github.com:JamesEMcClure/LBPM-WIA 2022-03-06 00:22:50 -05:00
James E McClure
77eba8c48a Merge branch 'master' of github.com:JamesEMcClure/LBPM-WIA 2022-03-05 07:47:22 -05:00
James McClure
00d1f36867 Merge branch 'master' of github.com:JamesEMcClure/LBPM-WIA 2022-03-04 19:05:14 -05:00
James McClure
b39dd1a474 make sure halo is filled when measuring object 2022-03-04 18:59:11 -05:00
James McClure
dd9da41d4b added cell simulator 2022-02-24 21:43:18 -05:00
James McClure
896e6ee27b add cell simulator 2022-02-21 16:49:42 -05:00
James McClure
5bbda39fdb working on membrane communication structures 2022-02-21 16:39:24 -05:00
James E McClure
033477ee95 Merge branch 'Crusher' of github.com:JamesEMcClure/LBPM-WIA into Crusher 2022-02-12 01:49:10 -05:00
James E McClure
101f3a8f44 crusher hackathon final version 2022-02-12 01:39:30 -05:00
Mark Berrill
61e0714595 Updating calculation of bandwidth 2022-02-10 16:47:37 -05:00
James E McClure
074ea746f2 Merge branch 'Crusher' of github.com:JamesEMcClure/LBPM-WIA into Crusher 2022-02-10 16:41:34 -05:00
Mark Berrill
1f671edbc1 Testing persistent communication 2022-02-10 16:29:22 -05:00
James E McClure
1378b1f13b Merge branch 'Crusher' of github.com:JamesEMcClure/LBPM-WIA into Crusher 2022-02-10 15:41:43 -05:00
James E McClure
c672648b7d fix sendcount / recvcount 2022-02-10 15:03:57 -05:00
Mark Berrill
f329e424a4 Removing TestCrusher tests, the bug deals with the StackTrace which we disable the multistack trace for now. Moving the test out of LBPM 2022-02-10 14:04:27 -05:00
Mark Berrill
7787f9e8f1 Created standalone MPI failure test 2022-02-10 14:02:40 -05:00
James E McClure
e657971ee0 Merge branch 'Crusher' of github.com:JamesEMcClure/LBPM-WIA into Crusher 2022-02-10 13:43:41 -05:00
James E McClure
6b0b8daddd hang fix / workaround 2022-02-10 13:43:31 -05:00
Mark Berrill
1c63baea09 Still trying to create standalone reproducer 2022-02-10 12:16:25 -05:00
Mark Berrill
86834d5b99 Getting closer to stand alone test 2022-02-10 12:06:44 -05:00
Mark Berrill
ad9f68d9fb Working on standalone MPI test on crusher 2022-02-10 11:53:20 -05:00
Mark Berrill
51667854c1 Working on standalone MPI test on crusher 2022-02-10 11:46:17 -05:00
Mark Berrill
7a6654de60 Reverting TestCrusher2, standalone version passes (TestCrusher3.cpp), need to figure out why 2022-02-09 18:01:34 -05:00
Mark Berrill
9affb84db3 More work on creating standalone test 2022-02-09 17:54:28 -05:00
Mark Berrill
9d7fccab63 More work on creating standalone test 2022-02-09 17:52:05 -05:00
Mark Berrill
e21125901b More work on creating standalone test 2022-02-09 17:40:09 -05:00
Mark Berrill
736ab1c101 Merge branch 'Crusher' of github.com:JamesEMcClure/LBPM-WIA into Crusher 2022-02-09 17:01:05 -05:00
Mark Berrill
8c448376d9 Working on standalone reproducer for MPI bug 2022-02-09 17:01:00 -05:00
James E McClure
e2f198759d Merge branch 'Crusher' of github.com:JamesEMcClure/LBPM-WIA into Crusher 2022-02-09 16:48:08 -05:00
James E McClure
c35867ad3a cpu test for crusher 2022-02-09 16:47:59 -05:00
James E McClure
9266b66108 updates to crusher configure 2022-02-09 16:42:03 -05:00
James E McClure
d1a45a3b1e disable MPI thread multiple for crusher 2022-02-09 16:40:30 -05:00
Mark Berrill
2237df73eb Adding new crusher MPI test 2022-02-09 16:36:09 -05:00
James E McClure
05712e6a16 Merge branch 'Crusher' of github.com:JamesEMcClure/LBPM-WIA into Crusher 2022-02-09 16:11:34 -05:00
Mark Berrill
6b1eee3951 Merge branch 'Crusher' of github.com:JamesEMcClure/LBPM-WIA into Crusher 2022-02-09 16:11:10 -05:00
Mark Berrill
43f18cb4fd Adding simple crusher test 2022-02-09 16:11:04 -05:00
James E McClure
9a2e71c53d working on fix 2022-02-09 14:06:49 -05:00
James E McClure
373bdfb030 Merge branch 'Crusher' of github.com:JamesEMcClure/LBPM-WIA into Crusher 2022-02-09 12:41:21 -05:00
James E McClure
3c14c0cb89 fix binding in example 2022-02-09 12:41:10 -05:00
Mark Berrill
6b36b82ccf Fixing some quick annoying compile warnings 2022-02-09 12:03:08 -05:00
James E McClure
7919de40f6 add dense case for crusher 2022-02-09 11:37:18 -05:00
James E McClure
b843af8e5e add crusher examples 2022-02-08 22:36:28 -05:00
James E McClure
3948bcb9d8 update crusher config 2022-02-07 15:55:51 -05:00
James E McClure
88c16cf92b summit script with specific module versions 2022-02-03 13:17:56 -05:00
James E McClure
1f6d37208e fix whitespace merge issue 2022-02-03 09:15:28 -05:00
James E McClure
3adbd04d38 update color 2022-02-03 09:14:33 -05:00
James E McClure
a181dfc85d tune launch for mrt on crusher 2022-02-02 11:15:00 -05:00
James E McClure
a7296535b0 tune launch for crusher 2022-02-02 10:39:09 -05:00
James E McClure
4ebfeba4d2 update summit config 2022-02-02 05:51:36 -05:00
Rex Zhe Li
d8ac0cc043 fix dumb typo 2022-01-31 00:30:37 -05:00
Zhe Rex Li
ba14aecf35 try to fix time conversion factor for Poisson solver; to be built and tested 2022-01-31 12:23:36 +11:00
James McClure
eccebcd95a Merge branch 'master' of github.com:JamesEMcClure/LBPM-WIA 2022-01-28 06:43:00 -05:00
James McClure
1e4a68aab9 add film term to scal.csv 2022-01-28 06:42:52 -05:00
James McClure
7d679571f1 working on membrane comm 2022-01-28 05:16:37 -05:00
Zhe Rex Li
ea90e9f875 Merge branch 'master' into slipping_vel_debug 2022-01-28 16:29:30 +11:00
Li Rex
71fdedaa2f clean up code for the updated slipping vel BC 2022-01-28 16:26:34 +11:00
Li Rex
02932e26c5 update slipping vel bc also in HIP version 2022-01-27 16:49:05 +11:00
Li Rex
d811727958 udpate cuda with corrected slipping vel BC 2022-01-27 14:44:41 +11:00
James McClure
f49b281d29 add initial radius option to morphdrain 2022-01-26 12:41:58 -05:00
James E McClure
e68e1f4dfa Merge branch 'master' of github.com:JamesEMcClure/LBPM-WIA 2022-01-26 05:05:15 -05:00
James McClure
66fc855608 convention for inside / outside membrane link direction 2022-01-19 16:18:29 -05:00
Rex Zhe Li
169a102f6c fix a few typos and bug and build passed 2022-01-18 01:15:35 -05:00
Rex Zhe Li
ee527a42cc update Poisson and Stoke solvers to include slipping velocity BC; to be built 2022-01-18 16:46:09 +11:00
James McClure
6a62500a0c update membrane test 2022-01-14 10:59:04 -05:00
James McClure
07b8f9ba36 update arden script 2022-01-13 11:18:34 -05:00
James McClure
35bea97b2e update morph test based on critical radius adjustment 2022-01-12 16:07:59 -05:00
James McClure
30e0a4e24b fix bug in morphopen 2022-01-12 14:31:12 -05:00
James McClure
0fb6fbfd7a update membrane test 2022-01-12 09:48:04 -05:00
James E McClure
af7ec72622 Merge branch 'Spock' of github.com:JamesEMcClure/LBPM-WIA into Spock 2022-01-10 15:46:16 -05:00
James E McClure
2c833ad412 add configure script for crusher 2022-01-10 15:46:06 -05:00
James E McClure
d1acb816ba update summit config 2022-01-10 15:00:50 -05:00
James E McClure
b2cfa242e9 update summit config 2022-01-07 10:50:17 -05:00
Mark Berrill
669f941682 Updating hip port to match cuda 2022-01-04 10:26:16 -05:00
James McClure
ec76dc9a3b membrane test 2021-12-28 17:21:11 -05:00
James McClure
74bf7dbec4 membrane compiles 2021-12-28 16:59:44 -05:00
James McClure
6d82dc83a2 add membrane unit test 2021-12-28 16:12:49 -05:00
James McClure
e491327c89 working on new comm data structures 2021-12-28 11:09:00 -05:00
James McClure
efd29363b6 membrane communications 2021-12-27 19:47:49 -05:00
James McClure
ce8498a9ce add pubs 2021-12-25 18:06:04 -05:00
James McClure
26766ef69a update membrane structure 2021-12-25 17:58:53 -05:00
James McClure
d59c78b0ce add membrane class 2021-12-25 17:03:35 -05:00
James McClure
f93a4d3bba update to greyscale 2021-12-24 11:52:54 -05:00
James E McClure
5b33e96984 update build script for summit 2021-12-21 15:49:11 -05:00
James E McClure
7523114937 merge color profiling 2021-12-21 11:38:16 -05:00
James E McClure
419de4e397 local spock changes 2021-12-21 11:36:52 -05:00
James McClure
bb4ce1aa09 fix string with gcc 10.2 2021-12-16 20:35:06 -05:00
James McClure
3297fd04f6 example cuda - hip conversion 2021-12-16 14:18:30 -05:00
James McClure
5c3a149ab6 add droplet input file 2021-12-16 12:07:16 -05:00
James McClure
5c794e0bd0 fix sign on geodesic curvature 2021-12-15 07:35:10 -05:00
James McClure
d259434e5f fix compiler warning 2021-12-14 16:47:27 -05:00
James McClure
66be77aeae update deficit curvature test 2021-12-14 16:46:00 -05:00
James McClure
012c1814c6 contact angle + deficit curvature tools 2021-12-14 07:07:27 -05:00
James McClure
011b2f8a87 added analysis tools for TwoPhase object 2021-12-13 16:04:20 -05:00
James McClure
93f99057bb added droplet examples 2021-12-13 11:02:17 -05:00
James McClure
fb99e78815 add docx 2021-12-12 13:16:01 -05:00
James McClure
499b765d64 Merge branch 'master' into opm 2021-12-12 11:42:44 -05:00
James McClure
bb51c75692 update sample script 2021-12-12 11:42:32 -05:00
James McClure
aa0b3a1cb0 merge LBPM with spock 2021-12-12 11:36:47 -05:00
James McClure
44a7653c60 checkout right color model 2021-12-10 12:19:08 -05:00
James McClure
90c3513e09 fix merge 2021-12-09 15:38:54 -05:00
James McClure
f0042bafea update ubuntu sample script 2021-12-09 14:02:32 -05:00
James McClure
ba67d29e5c fix memory leak 2021-12-09 14:01:27 -05:00
James McClure
51ada19b06 merging conflicts 2021-12-09 13:51:35 -05:00
James McClure
ee93851281 Merge branch 'master' of github.com:JamesEMcClure/LBPM-WIA 2021-12-09 10:17:13 -05:00
James McClure
a65ceef7d5 try 2 2021-12-09 10:16:47 -05:00
James McClure
44965c17f3 fix merge 2021-12-09 10:11:05 -05:00
Rex Zhe Li
d61cb8571f fix dumb typo 2021-12-06 00:46:44 -05:00
Rex Zhe Li
ba88a78afb update periodic potential BC for inlet and outlet;to be built and tested 2021-12-06 16:28:08 +11:00
Rex Zhe Li
8fce93fc47 update solid BCs of Poisson solver in GPU;to be built and tested 2021-12-06 13:50:17 +11:00
JamesEMcClure
8800066c06 Add "SCAL.csv" file for color model (#61)
* add scal file

* SCAL csv file

* Update ColorModel.cpp

Updated the SCAL output file, and removed some entries.

Co-authored-by: Thomas Ramstad <trams@equinor.com>
2021-12-03 14:53:36 +01:00
Rex Zhe Li
95d01c4e28 fix bug 2021-12-02 07:47:49 -05:00
Rex Zhe Li
7d525e999b enable different types of solid BC for Poisson solver; to be built and tested 2021-12-02 23:32:57 +11:00
James McClure
e6fa7d4065 add SCAL file 2021-12-01 08:08:16 -05:00
JamesEMcClure
b63ad4912a Merge pull request #57 from OPM/thomaram-patch-2
Update c-cpp.yml
2021-11-11 07:08:45 -05:00
Mark Berrill
db323d8e91 Merge remote-tracking branch 'origin/master' into Spock 2021-11-03 11:06:15 -04:00
Mark Berrill
056eeaa461 Merging master 2021-11-03 10:55:56 -04:00
Mark Berrill
7dad69af9a Adding script 2021-10-08 14:28:41 -04:00
James E McClure
d5ab9c0a70 Merge branch 'Spock' of github.com:JamesEMcClure/LBPM-WIA into Spock 2021-08-02 13:55:14 -04:00
James E McClure
50569aed48 fixing gpu mpi sync 2021-08-02 13:55:10 -04:00
Mark Berrill
a55a030c3c Updating build to support HIP as a language 2021-08-02 13:54:30 -04:00
James E McClure
3f20276cb6 add barrier between pack / MPI send 2021-07-26 11:48:05 -04:00
Mark Berrill
f6690d2277 Cleaning up some compiler warnings 2021-07-19 13:36:04 -04:00
James E McClure
cd96365cd1 save state 2021-07-19 13:17:39 -04:00
James E McClure
147f0b9d15 added gpu wrapper test 2021-07-02 11:21:25 -04:00
James E McClure
288d62b824 Merge branch 'Spock' of github.com:JamesEMcClure/LBPM-WIA into Spock 2021-07-02 10:25:54 -04:00
JamesEMcclure
2ea3d8d491 playing with spock Gpu test 2021-07-02 10:25:38 -04:00
James E McClure
6838cabbaf Merge branch 'Spock' of github.com:JamesEMcClure/LBPM-WIA into Spock 2021-07-01 14:28:47 -04:00
Mark Berrill
1d07c1f860 Modifying GPU aware MPI test to send multiple messages 2021-06-28 13:50:45 -04:00
James E McClure
5fabda67c5 Merge branch 'Spock' of github.com:JamesEMcClure/LBPM-WIA into Spock 2021-06-25 16:50:59 -04:00
James E McClure
51ee286951 some changes to configure for spock 2021-06-25 16:50:54 -04:00
Mark Berrill
d6b0f45710 Adding simple test of GPU aware MPI 2021-06-24 16:12:23 -04:00
Mark Berrill
d9746d575b Fixing potential bugs with communication 2021-06-24 14:42:03 -04:00
James E McClure
ec4f6fedac update configure & test scripts for spock 2021-06-18 17:17:52 -04:00
James E McClure
611bd1c30c update spock build based on olcf docs 2021-06-18 17:04:18 -04:00
James E McClure
9ae7d78f2d testing communication on spock 2021-06-16 20:05:14 -04:00
James E McClure
fc4d79ca9f add hip slipping bc 2021-06-16 16:58:55 -04:00
James E McClure
098ceae2c8 get new models to build with hip 2021-06-16 16:36:53 -04:00
James E McClure
399191f8dc Merge branch 'master' into FOM 2021-06-16 16:24:13 -04:00
James E McClure
00c19e8f96 add spock scripts to FOM 2021-06-16 15:13:34 -04:00
James E McClure
d654703710 Merge branch 'FOM' of github.com:JamesEMcClure/LBPM-WIA into FOM 2021-05-12 13:21:11 -04:00
James E McClure
1068e19b37 Merge branch 'FOM' of github.com:JamesEMcClure/LBPM-WIA into FOM 2021-03-25 17:31:46 -04:00
James E McClure
8a55ae8d6a summit configure 2021-03-25 17:31:43 -04:00
164 changed files with 48046 additions and 8149 deletions

@@ -16,8 +16,7 @@ jobs:
LBPM_SILO_DIR: /home/runner/extlib/silo
MPI_DIR: /home/runner/.openmpi
steps:
steps:
- name: download dependencies
run: |
echo $LBPM_ZLIB_DIR
@@ -29,7 +28,7 @@ jobs:
sudo apt-get update -y
wget https://bitbucket.org/AdvancedMultiPhysics/tpl-builder/downloads/silo-4.10.2.tar.gz
wget https://www.zlib.net/zlib-1.2.11.tar.gz
wget https://www.zlib.net/fossils/zlib-1.2.11.tar.gz
wget https://support.hdfgroup.org/ftp/HDF5/releases/hdf5-1.8/hdf5-1.8.12/src/hdf5-1.8.12.tar.gz
#wget https://support.hdfgroup.org/ftp/HDF5/releases/hdf5-1.8/hdf5-1.8.10/src/hdf5-1.8.10.tar.gz

@@ -119,27 +119,16 @@ ADD_DISTCLEAN( analysis null_timer tests liblbpm-wia.* cpu gpu cuda hip example
# Check for CUDA
CHECK_ENABLE_FLAG( USE_CUDA 0 )
CHECK_ENABLE_FLAG( USE_HIP 0 )
CHECK_ENABLE_FLAG( USE_SYCL 0 )
NULL_USE( CMAKE_CUDA_FLAGS )
IF ( USE_CUDA )
ADD_DEFINITIONS( -DUSE_CUDA )
ENABLE_LANGUAGE( CUDA )
ELSEIF ( USE_HIP )
IF ( NOT DEFINED HIP_PATH )
IF ( NOT DEFINED ENV{HIP_PATH} )
SET( HIP_PATH "/opt/rocm/hip" CACHE PATH "Path to which HIP has been installed" )
ELSE()
SET( HIP_PATH $ENV{HIP_PATH} CACHE PATH "Path to which HIP has been installed" )
ENDIF()
ENDIF()
SET( CMAKE_MODULE_PATH "${HIP_PATH}/cmake" ${CMAKE_MODULE_PATH} )
FIND_PACKAGE( HIP REQUIRED )
FIND_PACKAGE( CUDA QUIET )
MESSAGE( "HIP Found")
MESSAGE( " HIP version: ${HIP_VERSION_STRING}")
MESSAGE( " HIP platform: ${HIP_PLATFORM}")
MESSAGE( " HIP Include Path: ${HIP_INCLUDE_DIRS}")
MESSAGE( " HIP Libraries: ${HIP_LIBRARIES}")
ENABLE_LANGUAGE( HIP )
ADD_DEFINITIONS( -DUSE_HIP )
ELSEIF ( USE_SYCL )
ADD_DEFINITIONS( -DUSE_SYCL )
ENDIF()
@@ -180,8 +169,9 @@ IF ( NOT ONLY_BUILD_DOCS )
IF ( USE_CUDA )
ADD_PACKAGE_SUBDIRECTORY( cuda )
ELSEIF ( USE_HIP )
ADD_SUBDIRECTORY( hip )
SET( LBPM_LIBRARIES lbpm-hip lbpm-wia )
ADD_PACKAGE_SUBDIRECTORY( hip )
ELSEIF ( USE_SYCL )
ADD_PACKAGE_SUBDIRECTORY( sycl )
ELSE()
ADD_PACKAGE_SUBDIRECTORY( cpu )
ENDIF()
@@ -190,5 +180,6 @@ IF ( NOT ONLY_BUILD_DOCS )
ADD_SUBDIRECTORY( example )
#ADD_SUBDIRECTORY( workflows )
INSTALL_PROJ_LIB()
CONFIGURE_FILE( ${CMAKE_CURRENT_SOURCE_DIR}/ValgrindSuppresionFile ${CMAKE_CURRENT_BINARY_DIR}/test/ValgrindSuppresionFile COPYONLY )
ENDIF()
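
The CMake hunk above passes exactly one of `-DUSE_CUDA`, `-DUSE_HIP`, or `-DUSE_SYCL` to the compiler, and falls back to the `cpu` directory when none is set. A minimal sketch of how source code could branch on those definitions is shown below. The helper name `lbpm_backend_name` is hypothetical and does not appear in the repository; only the three macro names come from the diff.

```cpp
#include <string>

// Hypothetical helper (not part of LBPM): reports which backend the build
// selected through the USE_CUDA / USE_HIP / USE_SYCL compile definitions.
// The branch order mirrors the IF / ELSEIF / ELSE chain in CMakeLists.txt.
inline std::string lbpm_backend_name() {
#if defined(USE_CUDA)
    return "cuda";
#elif defined(USE_HIP)
    return "hip";
#elif defined(USE_SYCL)
    return "sycl";
#else
    return "cpu"; // no accelerator flag set: CPU fallback
#endif
}
```

Compiling this without any of the three definitions selects the CPU branch, which matches the `ELSE()` case in the build script.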

@@ -1,6 +1,5 @@
#include "IO/PackData.h"
#include <string.h>
#include <string>
/********************************************************

@@ -5,7 +5,7 @@
#include <map>
#include <set>
#include <vector>
#include <cstddef>
//! Template function to return the buffer size required to pack a class
template<class TYPE>

Makefile_sycl.dpct (normal file, 1695 changes)

File diff suppressed because it is too large

@@ -1263,7 +1263,7 @@ static int backtrace_thread(
if ( tid == pthread_self() ) {
count = ::backtrace( buffer, size );
} else {
// Note: this will get the backtrace, but terminates the thread in the process!!!
// Send a signal to the desired thread to get the call stack
StackTrace_mutex.lock();
struct sigaction sa;
sigfillset( &sa.sa_mask );

@@ -1,52 +1,225 @@
# ACML suppressions
# To run valgrind:
# mpirun -np 2 valgrind --leak-check=full --show-leak-kinds=all --track-origins=yes --verbose --suppressions=ValgrindSuppresionFile --log-file=valgrind-out.txt ./lbpm_nernst_planck_cell_simulator test.db
# MPI supressions
{
IdentifyCPUCond
MPI_init_cond
Memcheck:Cond
...
fun:acmlcpuid2
fun:PMPI_Init
...
}
{
IdentifyCPUValue
MPI_init_value
Memcheck:Value8
...
fun:acmlcpuid_once
fun:acmlcpuid2
fun:PMPI_Init
...
}
# MPI suppressions
{
HYD_pmci_wait_for_completion
Memcheck:Leak
...
fun:HYD_pmci_wait_for_completion
fun:main
}
{
HYDT_dmxu_poll_wait_for_event
Memcheck:Leak
...
fun:HYDT_dmxu_poll_wait_for_event
fun:main
}
{
PMPI_Init
Memcheck:Leak
MPI_init_addr16
Memcheck:Addr16
...
fun:PMPI_Init
fun:main
...
}
{
MPI_init_addr8
Memcheck:Addr8
...
fun:PMPI_Init
...
}
{
MPI_init_addr4
Memcheck:Addr4
...
fun:PMPI_Init
...
}
{
MPI_init_addr1
Memcheck:Addr1
...
fun:PMPI_Init
...
}
{
gethostname_cond
Memcheck:Cond
...
fun:gethostbyname_r
fun:gethostbyname
...
}
{
gethostname_value
Memcheck:Value8
...
fun:gethostbyname_r
fun:gethostbyname
...
}
# System suppressions
# System errors
{
map_doit_memory
Memcheck:Cond
fun:index
fun:expand_dynamic_string_token
fun:_dl_map_object
fun:map_doit
fun:_dl_catch_error
...
}
{
expand_dynamic_string_token
Memcheck:Cond
fun:index
fun:expand_dynamic_string_token
...
fun:dl_main
fun:_dl_sysdep_start
fun:_dl_start
...
}
{
call_init
Memcheck:Leak
match-leak-kinds: reachable
...
fun:call_init
fun:_dl_init
...
}
# pthread errors
{
pthread_initialize_param
Memcheck:Param
set_robust_list(head)
fun:__pthread_initialize_minimal
fun:(below main)
}
{
pthread_initialize_cond
Memcheck:Cond
fun:__register_atfork
fun:__libc_pthread_init
fun:__pthread_initialize_minimal
fun:(below main)
}
# gfortran
{
gfortran_leak
Memcheck:Leak
match-leak-kinds: reachable
fun:malloc
obj:/usr/lib/x86_64-linux-gnu/libgfortran.so.3.0.0
...
}
# std
{
libc_cond
Memcheck:Cond
...
fun:_dl_init_paths
fun:_dl_non_dynamic_init
fun:__libc_init_first
fun:(below main)
}
{
libc_val8
Memcheck:Value8
...
fun:_dl_init_paths
fun:_dl_non_dynamic_init
fun:__libc_init_first
fun:(below main)
}
{
mallinfo_cond
Memcheck:Cond
fun:int_mallinfo
fun:mallinfo
...
}
{
mallinfo_value
Memcheck:Value8
fun:int_mallinfo
fun:mallinfo
...
}
{
int_free_cond
Memcheck:Cond
fun:_int_free
...
}
{
string_len_cond
Memcheck:Cond
fun:strlen
...
}
{
int_malloc_cond
Memcheck:Cond
fun:_int_malloc
fun:malloc
...
}
{
malloc_consolidate_malloc
Memcheck:Cond
fun:malloc_consolidate
fun:_int_malloc
...
}
{
malloc_consolidate_free
Memcheck:Cond
fun:malloc_consolidate
fun:_int_free
...
}
{
catch_cond
Memcheck:Cond
fun:__cxa_begin_catch
...
}
{
popen
Memcheck:Param
set_robust_list(head)
fun:__nptl_set_robust
fun:__libc_fork
fun:_IO_proc_open
fun:popen
...
}
{
exit
Memcheck:Value8
fun:__run_exit_handlers
...
fun:exit
...
}
{
sse42
Memcheck:Cond
fun:__strstr_sse42
...
}

@@ -49,7 +49,7 @@ ElectroChemistryAnalyzer::ElectroChemistryAnalyzer(std::shared_ptr<Domain> dm)
IonFluxElectrical_y.fill(0);
IonFluxElectrical_z.resize(Nx, Ny, Nz);
IonFluxElectrical_z.fill(0);
if (Dm->rank() == 0) {
bool WriteHeader = false;
TIMELOG = fopen("electrokinetic.csv", "r");
@@ -67,6 +67,87 @@ ElectroChemistryAnalyzer::ElectroChemistryAnalyzer(std::shared_ptr<Domain> dm)
}
}
ElectroChemistryAnalyzer::ElectroChemistryAnalyzer(ScaLBL_IonModel &IonModel)
: Dm(IonModel.Dm) {
Nx = Dm->Nx;
Ny = Dm->Ny;
Nz = Dm->Nz;
Volume = (Nx - 2) * (Ny - 2) * (Nz - 2) * Dm->nprocx() * Dm->nprocy() *
Dm->nprocz() * 1.0;
if (Dm->rank()==0) printf("Analyze system with sub-domain size = %i x %i x %i \n",Nx,Ny,Nz);
USE_MEMBRANE = IonModel.USE_MEMBRANE;
ChemicalPotential.resize(Nx, Ny, Nz);
ChemicalPotential.fill(0);
ElectricalPotential.resize(Nx, Ny, Nz);
ElectricalPotential.fill(0);
ElectricalField_x.resize(Nx, Ny, Nz);
ElectricalField_x.fill(0);
ElectricalField_y.resize(Nx, Ny, Nz);
ElectricalField_y.fill(0);
ElectricalField_z.resize(Nx, Ny, Nz);
ElectricalField_z.fill(0);
Pressure.resize(Nx, Ny, Nz);
Pressure.fill(0);
Rho.resize(Nx, Ny, Nz);
Rho.fill(0);
Vel_x.resize(Nx, Ny, Nz);
Vel_x.fill(0); // fluid velocity components
Vel_y.resize(Nx, Ny, Nz);
Vel_y.fill(0);
Vel_z.resize(Nx, Ny, Nz);
Vel_z.fill(0);
SDs.resize(Nx, Ny, Nz);
SDs.fill(0);
IonFluxDiffusive_x.resize(Nx, Ny, Nz);
IonFluxDiffusive_x.fill(0);
IonFluxDiffusive_y.resize(Nx, Ny, Nz);
IonFluxDiffusive_y.fill(0);
IonFluxDiffusive_z.resize(Nx, Ny, Nz);
IonFluxDiffusive_z.fill(0);
IonFluxAdvective_x.resize(Nx, Ny, Nz);
IonFluxAdvective_x.fill(0);
IonFluxAdvective_y.resize(Nx, Ny, Nz);
IonFluxAdvective_y.fill(0);
IonFluxAdvective_z.resize(Nx, Ny, Nz);
IonFluxAdvective_z.fill(0);
IonFluxElectrical_x.resize(Nx, Ny, Nz);
IonFluxElectrical_x.fill(0);
IonFluxElectrical_y.resize(Nx, Ny, Nz);
IonFluxElectrical_y.fill(0);
IonFluxElectrical_z.resize(Nx, Ny, Nz);
IonFluxElectrical_z.fill(0);
if (Dm->rank() == 0) {
printf("Set up analysis routines for %zu ions \n",IonModel.number_ion_species);
bool WriteHeader = false;
TIMELOG = fopen("electrokinetic.csv", "r");
if (TIMELOG != NULL)
fclose(TIMELOG);
else
WriteHeader = true;
TIMELOG = fopen("electrokinetic.csv", "a+");
if (WriteHeader) {
// If timelog is empty, write a short header to list the averages
//fprintf(TIMELOG,"--------------------------------------------------------------------------------------\n");
fprintf(TIMELOG, "timestep voltage_out voltage_in ");
fprintf(TIMELOG, "voltage_out_membrane voltage_in_membrane ");
for (size_t i=0; i<IonModel.number_ion_species; i++){
fprintf(TIMELOG, "rho_%zu_out rho_%zu_in ",i, i);
fprintf(TIMELOG, "rho_%zu_out_membrane rho_%zu_in_membrane ", i, i);
}
fprintf(TIMELOG, "count_out count_in ");
fprintf(TIMELOG, "count_out_membrane count_in_membrane\n");
}
}
}
ElectroChemistryAnalyzer::~ElectroChemistryAnalyzer() {
if (Dm->rank() == 0) {
fclose(TIMELOG);
@@ -75,6 +156,163 @@ ElectroChemistryAnalyzer::~ElectroChemistryAnalyzer() {
void ElectroChemistryAnalyzer::SetParams() {}
void ElectroChemistryAnalyzer::Membrane(ScaLBL_IonModel &Ion,
ScaLBL_Poisson &Poisson,
int timestep) {
int i, j, k;
Poisson.getElectricPotential(ElectricalPotential);
if (Dm->rank() == 0)
fprintf(TIMELOG, "%i ", timestep);
/* int iq, ip, nq, np, nqm, npm;
Ion.MembraneDistance(i,j,k); // inside (-) or outside (+) the ion
for (int link; link<Ion.IonMembrane->membraneLinkCount; link++){
int iq = Ion.IonMembrane->membraneLinks[2*link];
int ip = Ion.IonMembrane->membraneLinks[2*link+1];
iq = membrane[2*link]; ip = membrane[2*link+1];
nq = iq%Np; np = ip%Np;
nqm = Map[nq]; npm = Map[np];
}
*/
unsigned long int in_local_count, out_local_count;
unsigned long int in_global_count, out_global_count;
double value_in_local, value_out_local;
double value_in_global, value_out_global;
double value_membrane_in_local, value_membrane_out_local;
double value_membrane_in_global, value_membrane_out_global;
unsigned long int membrane_in_local_count, membrane_out_local_count;
unsigned long int membrane_in_global_count, membrane_out_global_count;
double memdist,value;
in_local_count = 0;
out_local_count = 0;
membrane_in_local_count = 0;
membrane_out_local_count = 0;
value_membrane_in_local = 0.0;
value_membrane_out_local = 0.0;
value_in_local = 0.0;
value_out_local = 0.0;
for (k = 1; k < Nz; k++) {
for (j = 1; j < Ny; j++) {
for (i = 1; i < Nx; i++) {
/* electric potential */
memdist = Ion.MembraneDistance(i,j,k);
value = ElectricalPotential(i,j,k);
if (memdist < 0.0){
// inside the membrane
if (fabs(memdist) < 1.0){
value_membrane_in_local += value;
membrane_in_local_count++;
}
value_in_local += value;
in_local_count++;
}
else {
// outside the membrane
if (fabs(memdist) < 1.0){
value_membrane_out_local += value;
membrane_out_local_count++;
}
value_out_local += value;
out_local_count++;
}
}
}
}
/* these only need to be computed the first time through */
out_global_count = Dm->Comm.sumReduce(out_local_count);
in_global_count = Dm->Comm.sumReduce(in_local_count);
membrane_out_global_count = Dm->Comm.sumReduce(membrane_out_local_count);
membrane_in_global_count = Dm->Comm.sumReduce(membrane_in_local_count);
value_out_global = Dm->Comm.sumReduce(value_out_local);
value_in_global = Dm->Comm.sumReduce(value_in_local);
value_membrane_out_global = Dm->Comm.sumReduce(value_membrane_out_local);
value_membrane_in_global = Dm->Comm.sumReduce(value_membrane_in_local);
value_out_global /= out_global_count;
value_in_global /= in_global_count;
value_membrane_out_global /= membrane_out_global_count;
value_membrane_in_global /= membrane_in_global_count;
if (Dm->rank() == 0) {
fprintf(TIMELOG, "%.8g ", value_out_global);
fprintf(TIMELOG, "%.8g ", value_in_global);
fprintf(TIMELOG, "%.8g ", value_membrane_out_global);
fprintf(TIMELOG, "%.8g ", value_membrane_in_global);
}
value_membrane_in_local = 0.0;
value_membrane_out_local = 0.0;
value_in_local = 0.0;
value_out_local = 0.0;
for (size_t ion = 0; ion < Ion.number_ion_species; ion++) {
Ion.getIonConcentration(Rho, ion);
value_membrane_in_local = 0.0;
value_membrane_out_local = 0.0;
value_in_local = 0.0;
value_out_local = 0.0;
for (k = 1; k < Nz; k++) {
for (j = 1; j < Ny; j++) {
for (i = 1; i < Nx; i++) {
/* ion concentration */
memdist = Ion.MembraneDistance(i,j,k);
value = Rho(i,j,k);
if (memdist < 0.0){
// inside the membrane
if (fabs(memdist) < 1.0){
value_membrane_in_local += value;
}
value_in_local += value;
}
else {
// outside the membrane
if (fabs(memdist) < 1.0){
value_membrane_out_local += value;
}
value_out_local += value;
}
}
}
}
value_out_global = Dm->Comm.sumReduce(value_out_local);
value_in_global = Dm->Comm.sumReduce(value_in_local);
value_membrane_out_global = Dm->Comm.sumReduce(value_membrane_out_local);
value_membrane_in_global = Dm->Comm.sumReduce(value_membrane_in_local);
value_out_global /= out_global_count;
value_in_global /= in_global_count;
value_membrane_out_global /= membrane_out_global_count;
value_membrane_in_global /= membrane_in_global_count;
if (Dm->rank() == 0) {
fprintf(TIMELOG, "%.8g ", value_out_global);
fprintf(TIMELOG, "%.8g ", value_in_global);
fprintf(TIMELOG, "%.8g ", value_membrane_out_global);
fprintf(TIMELOG, "%.8g ", value_membrane_in_global);
}
}
if (Dm->rank() == 0) {
fprintf(TIMELOG, "%lu ", out_global_count);
fprintf(TIMELOG, "%lu ", in_global_count);
fprintf(TIMELOG, "%lu ", membrane_out_global_count);
fprintf(TIMELOG, "%lu\n", membrane_in_global_count);
fflush(TIMELOG);
}
}
void ElectroChemistryAnalyzer::Basic(ScaLBL_IonModel &Ion,
ScaLBL_Poisson &Poisson,
ScaLBL_StokesModel &Stokes, int timestep) {
@@ -595,3 +833,408 @@ void ElectroChemistryAnalyzer::WriteVis(ScaLBL_IonModel &Ion,
}
*/
}
void ElectroChemistryAnalyzer::Basic(ScaLBL_IonModel &Ion,
ScaLBL_Poisson &Poisson,
int timestep) {
int i, j, k;
double Vin = 0.0;
double Vout = 0.0;
Poisson.getElectricPotential(ElectricalPotential);
/* local sub-domain averages */
double *rho_avg_local;
double *rho_mu_avg_local;
double *rho_mu_fluctuation_local;
double *rho_psi_avg_local;
double *rho_psi_fluctuation_local;
/* global averages */
double *rho_avg_global;
double *rho_mu_avg_global;
double *rho_mu_fluctuation_global;
double *rho_psi_avg_global;
double *rho_psi_fluctuation_global;
/* Get the distance to the membrane */
if (Ion.USE_MEMBRANE){
//Ion.MembraneDistance;
}
/* local sub-domain averages */
rho_avg_local = new double[Ion.number_ion_species];
rho_mu_avg_local = new double[Ion.number_ion_species];
rho_mu_fluctuation_local = new double[Ion.number_ion_species];
rho_psi_avg_local = new double[Ion.number_ion_species];
rho_psi_fluctuation_local = new double[Ion.number_ion_species];
/* global averages */
rho_avg_global = new double[Ion.number_ion_species];
rho_mu_avg_global = new double[Ion.number_ion_species];
rho_mu_fluctuation_global = new double[Ion.number_ion_species];
rho_psi_avg_global = new double[Ion.number_ion_species];
rho_psi_fluctuation_global = new double[Ion.number_ion_species];
for (size_t ion = 0; ion < Ion.number_ion_species; ion++) {
rho_avg_local[ion] = 0.0;
rho_mu_avg_local[ion] = 0.0;
rho_psi_avg_local[ion] = 0.0;
Ion.getIonConcentration(Rho, ion);
/* Compute averages for each ion */
for (k = 1; k < Nz; k++) {
for (j = 1; j < Ny; j++) {
for (i = 1; i < Nx; i++) {
rho_avg_local[ion] += Rho(i, j, k);
rho_mu_avg_local[ion] += Rho(i, j, k) * Rho(i, j, k);
rho_psi_avg_local[ion] +=
Rho(i, j, k) * ElectricalPotential(i, j, k);
}
}
}
rho_avg_global[ion] = Dm->Comm.sumReduce(rho_avg_local[ion]) / Volume;
rho_mu_avg_global[ion] =
Dm->Comm.sumReduce(rho_mu_avg_local[ion]) / Volume;
rho_psi_avg_global[ion] =
Dm->Comm.sumReduce(rho_psi_avg_local[ion]) / Volume;
if (rho_avg_global[ion] > 0.0) {
rho_mu_avg_global[ion] /= rho_avg_global[ion];
rho_psi_avg_global[ion] /= rho_avg_global[ion];
}
}
for (size_t ion = 0; ion < Ion.number_ion_species; ion++) {
rho_mu_fluctuation_local[ion] = 0.0;
rho_psi_fluctuation_local[ion] = 0.0;
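/* Reload this ion's concentration; Rho otherwise still holds the last ion from the loop above */
Ion.getIonConcentration(Rho, ion);
/* Compute fluctuations for each ion */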
for (k = 1; k < Nz; k++) {
for (j = 1; j < Ny; j++) {
for (i = 1; i < Nx; i++) {
rho_mu_fluctuation_local[ion] +=
(Rho(i, j, k) * Rho(i, j, k) - rho_mu_avg_global[ion]);
rho_psi_fluctuation_local[ion] +=
(Rho(i, j, k) * ElectricalPotential(i, j, k) -
rho_psi_avg_global[ion]);
}
}
}
rho_mu_fluctuation_global[ion] =
Dm->Comm.sumReduce(rho_mu_fluctuation_local[ion]);
rho_psi_fluctuation_global[ion] =
Dm->Comm.sumReduce(rho_psi_fluctuation_local[ion]);
}
if (Dm->rank() == 0) {
fprintf(TIMELOG, "%i ", timestep);
for (size_t ion = 0; ion < Ion.number_ion_species; ion++) {
fprintf(TIMELOG, "%.8g ", rho_avg_global[ion]);
fprintf(TIMELOG, "%.8g ", rho_mu_avg_global[ion]);
fprintf(TIMELOG, "%.8g ", rho_psi_avg_global[ion]);
fprintf(TIMELOG, "%.8g ", rho_mu_fluctuation_global[ion]);
fprintf(TIMELOG, "%.8g ", rho_psi_fluctuation_global[ion]);
}
fprintf(TIMELOG, "%.8g %.8g\n", Vin, Vout);
fflush(TIMELOG);
}
/* else{
fprintf(TIMELOG,"%i ",timestep);
for (int ion=0; ion<Ion.number_ion_species; ion++){
fprintf(TIMELOG,"%.8g ",rho_avg_local[ion]);
fprintf(TIMELOG,"%.8g ",rho_mu_avg_local[ion]);
fprintf(TIMELOG,"%.8g ",rho_psi_avg_local[ion]);
fprintf(TIMELOG,"%.8g ",rho_mu_fluctuation_local[ion]);
fprintf(TIMELOG,"%.8g ",rho_psi_fluctuation_local[ion]);
}
fflush(TIMELOG);
} */
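/* release the per-call work arrays allocated with new[] above */
delete[] rho_avg_local;
delete[] rho_mu_avg_local;
delete[] rho_mu_fluctuation_local;
delete[] rho_psi_avg_local;
delete[] rho_psi_fluctuation_local;
delete[] rho_avg_global;
delete[] rho_mu_avg_global;
delete[] rho_mu_fluctuation_global;
delete[] rho_psi_avg_global;
delete[] rho_psi_fluctuation_global;
}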
void ElectroChemistryAnalyzer::WriteVis(ScaLBL_IonModel &Ion,
ScaLBL_Poisson &Poisson,
std::shared_ptr<Database> input_db,
int timestep) {
auto vis_db = input_db->getDatabase("Visualization");
char VisName[40];
auto format = vis_db->getWithDefault<string>( "format", "hdf5" );
std::vector<IO::MeshDataStruct> visData;
fillHalo<double> fillData(Dm->Comm, Dm->rank_info,
{Dm->Nx - 2, Dm->Ny - 2, Dm->Nz - 2}, {1, 1, 1},
0, 1);
IO::initialize("",format,"false");
// Create the MeshDataStruct
visData.resize(1);
visData[0].meshName = "domain";
visData[0].mesh =
std::make_shared<IO::DomainMesh>(Dm->rank_info, Dm->Nx - 2, Dm->Ny - 2,
Dm->Nz - 2, Dm->Lx, Dm->Ly, Dm->Lz);
//electric potential
auto ElectricPotentialVar = std::make_shared<IO::Variable>();
//electric field
auto ElectricFieldVar_x = std::make_shared<IO::Variable>();
auto ElectricFieldVar_y = std::make_shared<IO::Variable>();
auto ElectricFieldVar_z = std::make_shared<IO::Variable>();
//ion concentration
std::vector<shared_ptr<IO::Variable>> IonConcentration;
for (size_t ion = 0; ion < Ion.number_ion_species; ion++) {
IonConcentration.push_back(std::make_shared<IO::Variable>());
}
// diffusive ion flux
std::vector<shared_ptr<IO::Variable>> IonFluxDiffusive;
for (size_t ion = 0; ion < Ion.number_ion_species; ion++) {
//push in x-,y-, and z-component for each ion species
IonFluxDiffusive.push_back(std::make_shared<IO::Variable>());
IonFluxDiffusive.push_back(std::make_shared<IO::Variable>());
IonFluxDiffusive.push_back(std::make_shared<IO::Variable>());
}
// electro-migrational ion flux
std::vector<shared_ptr<IO::Variable>> IonFluxElectrical;
for (size_t ion = 0; ion < Ion.number_ion_species; ion++) {
//push in x-,y-, and z-component for each ion species
IonFluxElectrical.push_back(std::make_shared<IO::Variable>());
IonFluxElectrical.push_back(std::make_shared<IO::Variable>());
IonFluxElectrical.push_back(std::make_shared<IO::Variable>());
}
//--------------------------------------------------------------------------------------------------------------------
//-------------------------------------Create Names for Variables------------------------------------------------------
if (vis_db->getWithDefault<bool>("save_electric_potential", true)) {
ElectricPotentialVar->name = "ElectricPotential";
ElectricPotentialVar->type = IO::VariableType::VolumeVariable;
ElectricPotentialVar->dim = 1;
ElectricPotentialVar->data.resize(Dm->Nx - 2, Dm->Ny - 2, Dm->Nz - 2);
visData[0].vars.push_back(ElectricPotentialVar);
}
if (vis_db->getWithDefault<bool>("save_concentration", true)) {
for (size_t ion = 0; ion < Ion.number_ion_species; ion++) {
sprintf(VisName, "IonConcentration_%zu", ion + 1);
IonConcentration[ion]->name = VisName;
IonConcentration[ion]->type = IO::VariableType::VolumeVariable;
IonConcentration[ion]->dim = 1;
IonConcentration[ion]->data.resize(Dm->Nx - 2, Dm->Ny - 2,
Dm->Nz - 2);
visData[0].vars.push_back(IonConcentration[ion]);
}
}
if (vis_db->getWithDefault<bool>("save_ion_flux_diffusive", false)) {
for (size_t ion = 0; ion < Ion.number_ion_species; ion++) {
// x-component of diffusive flux
sprintf(VisName, "Ion%zu_FluxDiffusive_x", ion + 1);
IonFluxDiffusive[3 * ion + 0]->name = VisName;
IonFluxDiffusive[3 * ion + 0]->type =
IO::VariableType::VolumeVariable;
IonFluxDiffusive[3 * ion + 0]->dim = 1;
IonFluxDiffusive[3 * ion + 0]->data.resize(Dm->Nx - 2, Dm->Ny - 2,
Dm->Nz - 2);
visData[0].vars.push_back(IonFluxDiffusive[3 * ion + 0]);
// y-component of diffusive flux
sprintf(VisName, "Ion%zu_FluxDiffusive_y", ion + 1);
IonFluxDiffusive[3 * ion + 1]->name = VisName;
IonFluxDiffusive[3 * ion + 1]->type =
IO::VariableType::VolumeVariable;
IonFluxDiffusive[3 * ion + 1]->dim = 1;
IonFluxDiffusive[3 * ion + 1]->data.resize(Dm->Nx - 2, Dm->Ny - 2,
Dm->Nz - 2);
visData[0].vars.push_back(IonFluxDiffusive[3 * ion + 1]);
// z-component of diffusive flux
sprintf(VisName, "Ion%zu_FluxDiffusive_z", ion + 1);
IonFluxDiffusive[3 * ion + 2]->name = VisName;
IonFluxDiffusive[3 * ion + 2]->type =
IO::VariableType::VolumeVariable;
IonFluxDiffusive[3 * ion + 2]->dim = 1;
IonFluxDiffusive[3 * ion + 2]->data.resize(Dm->Nx - 2, Dm->Ny - 2,
Dm->Nz - 2);
visData[0].vars.push_back(IonFluxDiffusive[3 * ion + 2]);
}
}
if (vis_db->getWithDefault<bool>("save_ion_flux_electrical", false)) {
for (size_t ion = 0; ion < Ion.number_ion_species; ion++) {
// x-component of electro-migrational flux
sprintf(VisName, "Ion%zu_FluxElectrical_x", ion + 1);
IonFluxElectrical[3 * ion + 0]->name = VisName;
IonFluxElectrical[3 * ion + 0]->type =
IO::VariableType::VolumeVariable;
IonFluxElectrical[3 * ion + 0]->dim = 1;
IonFluxElectrical[3 * ion + 0]->data.resize(Dm->Nx - 2, Dm->Ny - 2,
Dm->Nz - 2);
visData[0].vars.push_back(IonFluxElectrical[3 * ion + 0]);
// y-component of electro-migrational flux
sprintf(VisName, "Ion%zu_FluxElectrical_y", ion + 1);
IonFluxElectrical[3 * ion + 1]->name = VisName;
IonFluxElectrical[3 * ion + 1]->type =
IO::VariableType::VolumeVariable;
IonFluxElectrical[3 * ion + 1]->dim = 1;
IonFluxElectrical[3 * ion + 1]->data.resize(Dm->Nx - 2, Dm->Ny - 2,
Dm->Nz - 2);
visData[0].vars.push_back(IonFluxElectrical[3 * ion + 1]);
// z-component of electro-migrational flux
sprintf(VisName, "Ion%zu_FluxElectrical_z", ion + 1);
IonFluxElectrical[3 * ion + 2]->name = VisName;
IonFluxElectrical[3 * ion + 2]->type =
IO::VariableType::VolumeVariable;
IonFluxElectrical[3 * ion + 2]->dim = 1;
IonFluxElectrical[3 * ion + 2]->data.resize(Dm->Nx - 2, Dm->Ny - 2,
Dm->Nz - 2);
visData[0].vars.push_back(IonFluxElectrical[3 * ion + 2]);
}
}
if (vis_db->getWithDefault<bool>("save_electric_field", false)) {
ElectricFieldVar_x->name = "ElectricField_x";
ElectricFieldVar_x->type = IO::VariableType::VolumeVariable;
ElectricFieldVar_x->dim = 1;
ElectricFieldVar_x->data.resize(Dm->Nx - 2, Dm->Ny - 2, Dm->Nz - 2);
visData[0].vars.push_back(ElectricFieldVar_x);
ElectricFieldVar_y->name = "ElectricField_y";
ElectricFieldVar_y->type = IO::VariableType::VolumeVariable;
ElectricFieldVar_y->dim = 1;
ElectricFieldVar_y->data.resize(Dm->Nx - 2, Dm->Ny - 2, Dm->Nz - 2);
visData[0].vars.push_back(ElectricFieldVar_y);
ElectricFieldVar_z->name = "ElectricField_z";
ElectricFieldVar_z->type = IO::VariableType::VolumeVariable;
ElectricFieldVar_z->dim = 1;
ElectricFieldVar_z->data.resize(Dm->Nx - 2, Dm->Ny - 2, Dm->Nz - 2);
visData[0].vars.push_back(ElectricFieldVar_z);
}
//--------------------------------------------------------------------------------------------------------------------
//------------------------------------Save All Variables--------------------------------------------------------------
if (vis_db->getWithDefault<bool>("save_electric_potential", true)) {
ASSERT(visData[0].vars[0]->name == "ElectricPotential");
Poisson.getElectricPotential(ElectricalPotential);
Array<double> &ElectricPotentialData = visData[0].vars[0]->data;
fillData.copy(ElectricalPotential, ElectricPotentialData);
}
if (vis_db->getWithDefault<bool>("save_concentration", true)) {
for (size_t ion = 0; ion < Ion.number_ion_species; ion++) {
sprintf(VisName, "IonConcentration_%zu", ion + 1);
//IonConcentration[ion]->name = VisName;
ASSERT(visData[0].vars[1 + ion]->name == VisName);
Array<double> &IonConcentrationData =
visData[0].vars[1 + ion]->data;
Ion.getIonConcentration(Rho, ion);
fillData.copy(Rho, IonConcentrationData);
}
}
if (vis_db->getWithDefault<bool>("save_ion_flux_diffusive", false)) {
for (size_t ion = 0; ion < Ion.number_ion_species; ion++) {
// x-component of diffusive flux
sprintf(VisName, "Ion%zu_FluxDiffusive_x", ion + 1);
//IonFluxDiffusive[3*ion+0]->name = VisName;
ASSERT(visData[0]
.vars[4 + Ion.number_ion_species + 3 * ion + 0]
->name == VisName);
// y-component of diffusive flux
sprintf(VisName, "Ion%zu_FluxDiffusive_y", ion + 1);
//IonFluxDiffusive[3*ion+1]->name = VisName;
ASSERT(visData[0]
.vars[4 + Ion.number_ion_species + 3 * ion + 1]
->name == VisName);
// z-component of diffusive flux
sprintf(VisName, "Ion%zu_FluxDiffusive_z", ion + 1);
//IonFluxDiffusive[3*ion+2]->name = VisName;
ASSERT(visData[0]
.vars[4 + Ion.number_ion_species + 3 * ion + 2]
->name == VisName);
Array<double> &IonFluxData_x =
visData[0].vars[4 + Ion.number_ion_species + 3 * ion + 0]->data;
Array<double> &IonFluxData_y =
visData[0].vars[4 + Ion.number_ion_species + 3 * ion + 1]->data;
Array<double> &IonFluxData_z =
visData[0].vars[4 + Ion.number_ion_species + 3 * ion + 2]->data;
Ion.getIonFluxDiffusive(IonFluxDiffusive_x, IonFluxDiffusive_y,
IonFluxDiffusive_z, ion);
fillData.copy(IonFluxDiffusive_x, IonFluxData_x);
fillData.copy(IonFluxDiffusive_y, IonFluxData_y);
fillData.copy(IonFluxDiffusive_z, IonFluxData_z);
}
}
if (vis_db->getWithDefault<bool>("save_ion_flux_electrical", false)) {
for (size_t ion = 0; ion < Ion.number_ion_species; ion++) {
// x-component of electro-migrational flux
sprintf(VisName, "Ion%zu_FluxElectrical_x", ion + 1);
//IonFluxDiffusive[3*ion+0]->name = VisName;
ASSERT(visData[0]
.vars[4 + Ion.number_ion_species * (1 + 6) + 3 * ion + 0]
->name == VisName);
// y-component of electro-migrational flux
sprintf(VisName, "Ion%zu_FluxElectrical_y", ion + 1);
//IonFluxDiffusive[3*ion+1]->name = VisName;
ASSERT(visData[0]
.vars[4 + Ion.number_ion_species * (1 + 6) + 3 * ion + 1]
->name == VisName);
// z-component of electro-migrational flux
sprintf(VisName, "Ion%zu_FluxElectrical_z", ion + 1);
//IonFluxDiffusive[3*ion+2]->name = VisName;
ASSERT(visData[0]
.vars[4 + Ion.number_ion_species * (1 + 6) + 3 * ion + 2]
->name == VisName);
Array<double> &IonFluxData_x =
visData[0]
.vars[4 + Ion.number_ion_species * (1 + 6) + 3 * ion + 0]
->data;
Array<double> &IonFluxData_y =
visData[0]
.vars[4 + Ion.number_ion_species * (1 + 6) + 3 * ion + 1]
->data;
Array<double> &IonFluxData_z =
visData[0]
.vars[4 + Ion.number_ion_species * (1 + 6) + 3 * ion + 2]
->data;
Ion.getIonFluxElectrical(IonFluxElectrical_x, IonFluxElectrical_y,
IonFluxElectrical_z, ion);
fillData.copy(IonFluxElectrical_x, IonFluxData_x);
fillData.copy(IonFluxElectrical_y, IonFluxData_y);
fillData.copy(IonFluxElectrical_z, IonFluxData_z);
}
}
if (vis_db->getWithDefault<bool>("save_electric_field", false)) {
ASSERT(
visData[0].vars[4 + Ion.number_ion_species * (1 + 9) + 0]->name ==
"ElectricField_x");
ASSERT(
visData[0].vars[4 + Ion.number_ion_species * (1 + 9) + 1]->name ==
"ElectricField_y");
ASSERT(
visData[0].vars[4 + Ion.number_ion_species * (1 + 9) + 2]->name ==
"ElectricField_z");
Poisson.getElectricField(ElectricalField_x, ElectricalField_y,
ElectricalField_z);
Array<double> &ElectricalFieldxData =
visData[0].vars[4 + Ion.number_ion_species * (1 + 9) + 0]->data;
Array<double> &ElectricalFieldyData =
visData[0].vars[4 + Ion.number_ion_species * (1 + 9) + 1]->data;
Array<double> &ElectricalFieldzData =
visData[0].vars[4 + Ion.number_ion_species * (1 + 9) + 2]->data;
fillData.copy(ElectricalField_x, ElectricalFieldxData);
fillData.copy(ElectricalField_y, ElectricalFieldyData);
fillData.copy(ElectricalField_z, ElectricalFieldzData);
}
if (vis_db->getWithDefault<bool>("write_silo", true))
IO::writeData(timestep, visData, Dm->Comm);
//--------------------------------------------------------------------------------------------------------------------
/* if (vis_db->getWithDefault<bool>( "save_8bit_raw", true )){
char CurrentIDFilename[40];
sprintf(CurrentIDFilename,"id_t%d.raw",timestep);
Averages.AggregateLabels(CurrentIDFilename);
}
*/
}
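
The `Membrane()` routine in this file classifies each lattice site by the sign of its membrane distance (negative inside, otherwise outside), tracks a thin shell where the absolute distance is below 1.0, and divides the reduced sums by the reduced counts. The sketch below restates that bookkeeping for a single process on flat arrays. `RegionAverages`, `safe_mean`, and `accumulate` are hypothetical names, and the zero-count guard is an added assumption: the diff divides unconditionally, which would produce NaN or infinity if a region were empty.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Running sums and counts for the four regions reported by Membrane():
// all sites inside / outside, plus the thin shells with |distance| < 1.
struct RegionAverages {
    double sum_in = 0.0, sum_out = 0.0;
    double sum_in_shell = 0.0, sum_out_shell = 0.0;
    unsigned long n_in = 0, n_out = 0, n_in_shell = 0, n_out_shell = 0;

    // Classify one sample by its signed membrane distance
    // (negative = inside, otherwise outside), as in the diff.
    void add(double memdist, double value) {
        const bool in_shell = std::fabs(memdist) < 1.0;
        if (memdist < 0.0) {
            if (in_shell) { sum_in_shell += value; ++n_in_shell; }
            sum_in += value;
            ++n_in;
        } else {
            if (in_shell) { sum_out_shell += value; ++n_out_shell; }
            sum_out += value;
            ++n_out;
        }
    }
};

// Assumption: report 0 for an empty region instead of dividing by zero.
inline double safe_mean(double sum, unsigned long count) {
    return count > 0 ? sum / static_cast<double>(count) : 0.0;
}

// Hypothetical single-process driver over flat arrays of distances and values.
inline RegionAverages accumulate(const std::vector<double> &dist,
                                 const std::vector<double> &val) {
    RegionAverages r;
    for (std::size_t n = 0; n < dist.size() && n < val.size(); ++n)
        r.add(dist[n], val[n]);
    return r;
}
```

In the parallel code the sums and counts would each pass through `Dm->Comm.sumReduce` before the division, so the guard would apply to the global counts rather than the local ones.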

@@ -29,6 +29,8 @@ public:
double nu_n, nu_w;
double gamma_wn, beta;
double Fx, Fy, Fz;
bool USE_MEMBRANE;
//...........................................................................
int Nx, Ny, Nz;
@@ -54,13 +56,16 @@ public:
DoubleArray IonFluxElectrical_z;
ElectroChemistryAnalyzer(std::shared_ptr<Domain> Dm);
ElectroChemistryAnalyzer( ScaLBL_IonModel &IonModel);
~ElectroChemistryAnalyzer();
void SetParams();
void Basic(ScaLBL_IonModel &Ion, ScaLBL_Poisson &Poisson,
ScaLBL_StokesModel &Stokes, int timestep);
void Basic(ScaLBL_IonModel &Ion, ScaLBL_Poisson &Poisson, ScaLBL_StokesModel &Stokes, int timestep);
void Membrane(ScaLBL_IonModel &Ion, ScaLBL_Poisson &Poisson, int timestep);
void WriteVis(ScaLBL_IonModel &Ion, ScaLBL_Poisson &Poisson,
ScaLBL_StokesModel &Stokes,std::shared_ptr<Database> input_db, int timestep);
void Basic(ScaLBL_IonModel &Ion, ScaLBL_Poisson &Poisson, int timestep);
void WriteVis(ScaLBL_IonModel &Ion, ScaLBL_Poisson &Poisson,
ScaLBL_StokesModel &Stokes,
std::shared_ptr<Database> input_db, int timestep);
private:

@@ -17,7 +17,9 @@ FlowAdaptor::FlowAdaptor(ScaLBL_ColorModel &M) {
phi_t.fill(0); // time derivative for the phase indicator field
}
FlowAdaptor::~FlowAdaptor() {}
FlowAdaptor::~FlowAdaptor() {
}
double FlowAdaptor::ImageInit(ScaLBL_ColorModel &M, std::string Filename) {
int rank = M.rank;
@@ -71,6 +73,7 @@ double FlowAdaptor::ImageInit(ScaLBL_ColorModel &M, std::string Filename) {
ScaLBL_CopyToHost(M.Averages->Phi.data(), M.Phi,
Nx * Ny * Nz * sizeof(double));
delete PhaseLabel;
double saturation = Count / PoreCount;
return saturation;
}
@@ -234,6 +237,13 @@ double FlowAdaptor::UpdateFractionalFlow(ScaLBL_ColorModel &M) {
//ScaLBL_CopyToDevice(Phi,phase.data(),7*Np*sizeof(double));
ScaLBL_CopyToDevice(M.Aq, Aq_tmp, 7 * Np * sizeof(double));
ScaLBL_CopyToDevice(M.Bq, Bq_tmp, 7 * Np * sizeof(double));
delete Aq_tmp;
delete Bq_tmp;
delete Vel_x;
delete Vel_y;
delete Vel_z;
delete Phase;
return (TOTAL_MASS_CHANGE);
}
@@ -403,7 +413,6 @@ double FlowAdaptor::ShellAggregation(ScaLBL_ColorModel &M,
}
}
}
if (rank == 0)
printf("Pathway volume / next largest ganglion %f \n",
volume_connected / second_biggest);
@@ -585,6 +594,8 @@ double FlowAdaptor::SeedPhaseField(ScaLBL_ColorModel &M,
//ScaLBL_CopyToDevice(Phi,phase.data(),7*Np*sizeof(double));
ScaLBL_CopyToDevice(M.Aq, Aq_tmp, 7 * Np * sizeof(double));
ScaLBL_CopyToDevice(M.Bq, Bq_tmp, 7 * Np * sizeof(double));
delete Aq_tmp;
delete Bq_tmp;
return (mass_loss);
}

@@ -67,6 +67,7 @@ Minkowski::~Minkowski() {
void Minkowski::ComputeScalar(const DoubleArray &Field, const double isovalue) {
PROFILE_START("ComputeScalar");
Xi = Ji = Ai = 0.0;
DCEL object;
int e1, e2, e3;
@@ -160,6 +161,7 @@ void Minkowski::MeasureObject() {
* 1 - labels the rest of the
*/
//DoubleArray smooth_distance(Nx,Ny,Nz);
for (int k = 0; k < Nz; k++) {
for (int j = 0; j < Ny; j++) {
for (int i = 0; i < Nx; i++) {
@@ -168,6 +170,8 @@ void Minkowski::MeasureObject() {
}
}
CalcDist(distance, id, *Dm);
Dm->CommunicateMeshHalo(distance);
//Mean3D(distance,smooth_distance);
//Eikonal(distance, id, *Dm, 20, {true, true, true});
ComputeScalar(distance, 0.0);
@@ -179,7 +183,7 @@ void Minkowski::MeasureObject(double factor, const DoubleArray &Phi) {
*
* THIS ALGORITHM ASSUMES THAT id() is populated with phase id to distinguish objects
* 0 - labels the object
* 1 - labels the rest of the
* 1 - labels the rest
*/
for (int k = 0; k < Nz; k++) {
for (int j = 0; j < Ny; j++) {

@@ -411,6 +411,7 @@ void SubPhase::Basic() {
dir_z = 1.0;
force_mag = 1.0;
}
double Porosity = (gwb.V + gnb.V)/Dm->Volume;
double saturation = gwb.V / (gwb.V + gnb.V);
double water_flow_rate =
gwb.V * (gwb.Px * dir_x + gwb.Py * dir_y + gwb.Pz * dir_z) / gwb.M /
@@ -429,11 +430,11 @@ void SubPhase::Basic() {
//double total_flow_rate = water_flow_rate + not_water_flow_rate;
//double fractional_flow = water_flow_rate / total_flow_rate;
double h = Dm->voxel_length;
double krn = h * h * nu_n * not_water_flow_rate / force_mag;
double krw = h * h * nu_w * water_flow_rate / force_mag;
double krn = h * h * nu_n * Porosity* Porosity * not_water_flow_rate / force_mag;
double krw = h * h * nu_w * Porosity* Porosity* water_flow_rate / force_mag;
/* not counting films */
double krnf = krn - h * h * nu_n * not_water_film_flow_rate / force_mag;
double krwf = krw - h * h * nu_w * water_film_flow_rate / force_mag;
double krnf = krn - h * h * nu_n * Porosity* Porosity * not_water_film_flow_rate / force_mag;
double krwf = krw - h * h * nu_w * Porosity* Porosity * water_film_flow_rate / force_mag;
double eff_pressure = 1.0 / (krn + krw); // effective pressure drop
fprintf(TIMELOG,
@@ -595,7 +596,7 @@ void SubPhase::Full() {
for (j = 0; j < Ny; j++) {
for (i = 0; i < Nx; i++) {
n = k * Nx * Ny + j * Nx + i;
if (!(Dm->id[n] > 0)) {
if (SDs(n) <= 0.0) {
// Solid phase
morph_n->id(i, j, k) = 1;
@@ -642,7 +643,7 @@ void SubPhase::Full() {
for (j = 0; j < Ny; j++) {
for (i = 0; i < Nx; i++) {
n = k * Nx * Ny + j * Nx + i;
if (!(Dm->id[n] > 0)) {
if (SDs(n) <= 0.0) {
// Solid phase
morph_w->id(i, j, k) = 1;
@@ -688,7 +689,7 @@ void SubPhase::Full() {
for (j = 0; j < Ny; j++) {
for (i = 0; i < Nx; i++) {
n = k * Nx * Ny + j * Nx + i;
if (!(Dm->id[n] > 0)) {
if (SDs(n) <= 0.0) {
// Solid phase
morph_i->id(i, j, k) = 1;
} else if (DelPhi(n) > 1e-4) {
@@ -731,7 +732,7 @@ void SubPhase::Full() {
for (i = imin; i < Nx - 1; i++) {
n = k * Nx * Ny + j * Nx + i;
// Compute volume averages
if (Dm->id[n] > 0) {
if (SDs(n) > 0.0) {
// compute density
double nA = Rho_n(n);
double nB = Rho_w(n);
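
The `SubPhase::Basic()` hunk above multiplies each relative-permeability estimate by the square of the porosity. A minimal sketch of that scaling follows, written as a free function for illustration only; the name `scaledPermeability` and its signature are assumptions and do not appear in LBPM.

```cpp
#include <cassert>

// Hypothetical helper (not part of LBPM). It mirrors the expression
//   h * h * nu * Porosity * Porosity * flow_rate / force_mag
// used for krn and krw in the hunk above.
double scaledPermeability(double h, double nu, double porosity,
                          double flow_rate, double force_mag) {
    assert(force_mag != 0.0); // the driving-force magnitude must be nonzero
    return h * h * nu * porosity * porosity * flow_rate / force_mag;
}
```

With this form, halving the porosity at fixed flow rate reduces the reported value by a factor of four, which is the effect of the `Porosity * Porosity` factor added in the diff.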

@@ -14,22 +14,6 @@
You should have received a copy of the GNU General Public License
along with OPM. If not, see <http://www.gnu.org/licenses/>.
*/
/*
Copyright 2013--2018 James E. McClure, Virginia Polytechnic & State University
Copyright Equnior ASA
This file is part of the Open Porous Media project (OPM).
OPM is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
OPM is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with OPM. If not, see <http://www.gnu.org/licenses/>.
*/
#include "analysis/TwoPhase.h"
#include "analysis/pmmc.h"
@@ -41,6 +25,8 @@
#include "IO/MeshDatabase.h"
#include "IO/Reader.h"
#include "IO/Writer.h"
#include "analysis/filters.h"
#include <memory>
@@ -375,6 +361,7 @@ void TwoPhase::Initialize() {
wwndnw = 0.0;
wwnsdnwn = 0.0;
Jwnwwndnw = 0.0;
Xwn = Xns = Xws = 0.0;
}
void TwoPhase::SetParams(double rhoA, double rhoB, double tauA, double tauB,
@@ -430,28 +417,38 @@ void TwoPhase::UpdateSolid() {
void TwoPhase::UpdateMeshValues() {
int i, j, k, n;
fillHalo<double> fillData(Dm->Comm, Dm->rank_info, {Nx-2,Ny-2,Nz-2}, {1, 1, 1}, 0, 1);
//...........................................................................
Dm->CommunicateMeshHalo(SDn);
//Dm->CommunicateMeshHalo(SDn);
fillData.fill(SDn);
//...........................................................................
// Compute the gradients of the phase indicator and signed distance fields
pmmc_MeshGradient(SDn, SDn_x, SDn_y, SDn_z, Nx, Ny, Nz);
//...........................................................................
// Gradient of the phase indicator field
fillData.fill(SDn_x);
fillData.fill(SDn_y);
fillData.fill(SDn_z);
fillData.fill(SDs);
//...........................................................................
Dm->CommunicateMeshHalo(SDn_x);
//Dm->CommunicateMeshHalo(SDn_x);
//...........................................................................
Dm->CommunicateMeshHalo(SDn_y);
//Dm->CommunicateMeshHalo(SDn_y);
//...........................................................................
Dm->CommunicateMeshHalo(SDn_z);
//Dm->CommunicateMeshHalo(SDn_z);
//...........................................................................
Dm->CommunicateMeshHalo(SDs);
//Dm->CommunicateMeshHalo(SDs);
pmmc_MeshGradient(SDs, SDs_x, SDs_y, SDs_z, Nx, Ny, Nz);
//...........................................................................
Dm->CommunicateMeshHalo(SDs_x);
fillData.fill(SDs_x);
fillData.fill(SDs_y);
fillData.fill(SDs_z);
//Dm->CommunicateMeshHalo(SDs_x);
//...........................................................................
Dm->CommunicateMeshHalo(SDs_y);
//Dm->CommunicateMeshHalo(SDs_y);
//...........................................................................
Dm->CommunicateMeshHalo(SDs_z);
//Dm->CommunicateMeshHalo(SDs_z);
//...........................................................................
// Compute the mesh curvature of the phase indicator field
pmmc_MeshCurvature(SDn, MeanCurvature, GaussCurvature, Nx, Ny, Nz);
@@ -579,6 +576,7 @@ void TwoPhase::ComputeLocal() {
Kwn += pmmc_CubeSurfaceInterpValue(
CubeValues, GaussCurvature, nw_pts, nw_tris, Values, i,
j, k, n_nw_pts, n_nw_tris);
Jwn += pmmc_CubeSurfaceInterpValue(
CubeValues, MeanCurvature, nw_pts, nw_tris, Values, i,
j, k, n_nw_pts, n_nw_tris);
@@ -609,7 +607,7 @@ void TwoPhase::ComputeLocal() {
efawns += pmmc_CubeContactAngle(
CubeValues, Values, SDn_x, SDn_y, SDn_z, SDs_x, SDs_y,
SDs_z, local_nws_pts, i, j, k, n_local_nws_pts);
wwnsdnwn += pmmc_CommonCurveSpeed(
CubeValues, dPdt, vawns, SDn_x, SDn_y, SDn_z, SDs_x,
SDs_y, SDs_z, local_nws_pts, i, j, k, n_local_nws_pts);
@@ -715,6 +713,217 @@ void TwoPhase::ComputeLocal() {
nonwet_morph->ComputeScalar(phase_distance, 0.f);
//printf("rank=%i completed \n",Dm->rank());
}
void TwoPhase::ComputeStatic() {
int i, j, k, n, imin, jmin, kmin, kmax;
int cube[8][3] = {{0, 0, 0}, {1, 0, 0}, {0, 1, 0}, {1, 1, 0},
{0, 0, 1}, {1, 0, 1}, {0, 1, 1}, {1, 1, 1}};
kmin = 1;
kmax = Nz - 1;
imin = jmin = 1;
/* set fluid isovalue to "grow" NWP for contact angle measurement */
fluid_isovalue = -1.0;
string FILENAME = "ContactAngle";
char LocalRankString[8];
char LocalRankFilename[40];
sprintf(LocalRankString, "%05d", Dm->rank());
sprintf(LocalRankFilename, "%s%s%s", "ContactAngle.", LocalRankString,".csv");
FILE *ANGLES = fopen(LocalRankFilename, "a+");
fprintf(ANGLES,"x y z angle\n");
for (k = kmin; k < kmax; k++) {
for (j = jmin; j < Ny - 1; j++) {
for (i = imin; i < Nx - 1; i++) {
//...........................................................................
n_nw_pts = n_ns_pts = n_ws_pts = n_nws_pts = n_local_sol_pts =
n_local_nws_pts = 0;
n_nw_tris = n_ns_tris = n_ws_tris = n_nws_seg =
n_local_sol_tris = 0;
//...........................................................................
// Compute volume averages
for (int p = 0; p < 8; p++) {
n = i + cube[p][0] + (j + cube[p][1]) * Nx +
(k + cube[p][2]) * Nx * Ny;
if (Dm->id[n] > 0) {
// 1-D index for this cube corner
// compute the norm of the gradient of the phase indicator field
// Compute the non-wetting phase volume contribution
if (Phase(i + cube[p][0], j + cube[p][1],
k + cube[p][2]) > 0) {
nwp_volume += 0.125;
} else {
wp_volume += 0.125;
}
}
}
//...........................................................................
// Construct the interfaces and common curve
pmmc_ConstructLocalCube(
SDs, SDn, solid_isovalue, fluid_isovalue, nw_pts, nw_tris,
Values, ns_pts, ns_tris, ws_pts, ws_tris, local_nws_pts,
nws_pts, nws_seg, local_sol_pts, local_sol_tris,
n_local_sol_tris, n_local_sol_pts, n_nw_pts, n_nw_tris,
n_ws_pts, n_ws_tris, n_ns_tris, n_ns_pts, n_local_nws_pts,
n_nws_pts, n_nws_seg, i, j, k, Nx, Ny, Nz);
// wn interface averages
if (n_nw_pts > 0) {
awn += pmmc_CubeSurfaceOrientation(Gwn, nw_pts, nw_tris,
n_nw_tris);
Kwn += pmmc_CubeSurfaceInterpValue(
CubeValues, GaussCurvature, nw_pts, nw_tris, Values, i,
j, k, n_nw_pts, n_nw_tris);
Jwn += pmmc_CubeSurfaceInterpValue(
CubeValues, MeanCurvature, nw_pts, nw_tris, Values, i,
j, k, n_nw_pts, n_nw_tris);
Xwn += geomavg_EulerCharacteristic(nw_pts, nw_tris, n_nw_pts,
n_nw_tris, i, j, k);
// Integrate the trimmed mean curvature (hard-coded to use a distance of 4 pixels)
pmmc_CubeTrimSurfaceInterpValues(
CubeValues, MeanCurvature, SDs, nw_pts, nw_tris, Values,
DistanceValues, i, j, k, n_nw_pts, n_nw_tris, trimdist,
dummy, trJwn);
pmmc_CubeTrimSurfaceInterpInverseValues(
CubeValues, MeanCurvature, SDs, nw_pts, nw_tris, Values,
DistanceValues, i, j, k, n_nw_pts, n_nw_tris, trimdist,
dummy, trRwn);
}
// wns common curve averages
if (n_local_nws_pts > 0) {
efawns += pmmc_CubeContactAngle(
CubeValues, Values, SDn_x, SDn_y, SDn_z, SDs_x, SDs_y,
SDs_z, local_nws_pts, i, j, k, n_local_nws_pts);
for (int p = 0; p < n_local_nws_pts; p++) {
// Extract the line segment
Point A = local_nws_pts(p);
double value = Values(p);
fprintf(ANGLES, "%.8g %.8g %.8g %.8g\n", A.x, A.y, A.z, value);
}
pmmc_CurveCurvature(SDn, SDs, SDn_x, SDn_y, SDn_z, SDs_x,
SDs_y, SDs_z, KNwns_values,
KGwns_values, KNwns, KGwns, nws_pts,
n_nws_pts, i, j, k);
lwns +=
pmmc_CubeCurveLength(local_nws_pts, n_local_nws_pts);
/* half contribution for vertices / edges at the common line
* each cube with contact line has a net of undercounting vertices
* each cube is undercounting edges due to internal counts
*/
Xwn += 0.25*n_local_nws_pts - 0.5;
Xws += 0.25*n_local_nws_pts - 0.5;
Xns += 0.25*n_local_nws_pts - 0.5;
}
// Solid interface averages
if (n_local_sol_tris > 0) {
As += pmmc_CubeSurfaceArea(local_sol_pts, local_sol_tris,
n_local_sol_tris);
// Compute the surface orientation and the interfacial area
ans += pmmc_CubeSurfaceOrientation(Gns, ns_pts, ns_tris,
n_ns_tris);
aws += pmmc_CubeSurfaceOrientation(Gws, ws_pts, ws_tris,
n_ws_tris);
Xws += geomavg_EulerCharacteristic(ws_pts, ws_tris, n_ws_pts,
n_ws_tris, i, j, k);
Xns += geomavg_EulerCharacteristic(ns_pts, ns_tris, n_ns_pts,
n_ns_tris, i, j, k);
}
//...........................................................................
// Compute the integral curvature of the non-wetting phase
n_nw_pts = n_nw_tris = 0;
// Compute the non-wetting phase surface and associated area
An +=
geomavg_MarchingCubes(SDn, fluid_isovalue, i, j, k, nw_pts,
n_nw_pts, nw_tris, n_nw_tris);
// Compute the integral of mean curvature
if (n_nw_pts > 0) {
pmmc_CubeTrimSurfaceInterpValues(
CubeValues, MeanCurvature, SDs, nw_pts, nw_tris, Values,
DistanceValues, i, j, k, n_nw_pts, n_nw_tris, trimdist,
trawn, dummy);
}
Jn += pmmc_CubeSurfaceInterpValue(CubeValues, MeanCurvature,
nw_pts, nw_tris, Values, i, j,
k, n_nw_pts, n_nw_tris);
// Compute Euler characteristic from integral of gaussian curvature
Kn += pmmc_CubeSurfaceInterpValue(CubeValues, GaussCurvature,
nw_pts, nw_tris, Values, i, j,
k, n_nw_pts, n_nw_tris);
euler += geomavg_EulerCharacteristic(nw_pts, nw_tris, n_nw_pts,
n_nw_tris, i, j, k);
}
}
}
fclose(ANGLES);
Array<char> phase_label(Nx, Ny, Nz);
Array<double> phase_distance(Nx, Ny, Nz);
// Analyze the wetting fluid
for (k = 0; k < Nz; k++) {
for (j = 0; j < Ny; j++) {
for (i = 0; i < Nx; i++) {
n = k * Nx * Ny + j * Nx + i;
if (!(Dm->id[n] > 0)) {
// Solid phase
phase_label(i, j, k) = 1;
} else if (SDn(i, j, k) < 0.0) {
// wetting phase
phase_label(i, j, k) = 0;
} else {
// non-wetting phase
phase_label(i, j, k) = 1;
}
phase_distance(i, j, k) =
2.0 * double(phase_label(i, j, k)) - 1.0;
}
}
}
CalcDist(phase_distance, phase_label, *Dm);
wet_morph->ComputeScalar(phase_distance, 0.f);
//printf("generating distance at rank=%i \n",Dm->rank());
// Analyze the non-wetting fluid
for (k = 0; k < Nz; k++) {
for (j = 0; j < Ny; j++) {
for (i = 0; i < Nx; i++) {
n = k * Nx * Ny + j * Nx + i;
if (!(Dm->id[n] > 0)) {
// Solid phase
phase_label(i, j, k) = 1;
} else if (SDn(i, j, k) < 0.0) {
// wetting phase
phase_label(i, j, k) = 1;
} else {
// non-wetting phase
phase_label(i, j, k) = 0;
}
phase_distance(i, j, k) =
2.0 * double(phase_label(i, j, k)) - 1.0;
}
}
}
CalcDist(phase_distance, phase_label, *Dm);
nonwet_morph->ComputeScalar(phase_distance, 0.f);
}
void TwoPhase::AssignComponentLabels() {
//int LabelNWP=1;
@@ -1219,7 +1428,7 @@ void TwoPhase::ComponentAverages() {
void TwoPhase::Reduce() {
int i;
double iVol_global = 1.0 / Volume;
//double iVol_global = 1.0 / Volume;
//...........................................................................
Dm->Comm.barrier();
nwp_volume_global = Dm->Comm.sumReduce(nwp_volume);
@@ -1258,10 +1467,14 @@ void TwoPhase::Reduce() {
trawn_global = Dm->Comm.sumReduce(trawn);
trJwn_global = Dm->Comm.sumReduce(trJwn);
trRwn_global = Dm->Comm.sumReduce(trRwn);
euler_global = Dm->Comm.sumReduce(euler);
Xwn_global = Dm->Comm.sumReduce(Xwn);
Xws_global = Dm->Comm.sumReduce(Xws);
Xns_global = Dm->Comm.sumReduce(Xns);
An_global = Dm->Comm.sumReduce(An);
Jn_global = Dm->Comm.sumReduce(Jn);
Kn_global = Dm->Comm.sumReduce(Kn);
euler_global = Dm->Comm.sumReduce(euler);
Dm->Comm.barrier();
// Normalize the phase averages
@@ -1285,17 +1498,18 @@ void TwoPhase::Reduce() {
// Normalize surface averages by the interfacial area
if (awn_global > 0.0) {
Jwn_global /= awn_global;
Kwn_global /= awn_global;
//Kwn_global /= awn_global;
wwndnw_global /= awn_global;
for (i = 0; i < 3; i++)
vawn_global(i) /= awn_global;
for (i = 0; i < 6; i++)
Gwn_global(i) /= awn_global;
}
if (lwns_global > 0.0) {
efawns_global /= lwns_global;
KNwns_global /= lwns_global;
KGwns_global /= lwns_global;
//KNwns_global /= lwns_global;
//KGwns_global /= lwns_global;
for (i = 0; i < 3; i++)
vawns_global(i) /= lwns_global;
}
@@ -1314,15 +1528,17 @@ void TwoPhase::Reduce() {
Gws_global(i) /= aws_global;
euler_global /= (2 * PI);
//sat_w = 1.0 - nwp_volume_global*iVol_global/porosity;
sat_w = 1.0 - nwp_volume_global / (nwp_volume_global + wp_volume_global);
// Compute the specific interfacial areas and common line length (dimensionless per unit volume)
/*
awn_global = awn_global * iVol_global;
ans_global = ans_global * iVol_global;
aws_global = aws_global * iVol_global;
dEs = dEs * iVol_global;
lwns_global = lwns_global * iVol_global;
*/
}
void TwoPhase::NonDimensionalize(double D, double viscosity, double IFT) {
@@ -1334,6 +1550,55 @@ void TwoPhase::NonDimensionalize(double D, double viscosity, double IFT) {
lwns_global *= D * D;
}
void TwoPhase::PrintStatic() {
if (Dm->rank() == 0) {
FILE *STATIC;
STATIC = fopen("geometry.csv", "a+");
if (fseek(STATIC, 0, SEEK_SET) == fseek(STATIC, 0, SEEK_CUR)) {
// If geometry.csv is empty, write a short header to list the averages
fprintf(STATIC, "sw awn ans aws Jwn Kwn lwns cwns KGws "
"KGwn Xwn Xns Xws "); // Scalar averages
fprintf(STATIC,
"Gwnxx Gwnyy Gwnzz Gwnxy Gwnxz Gwnyz "); // Orientation tensors
fprintf(STATIC, "Gnsxx Gnsyy Gnszz Gnsxy Gnsxz Gnsyz ");
fprintf(STATIC, "Gwsxx Gwsyy Gwszz Gwsxy Gwsxz Gwsyz ");
fprintf(STATIC, "trawn trJwn trRwn "); // trimmed curvature
fprintf(STATIC, "Vw Aw Jw Xw "); // Minkowski measures
fprintf(STATIC, "Vn An Jn Xn\n"); // Minkowski measures
//fprintf(STATIC,"Euler Kn2 Jn2 An2\n"); // Minkowski measures
}
fprintf(STATIC, "%.5g ", sat_w); // saturation
fprintf(STATIC, "%.5g %.5g %.5g ", awn_global, ans_global,
aws_global); // interfacial areas
fprintf(STATIC, "%.5g %.5g ", Jwn_global,
Kwn_global); // curvature of wn interface
fprintf(STATIC, "%.5g ", lwns_global); // common curve length
fprintf(STATIC, "%.5g ", efawns_global); // average contact angle
fprintf(STATIC, "%.5g %.5g ", KNwns_global,
KGwns_global); // total curvature contributions of common line
fprintf(STATIC, "%.5g %.5g %.5g ", Xwn_global, Xns_global,
Xws_global); // Euler characteristic
fprintf(STATIC, "%.5g %.5g %.5g %.5g %.5g %.5g ", Gwn_global(0),
Gwn_global(1), Gwn_global(2), Gwn_global(3), Gwn_global(4),
Gwn_global(5)); // orientation of wn interface
fprintf(STATIC, "%.5g %.5g %.5g %.5g %.5g %.5g ", Gns_global(0),
Gns_global(1), Gns_global(2), Gns_global(3), Gns_global(4),
Gns_global(5)); // orientation of ns interface
fprintf(STATIC, "%.5g %.5g %.5g %.5g %.5g %.5g ", Gws_global(0),
Gws_global(1), Gws_global(2), Gws_global(3), Gws_global(4),
Gws_global(5)); // orientation of ws interface
fprintf(STATIC, "%.5g %.5g %.5g ", trawn_global, trJwn_global,
trRwn_global); // Trimmed curvature
fprintf(STATIC, "%.5g %.5g %.5g %.5g ", wet_morph->V(), wet_morph->A(),
wet_morph->H(), wet_morph->X());
fprintf(STATIC, "%.5g %.5g %.5g %.5g\n", nonwet_morph->V(),
nonwet_morph->A(), nonwet_morph->H(), nonwet_morph->X());
//fprintf(STATIC,"%.5g %.5g %.5g %.5g\n",euler_global, Kn_global, Jn_global, An_global); // minkowski measures
fclose(STATIC);
}
}
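
`PrintStatic()` above decides whether to write the header by comparing the return values of two `fseek` calls. Since `fseek` returns 0 on success, that comparison holds whenever both calls succeed, so it may not distinguish an empty file from one that already has rows. One possible alternative, sketched here under the assumption that the header should appear only once per file, reads the end-of-file offset with `ftell`. The helper name `appendRow` is hypothetical and not part of LBPM.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper (not part of LBPM): append one row to a text file and
// write the header first only when the file is still empty.
// Returns true if the header was written.
bool appendRow(const char *path, const char *header, const char *row) {
    FILE *fp = std::fopen(path, "a+");
    assert(fp != nullptr);
    // Seek to the end and read the offset: 0 means nothing has been written.
    std::fseek(fp, 0, SEEK_END);
    const bool empty = (std::ftell(fp) == 0);
    if (empty)
        std::fprintf(fp, "%s\n", header);
    std::fprintf(fp, "%s\n", row);
    std::fclose(fp);
    return empty;
}
```

Calling `appendRow` twice on a fresh file writes the header on the first call only; later calls append data rows.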
void TwoPhase::PrintAll(int timestep) {
if (Dm->rank() == 0) {
fprintf(TIMELOG, "%i %.5g %.5g %.5g %.5g %.5g %.5g %.5g %.5g ",

@@ -103,7 +103,9 @@ public:
double lwns_global;
double efawns, efawns_global; // averaged contact angle
double euler, Kn, Jn, An;
double Xwn, Xns, Xws;
double euler_global, Kn_global, Jn_global, An_global;
double Xwn_global, Xns_global, Xws_global;
double rho_n, rho_w;
double nu_n, nu_w;
@@ -184,6 +186,8 @@ public:
void ColorToSignedDistance(double Beta, DoubleArray &ColorData,
DoubleArray &DistData);
void ComputeLocal();
void ComputeStatic();
void PrintStatic();
void AssignComponentLabels();
void ComponentAverages();
void Reduce();

@@ -137,8 +137,7 @@ void Morphology::Initialize(std::shared_ptr<Domain> Dm, DoubleArray &Distance) {
morphRadius.resize(recvLoc);
//..............................
/* send the morphological radius */
Dm->Comm.Irecv(&morphRadius[recvOffset_x], recvCount, Dm->rank_x(),
recvtag + 0);
Dm->Comm.Irecv(&morphRadius[recvOffset_x], recvCount, Dm->rank_x(), recvtag + 0);
Dm->Comm.send(&tmpDistance[0], sendCount, Dm->rank_X(), sendtag + 0);
/* send the shift values */
Dm->Comm.Irecv(&xShift[recvOffset_x], recvCount, Dm->rank_x(), recvtag + 1);
@@ -502,7 +501,7 @@ double MorphOpen(DoubleArray &SignDist, signed char *id,
if (rank == 0)
printf("Maximum pore size: %f \n", maxdistGlobal);
final_void_fraction = volume_fraction; //initialize
int ii, jj, kk;
int imin, jmin, kmin, imax, jmax, kmax;
int Nx = nx;
@@ -524,27 +523,28 @@ double MorphOpen(DoubleArray &SignDist, signed char *id,
int numTry = 0;
int maxTry = 100;
while (void_fraction_new > VoidFraction && numTry < maxTry) {
while ( !(void_fraction_new < VoidFraction) && numTry < maxTry) {
numTry++;
void_fraction_diff_old = void_fraction_diff_new;
void_fraction_old = void_fraction_new;
Rcrit_old = Rcrit_new;
Rcrit_new -= deltaR * Rcrit_old;
if (rank==0) printf("Try %i with radius %f \n", numTry, Rcrit_new);
if (Rcrit_new < 0.5) {
numTry = maxTry;
}
int Window = round(Rcrit_new);
if (Window == 0)
Window =
1; // If Window = 0 at the beginning, the following process yields sw=1.0
Window = 1; // If Window = 0 at the beginning, the following process yields sw=1.0
// and the sw<Sw test fails immediately
double LocalNumber = 0.f;
for (int k = 0; k < Nz; k++) {
for (int j = 0; j < Ny; j++) {
for (int i = 0; i < Nx; i++) {
for (int k = 1; k < Nz-1; k++) {
for (int j = 1; j < Ny-1; j++) {
for (int i = 1; i < Nx-1; i++) {
n = k * nx * ny + j * nx + i;
if (SignDist(i, j, k) > Rcrit_new) {
// loop over the window and update
//printf("Distance(%i %i %i) = %f \n",i,j,k, SignDist(i,j,k));
imin = max(1, i - Window);
jmin = max(1, j - Window);
kmin = max(1, k - Window);
@@ -571,7 +571,7 @@ double MorphOpen(DoubleArray &SignDist, signed char *id,
}
}
}
LocalNumber += Structure.GetOverlaps(Dm, id, ErodeLabel, NewLabel);
//LocalNumber += Structure.GetOverlaps(Dm, id, ErodeLabel, NewLabel);
count = 0.f;
for (int k = 1; k < Nz - 1; k++) {
@@ -611,7 +611,7 @@ double MorphOpen(DoubleArray &SignDist, signed char *id,
//***************************************************************************************
double MorphDrain(DoubleArray &SignDist, signed char *id,
std::shared_ptr<Domain> Dm, double VoidFraction) {
std::shared_ptr<Domain> Dm, double VoidFraction, double InitialRadius) {
// SignDist is the distance to the object that constrains the morphological opening
// VoidFraction is the empty space where the object isn't
// id is a labeled map
@@ -688,6 +688,11 @@ double MorphDrain(DoubleArray &SignDist, signed char *id,
double deltaR = 0.05; // amount to change the radius in voxel units
double Rcrit_old = maxdistGlobal;
double Rcrit_new = maxdistGlobal;
if (InitialRadius < maxdistGlobal){
Rcrit_old = InitialRadius;
Rcrit_new = InitialRadius;
}
//if (argc>2){
// Rcrit_new = strtod(argv[2],NULL);
// if (rank==0) printf("Max. distance =%f, Initial critical radius = %f \n",maxdistGlobal,Rcrit_new);
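
The `MorphOpen` loop earlier in this file's diff shrinks the critical radius by `deltaR * Rcrit_old` on every try and forces the loop to end once the radius drops below 0.5. A minimal sketch of that radius schedule alone follows; it deliberately omits the void-fraction test, and the function name `triesUntilRadiusFloor` is an assumption made for illustration.

```cpp
#include <cassert>

// Hypothetical helper (not part of LBPM): count how many tries the radius
// schedule R <- R - deltaR * R performs before R falls below 0.5 or the
// try limit is reached. The void-fraction stopping test is not modeled.
int triesUntilRadiusFloor(double R0, double deltaR, int maxTry) {
    assert(deltaR > 0.0 && deltaR < 1.0);
    int numTry = 0;
    double R = R0;
    while (numTry < maxTry) {
        numTry++;
        R -= deltaR * R; // geometric shrink, as in Rcrit_new -= deltaR * Rcrit_old
        if (R < 0.5)
            break; // the source sets numTry = maxTry here to leave the loop
    }
    return numTry;
}
```

Because the shrink is proportional to the current radius, a smaller starting radius (such as the `InitialRadius` argument added to `MorphDrain`) reduces the number of tries needed to reach small radii.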

@@ -7,7 +7,7 @@ double MorphOpen(DoubleArray &SignDist, signed char *id,
std::shared_ptr<Domain> Dm, double VoidFraction,
signed char ErodeLabel, signed char ReplaceLabel);
double MorphDrain(DoubleArray &SignDist, signed char *id,
std::shared_ptr<Domain> Dm, double VoidFraction);
std::shared_ptr<Domain> Dm, double VoidFraction, double InitialRadius);
double MorphGrow(DoubleArray &BoundaryDist, DoubleArray &Dist, Array<char> &id,
std::shared_ptr<Domain> Dm, double TargetVol,
double WallFactor);

@@ -4057,7 +4057,7 @@ inline double pmmc_CubeContactAngle(DoubleArray &CubeValues,
(A.z - B.z) * (A.z - B.z));
integral += 0.5 * length * (vA + vB);
}
return integral;
}
//--------------------------------------------------------------------------------------------------------
@@ -4420,7 +4420,9 @@ inline void pmmc_CurveCurvature(DoubleArray &f, DoubleArray &s,
double twnsx, twnsy, twnsz, nwnsx, nwnsy, nwnsz,
K; // tangent,normal and curvature
double nsx, nsy, nsz, norm;
double nwx, nwy, nwz;
double nwsx, nwsy, nwsz;
double nwnx, nwny, nwnz;
Point P, A, B;
// Local trilinear approximation for tangent and normal vector
@@ -4430,47 +4432,18 @@ inline void pmmc_CurveCurvature(DoubleArray &f, DoubleArray &s,
for (k = kc; k < kc + 2; k++) {
for (j = jc; j < jc + 2; j++) {
for (i = ic; i < ic + 2; i++) {
// Compute all of the derivatives using finite differences
// fluid phase indicator field
// fx = 0.5*(f(i+1,j,k) - f(i-1,j,k));
// fy = 0.5*(f(i,j+1,k) - f(i,j-1,k));
// fz = 0.5*(f(i,j,k+1) - f(i,j,k-1));
/*fxx = f(i+1,j,k) - 2.0*f(i,j,k) + f(i-1,j,k);
fyy = f(i,j+1,k) - 2.0*f(i,j,k) + f(i,j-1,k);
fzz = f(i,j,k+1) - 2.0*f(i,j,k) + f(i,j,k-1);
fxy = 0.25*(f(i+1,j+1,k) - f(i+1,j-1,k) - f(i-1,j+1,k) + f(i-1,j-1,k));
fxz = 0.25*(f(i+1,j,k+1) - f(i+1,j,k-1) - f(i-1,j,k+1) + f(i-1,j,k-1));
fyz = 0.25*(f(i,j+1,k+1) - f(i,j+1,k-1) - f(i,j-1,k+1) + f(i,j-1,k-1));
*/
// solid distance function
// sx = 0.5*(s(i+1,j,k) - s(i-1,j,k));
// sy = 0.5*(s(i,j+1,k) - s(i,j-1,k));
// sz = 0.5*(s(i,j,k+1) - s(i,j,k-1));
/* sxx = s(i+1,j,k) - 2.0*s(i,j,k) + s(i-1,j,k);
syy = s(i,j+1,k) - 2.0*s(i,j,k) + s(i,j-1,k);
szz = s(i,j,k+1) - 2.0*s(i,j,k) + s(i,j,k-1);
sxy = 0.25*(s(i+1,j+1,k) - s(i+1,j-1,k) - s(i-1,j+1,k) + s(i-1,j-1,k));
sxz = 0.25*(s(i+1,j,k+1) - s(i+1,j,k-1) - s(i-1,j,k+1) + s(i-1,j,k-1));
syz = 0.25*(s(i,j+1,k+1) - s(i,j+1,k-1) - s(i,j-1,k+1) + s(i,j-1,k-1));
*/
/* // Compute the Jacobean matrix for tangent vector
Axx = sxy*fz + sy*fxz - sxz*fy - sz*fxy;
Axy = sxz*fx + sz*fxx - sxx*fz - sx*fxz;
Axz = sxx*fy + sx*fxy - sxy*fx - sy*fxx;
Ayx = syy*fz + sy*fyz - syz*fy - sz*fyy;
Ayy = syz*fx + sz*fxy - sxy*fz - sx*fyz;
Ayz = sxy*fy + sx*fyy - syy*fx - sy*fxy;
Azx = syz*fz + sy*fzz - szz*fy - sz*fyz;
Azy = szz*fx + sz*fxz - sxz*fz - sx*fzz;
Azz = sxz*fy + sx*fyz - syz*fx - sy*fxz;
*/
sx = s_x(i, j, k);
sy = s_y(i, j, k);
sz = s_z(i, j, k);
fx = f_x(i, j, k);
fy = f_y(i, j, k);
fz = f_z(i, j, k);
// Normal to fluid surface
Nx.Corners(i - ic, j - jc, k - kc) = fx;
Ny.Corners(i - ic, j - jc, k - kc) = fy;
Nz.Corners(i - ic, j - jc, k - kc) = fz;
// Normal to solid surface
Sx.Corners(i - ic, j - jc, k - kc) = sx;
Sy.Corners(i - ic, j - jc, k - kc) = sy;
@@ -4577,8 +4550,19 @@ inline void pmmc_CurveCurvature(DoubleArray &f, DoubleArray &s,
nsx /= norm;
nsy /= norm;
nsz /= norm;
// Normal vector to the fluid surface
nwx = Nx.eval(P);
nwy = Ny.eval(P);
nwz = Nz.eval(P);
norm = sqrt(nwx * nwx + nwy * nwy + nwz * nwz);
if (norm == 0.0)
norm = 1.0;
nwx /= norm;
nwy /= norm;
nwz /= norm;
// normal in the surface tangent plane (rel. geodesic curvature)
// common curve normal in the solid surface tangent plane (rel. geodesic curvature)
nwsx = twnsy * nsz - twnsz * nsy;
nwsy = twnsz * nsx - twnsx * nsz;
nwsz = twnsx * nsy - twnsy * nsx;
@@ -4588,16 +4572,35 @@ inline void pmmc_CurveCurvature(DoubleArray &f, DoubleArray &s,
nwsx /= norm;
nwsy /= norm;
nwsz /= norm;
if (nsx * nwnsx + nsy * nwnsy + nsz * nwnsz < 0.0) {
nwnsx = -nwnsx;
nwnsy = -nwnsy;
nwnsz = -nwnsz;
/* normal to ws interface boundary should point into fluid (same direction as gradient) */
if (nwx * nwsx + nwy * nwsy + nwz * nwsz < 0.0) {
nwsx = -nwsx;
nwsy = -nwsy;
nwsz = -nwsz;
}
// common curve normal in the fluid surface tangent plane (rel. geodesic curvature)
nwnx = twnsy * nwz - twnsz * nwy;
nwny = twnsz * nwx - twnsx * nwz;
nwnz = twnsx * nwy - twnsy * nwx;
norm = sqrt(nwnx * nwnx + nwny * nwny + nwnz * nwnz);
if (norm == 0.0)
norm = 1.0;
nwnx /= norm;
nwny /= norm;
nwnz /= norm;
/* normal to wn interface boundary should point into the solid */
if (nsx * nwnx + nsy * nwny + nsz * nwnz > 0.0) {
nwnx = -nwnx;
nwny = -nwny;
nwnz = -nwnz;
}
if (length > 0.0) {
// normal curvature component in the direction of the solid surface
KNavg += K * (nsx * nwnsx + nsy * nwnsy + nsz * nwnsz) * length;
//KNavg += K * (nsx * nwnsx + nsy * nwnsy + nsz * nwnsz) * length;
KNavg += K * (nwnx * nwnsx + nwny * nwnsy + nwnz * nwnsz) * length;
//geodesic curvature
KGavg += K * (nwsx * nwnsx + nwsy * nwnsy + nwsz * nwnsz) * length;
}
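
The `pmmc_CurveCurvature` hunks above build two in-plane normals to the common curve: each is the cross product of the curve tangent with a surface normal, normalized with a zero-norm guard, and then flipped according to the sign of a dot product with a reference vector. A minimal sketch of that pattern follows; the `Vec3` type and the names `dot` and `inPlaneNormal` are assumptions for illustration and are not LBPM identifiers.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 {
    double x, y, z;
};

double dot(const Vec3 &a, const Vec3 &b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Hypothetical helper (not part of LBPM).
// t: curve tangent, n: surface normal, ref: reference direction.
// pointAlong = true  -> flip the result when dot(ref, result) < 0
// pointAlong = false -> flip the result when dot(ref, result) > 0
Vec3 inPlaneNormal(const Vec3 &t, const Vec3 &n, const Vec3 &ref,
                   bool pointAlong) {
    // Cross product t x n lies in the tangent plane of the surface
    Vec3 r{t.y * n.z - t.z * n.y, t.z * n.x - t.x * n.z,
           t.x * n.y - t.y * n.x};
    double norm = std::sqrt(dot(r, r));
    assert(std::isfinite(norm));
    if (norm == 0.0)
        norm = 1.0; // same zero-norm guard as in the hunk
    r.x /= norm;
    r.y /= norm;
    r.z /= norm;
    const double d = dot(ref, r);
    if ((pointAlong && d < 0.0) || (!pointAlong && d > 0.0)) {
        r.x = -r.x;
        r.y = -r.y;
        r.z = -r.z;
    }
    return r;
}
```

In the terms of the diff, `nws` corresponds to `pointAlong = true` with the fluid normal as reference, and `nwn` corresponds to `pointAlong = false` with the solid normal as reference.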

@@ -193,6 +193,12 @@ MACRO( FIND_FILES )
# Find the CUDA sources
SET( T_CUDASOURCES "" )
FILE( GLOB T_CUDASOURCES "*.cu" )
# Find the HIP sources
SET( T_HIPSOURCES "" )
FILE( GLOB T_HIPSOURCES "*.hip" )
# Find the SYCL sources
SET( T_SYCLSOURCES "" )
FILE( GLOB T_SYCLSOURCES "*.dp.cpp" )
# Find the C sources
SET( T_CSOURCES "" )
FILE( GLOB T_CSOURCES "*.c" )
@@ -212,10 +218,12 @@ MACRO( FIND_FILES )
SET( HEADERS ${HEADERS} ${T_HEADERS} )
SET( CXXSOURCES ${CXXSOURCES} ${T_CXXSOURCES} )
SET( CUDASOURCES ${CUDASOURCES} ${T_CUDASOURCES} )
SET( HIPSOURCES ${HIPSOURCES} ${T_HIPSOURCES} )
SET( SYCLSOURCES ${SYCLSOURCES} ${T_SYCLSOURCES} )
SET( CSOURCES ${CSOURCES} ${T_CSOURCES} )
SET( FSOURCES ${FSOURCES} ${T_FSOURCES} )
SET( M4FSOURCES ${M4FSOURCES} ${T_M4FSOURCES} )
SET( SOURCES ${SOURCES} ${T_CXXSOURCES} ${T_CSOURCES} ${T_FSOURCES} ${T_M4FSOURCES} ${CUDASOURCES} )
SET( SOURCES ${SOURCES} ${T_CXXSOURCES} ${T_CSOURCES} ${T_FSOURCES} ${T_M4FSOURCES} ${CUDASOURCES} ${HIPSOURCES} ${SYCLSOURCES})
ENDMACRO()
@@ -227,6 +235,12 @@ MACRO( FIND_FILES_PATH IN_PATH )
# Find the CUDA sources
SET( T_CUDASOURCES "" )
FILE( GLOB T_CUDASOURCES "${IN_PATH}/*.cu" )
# Find the HIP sources
SET( T_HIPSOURCES "" )
FILE( GLOB T_HIPSOURCES "${IN_PATH}/*.hip" )
# Find the SYCL sources
SET( T_SYCLSOURCES "" )
FILE( GLOB T_SYCLSOURCES "${IN_PATH}/*.sycl" )
# Find the C sources
SET( T_CSOURCES "" )
FILE( GLOB T_CSOURCES "${IN_PATH}/*.c" )
@@ -246,9 +260,11 @@ MACRO( FIND_FILES_PATH IN_PATH )
SET( HEADERS ${HEADERS} ${T_HEADERS} )
SET( CXXSOURCES ${CXXSOURCES} ${T_CXXSOURCES} )
SET( CUDASOURCES ${CUDASOURCES} ${T_CUDASOURCES} )
SET( HIPSOURCES ${HIPSOURCES} ${T_HIPSOURCES} )
SET( SYCLSOURCES ${SYCLSOURCES} ${T_SYCLSOURCES} )
SET( CSOURCES ${CSOURCES} ${T_CSOURCES} )
SET( FSOURCES ${FSOURCES} ${T_FSOURCES} )
SET( SOURCES ${SOURCES} ${T_CXXSOURCES} ${T_CSOURCES} ${T_FSOURCES} ${CUDASOURCES} )
SET( SOURCES ${SOURCES} ${T_CXXSOURCES} ${T_CSOURCES} ${T_FSOURCES} ${CUDASOURCES} ${HIPSOURCES} ${SYCLSOURCES} )
ENDMACRO()

@@ -20,10 +20,12 @@
#include "common/ArraySize.h"
#include <array>
#include <cstdint>
#include <functional>
#include <initializer_list>
#include <iostream>
#include <memory>
#include <stdint.h>
#include <string>
#include <vector>

@@ -4,11 +4,13 @@
#include "common/Utilities.h"
#include <array>
#include <cstdint>
#include <cmath>
#include <complex>
#include <cstdlib>
#include <cstring>
#include <initializer_list>
#include <stdexcept>
#include <vector>
#if defined(__CUDA_ARCH__)


@@ -208,72 +208,68 @@ inline void CommunicateSendRecvCounts(
}
//***************************************************************************************
inline void CommunicateRecvLists(
const Utilities::MPI &comm, int sendtag, int recvtag, int *sendList_x,
int *sendList_y, int *sendList_z, int *sendList_X, int *sendList_Y,
int *sendList_Z, int *sendList_xy, int *sendList_XY, int *sendList_xY,
int *sendList_Xy, int *sendList_xz, int *sendList_XZ, int *sendList_xZ,
int *sendList_Xz, int *sendList_yz, int *sendList_YZ, int *sendList_yZ,
int *sendList_Yz, int sendCount_x, int sendCount_y, int sendCount_z,
int sendCount_X, int sendCount_Y, int sendCount_Z, int sendCount_xy,
int sendCount_XY, int sendCount_xY, int sendCount_Xy, int sendCount_xz,
int sendCount_XZ, int sendCount_xZ, int sendCount_Xz, int sendCount_yz,
int sendCount_YZ, int sendCount_yZ, int sendCount_Yz, int *recvList_x,
int *recvList_y, int *recvList_z, int *recvList_X, int *recvList_Y,
int *recvList_Z, int *recvList_xy, int *recvList_XY, int *recvList_xY,
int *recvList_Xy, int *recvList_xz, int *recvList_XZ, int *recvList_xZ,
int *recvList_Xz, int *recvList_yz, int *recvList_YZ, int *recvList_yZ,
int *recvList_Yz, int recvCount_x, int recvCount_y, int recvCount_z,
int recvCount_X, int recvCount_Y, int recvCount_Z, int recvCount_xy,
int recvCount_XY, int recvCount_xY, int recvCount_Xy, int recvCount_xz,
int recvCount_XZ, int recvCount_xZ, int recvCount_Xz, int recvCount_yz,
int recvCount_YZ, int recvCount_yZ, int recvCount_Yz, int rank_x,
int rank_y, int rank_z, int rank_X, int rank_Y, int rank_Z, int rank_xy,
int rank_XY, int rank_xY, int rank_Xy, int rank_xz, int rank_XZ,
int rank_xZ, int rank_Xz, int rank_yz, int rank_YZ, int rank_yZ,
int rank_Yz) {
MPI_Request req1[18], req2[18];
req1[0] = comm.Isend(sendList_x, sendCount_x, rank_x, sendtag);
req2[0] = comm.Irecv(recvList_X, recvCount_X, rank_X, recvtag);
req1[1] = comm.Isend(sendList_X, sendCount_X, rank_X, sendtag);
req2[1] = comm.Irecv(recvList_x, recvCount_x, rank_x, recvtag);
req1[2] = comm.Isend(sendList_y, sendCount_y, rank_y, sendtag);
req2[2] = comm.Irecv(recvList_Y, recvCount_Y, rank_Y, recvtag);
req1[3] = comm.Isend(sendList_Y, sendCount_Y, rank_Y, sendtag);
req2[3] = comm.Irecv(recvList_y, recvCount_y, rank_y, recvtag);
req1[4] = comm.Isend(sendList_z, sendCount_z, rank_z, sendtag);
req2[4] = comm.Irecv(recvList_Z, recvCount_Z, rank_Z, recvtag);
req1[5] = comm.Isend(sendList_Z, sendCount_Z, rank_Z, sendtag);
req2[5] = comm.Irecv(recvList_z, recvCount_z, rank_z, recvtag);
inline void CommunicateRecvLists( const Utilities::MPI& comm, int sendtag, int recvtag,
int *sendList_x, int *sendList_y, int *sendList_z, int *sendList_X, int *sendList_Y, int *sendList_Z,
int *sendList_xy, int *sendList_XY, int *sendList_xY, int *sendList_Xy,
int *sendList_xz, int *sendList_XZ, int *sendList_xZ, int *sendList_Xz,
int *sendList_yz, int *sendList_YZ, int *sendList_yZ, int *sendList_Yz,
int sendCount_x, int sendCount_y, int sendCount_z, int sendCount_X, int sendCount_Y, int sendCount_Z,
int sendCount_xy, int sendCount_XY, int sendCount_xY, int sendCount_Xy,
int sendCount_xz, int sendCount_XZ, int sendCount_xZ, int sendCount_Xz,
int sendCount_yz, int sendCount_YZ, int sendCount_yZ, int sendCount_Yz,
int *recvList_x, int *recvList_y, int *recvList_z, int *recvList_X, int *recvList_Y, int *recvList_Z,
int *recvList_xy, int *recvList_XY, int *recvList_xY, int *recvList_Xy,
int *recvList_xz, int *recvList_XZ, int *recvList_xZ, int *recvList_Xz,
int *recvList_yz, int *recvList_YZ, int *recvList_yZ, int *recvList_Yz,
int recvCount_x, int recvCount_y, int recvCount_z, int recvCount_X, int recvCount_Y, int recvCount_Z,
int recvCount_xy, int recvCount_XY, int recvCount_xY, int recvCount_Xy,
int recvCount_xz, int recvCount_XZ, int recvCount_xZ, int recvCount_Xz,
int recvCount_yz, int recvCount_YZ, int recvCount_yZ, int recvCount_Yz,
int rank_x, int rank_y, int rank_z, int rank_X, int rank_Y, int rank_Z, int rank_xy, int rank_XY, int rank_xY,
int rank_Xy, int rank_xz, int rank_XZ, int rank_xZ, int rank_Xz, int rank_yz, int rank_YZ, int rank_yZ, int rank_Yz)
{
MPI_Request req1[18], req2[18];
req1[0] = comm.Isend(sendList_x,sendCount_x,rank_x,sendtag+0);
req2[0] = comm.Irecv(recvList_X,recvCount_X,rank_X,recvtag+0);
req1[1] = comm.Isend(sendList_X,sendCount_X,rank_X,sendtag+1);
req2[1] = comm.Irecv(recvList_x,recvCount_x,rank_x,recvtag+1);
req1[2] = comm.Isend(sendList_y,sendCount_y,rank_y,sendtag+2);
req2[2] = comm.Irecv(recvList_Y,recvCount_Y,rank_Y,recvtag+2);
req1[3] = comm.Isend(sendList_Y,sendCount_Y,rank_Y,sendtag+3);
req2[3] = comm.Irecv(recvList_y,recvCount_y,rank_y,recvtag+3);
req1[4] = comm.Isend(sendList_z,sendCount_z,rank_z,sendtag+4);
req2[4] = comm.Irecv(recvList_Z,recvCount_Z,rank_Z,recvtag+4);
req1[5] = comm.Isend(sendList_Z,sendCount_Z,rank_Z,sendtag+5);
req2[5] = comm.Irecv(recvList_z,recvCount_z,rank_z,recvtag+5);
req1[6] = comm.Isend(sendList_xy, sendCount_xy, rank_xy, sendtag);
req2[6] = comm.Irecv(recvList_XY, recvCount_XY, rank_XY, recvtag);
req1[7] = comm.Isend(sendList_XY, sendCount_XY, rank_XY, sendtag);
req2[7] = comm.Irecv(recvList_xy, recvCount_xy, rank_xy, recvtag);
req1[8] = comm.Isend(sendList_Xy, sendCount_Xy, rank_Xy, sendtag);
req2[8] = comm.Irecv(recvList_xY, recvCount_xY, rank_xY, recvtag);
req1[9] = comm.Isend(sendList_xY, sendCount_xY, rank_xY, sendtag);
req2[9] = comm.Irecv(recvList_Xy, recvCount_Xy, rank_Xy, recvtag);
req1[6] = comm.Isend(sendList_xy,sendCount_xy,rank_xy,sendtag+6);
req2[6] = comm.Irecv(recvList_XY,recvCount_XY,rank_XY,recvtag+6);
req1[7] = comm.Isend(sendList_XY,sendCount_XY,rank_XY,sendtag+7);
req2[7] = comm.Irecv(recvList_xy,recvCount_xy,rank_xy,recvtag+7);
req1[8] = comm.Isend(sendList_Xy,sendCount_Xy,rank_Xy,sendtag+8);
req2[8] = comm.Irecv(recvList_xY,recvCount_xY,rank_xY,recvtag+8);
req1[9] = comm.Isend(sendList_xY,sendCount_xY,rank_xY,sendtag+9);
req2[9] = comm.Irecv(recvList_Xy,recvCount_Xy,rank_Xy,recvtag+9);
req1[10] = comm.Isend(sendList_xz, sendCount_xz, rank_xz, sendtag);
req2[10] = comm.Irecv(recvList_XZ, recvCount_XZ, rank_XZ, recvtag);
req1[11] = comm.Isend(sendList_XZ, sendCount_XZ, rank_XZ, sendtag);
req2[11] = comm.Irecv(recvList_xz, recvCount_xz, rank_xz, recvtag);
req1[12] = comm.Isend(sendList_Xz, sendCount_Xz, rank_Xz, sendtag);
req2[12] = comm.Irecv(recvList_xZ, recvCount_xZ, rank_xZ, recvtag);
req1[13] = comm.Isend(sendList_xZ, sendCount_xZ, rank_xZ, sendtag);
req2[13] = comm.Irecv(recvList_Xz, recvCount_Xz, rank_Xz, recvtag);
req1[10] = comm.Isend(sendList_xz,sendCount_xz,rank_xz,sendtag+10);
req2[10] = comm.Irecv(recvList_XZ,recvCount_XZ,rank_XZ,recvtag+10);
req1[11] = comm.Isend(sendList_XZ,sendCount_XZ,rank_XZ,sendtag+11);
req2[11] = comm.Irecv(recvList_xz,recvCount_xz,rank_xz,recvtag+11);
req1[12] = comm.Isend(sendList_Xz,sendCount_Xz,rank_Xz,sendtag+12);
req2[12] = comm.Irecv(recvList_xZ,recvCount_xZ,rank_xZ,recvtag+12);
req1[13] = comm.Isend(sendList_xZ,sendCount_xZ,rank_xZ,sendtag+13);
req2[13] = comm.Irecv(recvList_Xz,recvCount_Xz,rank_Xz,recvtag+13);
req1[14] = comm.Isend(sendList_yz, sendCount_yz, rank_yz, sendtag);
req2[14] = comm.Irecv(recvList_YZ, recvCount_YZ, rank_YZ, recvtag);
req1[15] = comm.Isend(sendList_YZ, sendCount_YZ, rank_YZ, sendtag);
req2[15] = comm.Irecv(recvList_yz, recvCount_yz, rank_yz, recvtag);
req1[16] = comm.Isend(sendList_Yz, sendCount_Yz, rank_Yz, sendtag);
req2[16] = comm.Irecv(recvList_yZ, recvCount_yZ, rank_yZ, recvtag);
req1[17] = comm.Isend(sendList_yZ, sendCount_yZ, rank_yZ, sendtag);
req2[17] = comm.Irecv(recvList_Yz, recvCount_Yz, rank_Yz, recvtag);
comm.waitAll(18, req1);
comm.waitAll(18, req2);
req1[14] = comm.Isend(sendList_yz,sendCount_yz,rank_yz,sendtag+14);
req2[14] = comm.Irecv(recvList_YZ,recvCount_YZ,rank_YZ,recvtag+14);
req1[15] = comm.Isend(sendList_YZ,sendCount_YZ,rank_YZ,sendtag+15);
req2[15] = comm.Irecv(recvList_yz,recvCount_yz,rank_yz,recvtag+15);
req1[16] = comm.Isend(sendList_Yz,sendCount_Yz,rank_Yz,sendtag+16);
req2[16] = comm.Irecv(recvList_yZ,recvCount_yZ,rank_yZ,recvtag+16);
req1[17] = comm.Isend(sendList_yZ,sendCount_yZ,rank_yZ,sendtag+17);
req2[17] = comm.Irecv(recvList_Yz,recvCount_Yz,rank_Yz,recvtag+17);
comm.waitAll( 18, req1 );
comm.waitAll( 18, req2 );
}
//***************************************************************************************
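
The hunk above replaces the single shared `sendtag`/`recvtag` with a per-slot offset (`sendtag+0` through `sendtag+17`), so each of the 18 neighbor exchanges carries its own tag. The pairing can be sketched as a small table without any MPI calls. The names `Exchange`, `kExchanges`, `opposite` and `tagFor` are hypothetical and are not part of LBPM; this is only an illustration of the slot ordering and of the fact that every send direction is matched by a receive in the opposite direction.

```cpp
#include <array>
#include <cctype>
#include <set>
#include <string>

// Hypothetical helper type for illustration only; not part of LBPM.
struct Exchange {
    const char *send_dir; // direction whose send list is posted with Isend
    const char *recv_dir; // direction whose recv list is posted with Irecv
};

// Same ordering as req1[0..17] / req2[0..17] in CommunicateRecvLists.
const std::array<Exchange, 18> kExchanges = {{
    {"x", "X"},   {"X", "x"},   {"y", "Y"},   {"Y", "y"},
    {"z", "Z"},   {"Z", "z"},   {"xy", "XY"}, {"XY", "xy"},
    {"Xy", "xY"}, {"xY", "Xy"}, {"xz", "XZ"}, {"XZ", "xz"},
    {"Xz", "xZ"}, {"xZ", "Xz"}, {"yz", "YZ"}, {"YZ", "yz"},
    {"Yz", "yZ"}, {"yZ", "Yz"},
}};

// Flip the case of every letter: "Xy" -> "xY". Lowercase means the
// negative side of an axis, uppercase the positive side.
inline std::string opposite(const std::string &dir) {
    std::string out = dir;
    for (char &c : out) {
        unsigned char u = static_cast<unsigned char>(c);
        c = static_cast<char>(std::islower(u) ? std::toupper(u)
                                              : std::tolower(u));
    }
    return out;
}

// Tag used for exchange slot `slot`, given the caller's base tag.
inline int tagFor(int base_tag, int slot) { return base_tag + slot; }

// Returns true if all 18 slots produce distinct tags and every receive
// direction is the opposite of its send direction.
inline bool exchangesAreConsistent(int base_tag) {
    std::set<int> tags;
    for (int slot = 0; slot < 18; ++slot) {
        const Exchange &e = kExchanges[slot];
        if (opposite(e.send_dir) != e.recv_dir)
            return false;
        tags.insert(tagFor(base_tag, slot));
    }
    return tags.size() == kExchanges.size();
}
```

Calling `exchangesAreConsistent(sendtag)` with any base tag checks the table. The matching between neighbors only works when the caller passes the same base value for `sendtag` and `recvtag`, since a message sent with `sendtag+k` is received with `recvtag+k`.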


@@ -40,6 +40,221 @@ static inline void fgetl(char *str, int num, FILE *stream) {
}
}
void Domain::read_swc(const std::string &Filename) {
//...... READ IN SWC FILE...................................
int count = 0;
int number_of_lines = 0;
if (rank() == 0){
cout << "Reading SWC file..." << endl;
{
std::string line;
std::ifstream myfile(Filename);
while (std::getline(myfile, line))
++number_of_lines;
number_of_lines -= 1;
}
std::cout << " Number of lines in SWC file: " << number_of_lines << endl;
}
count = Comm.sumReduce(number_of_lines); // nonzero only for rank=0
number_of_lines = count;
// set up structures to read
double *List_cx = new double [number_of_lines];
double *List_cy = new double [number_of_lines];
double *List_cz = new double [number_of_lines];
double *List_rad = new double [number_of_lines];
int *List_index = new int [number_of_lines];
int *List_parent = new int [number_of_lines];
int *List_type = new int [number_of_lines];
if (rank()==0){
FILE *fid = fopen(Filename.c_str(), "rb");
INSIST(fid != NULL, "Error opening SWC file");
//.........Trash the header lines (x 1)..........
char line[100];
fgetl(line, 100, fid);
//........read the spheres..................
// We will read until a blank line or end-of-file is reached
count = 0;
while (!feof(fid) && fgets(line, 100, fid) != NULL) {
char *line2 = line;
List_index[count] = int(strtod(line2, &line2));
List_type[count] = int(strtod(line2, &line2));
List_cx[count] = strtod(line2, &line2);
List_cy[count] = strtod(line2, &line2);
List_cz[count] = strtod(line2, &line2);
List_rad[count] = strtod(line2, &line2);
List_parent[count] = int(strtod(line2, &line2));
count++;
}
fclose( fid );
cout << " Number of lines extracted is: " << count << endl;
INSIST(count == number_of_lines, "Problem reading swc file!");
double min_cx = List_cx[0]-List_rad[0];
double min_cy = List_cy[0]-List_rad[0];
double min_cz = List_cz[0]-List_rad[0];
for (count=1; count<number_of_lines; count++){
double value_x = List_cx[count]-List_rad[count];
double value_y = List_cy[count]-List_rad[count];
double value_z = List_cz[count]-List_rad[count];
if (value_x < min_cx) min_cx = value_x;
if (value_y < min_cy) min_cy = value_y;
if (value_z < min_cz) min_cz = value_z;
}
/* shift the swc data */
printf(" shift swc data by %f, %f, %f \n",min_cx,min_cy, min_cz);
for (count=0; count<number_of_lines; count++){
List_cx[count] -= offset_x*voxel_length;
List_cy[count] -= offset_y*voxel_length;
List_cz[count] -= offset_z*voxel_length;
}
}
/* everybody gets the swc file */
Comm.bcast(List_cx,number_of_lines,0);
Comm.bcast(List_cy,number_of_lines,0);
Comm.bcast(List_cz,number_of_lines,0);
Comm.bcast(List_rad,number_of_lines,0);
Comm.bcast(List_index,number_of_lines,0);
Comm.bcast(List_parent,number_of_lines,0);
Comm.bcast(List_type,number_of_lines,0);
/* units of swc file are in micron */
double start_x, start_y, start_z;
/* box owned by this rank */
start_x = rank_info.ix*(Nx-2)*voxel_length;
start_y = rank_info.jy*(Ny-2)*voxel_length;
start_z = rank_info.kz*(Nz-2)*voxel_length;
//finish_x = (rank_info.ix+1)*(Nx-2)*voxel_length;
//finish_y = (rank_info.jy+1)*(Ny-2)*voxel_length;
//finish_z = (rank_info.kz+1)*(Nz-2)*voxel_length;
for (int k = 0; k < Nz; k++) {
for (int j = 0; j < Ny; j++) {
for (int i = 0; i < Nx; i++) {
id[k*Nx*Ny + j*Nx + i] = 1;
}
}
}
/* Loop over SWC input and populate domain ID */
for (int idx=0; idx<number_of_lines; idx++){
/* get the object information */
int parent = List_parent[idx]-1;
if (parent < 0) parent = idx;
double xi = List_cx[idx];
double yi = List_cy[idx];
double zi = List_cz[idx];
double xp = List_cx[parent];
double yp = List_cy[parent];
double zp = List_cz[parent];
double ri = List_rad[idx];
double rp = List_rad[parent];
int radius_in_voxels = int(List_rad[idx]/voxel_length);
signed char label = char(List_type[idx]);
double xmin = min(((xi - start_x - List_rad[idx])/voxel_length) ,((xp - start_x - List_rad[parent])/voxel_length) );
double ymin = min(((yi - start_y - List_rad[idx])/voxel_length) ,((yp - start_y - List_rad[parent])/voxel_length) );
double zmin = min(((zi - start_z - List_rad[idx])/voxel_length) ,((zp - start_z - List_rad[parent])/voxel_length) );
double xmax = max(((xi - start_x + List_rad[idx])/voxel_length) ,((xp - start_x + List_rad[parent])/voxel_length) );
double ymax = max(((yi - start_y + List_rad[idx])/voxel_length) ,((yp - start_y + List_rad[parent])/voxel_length) );
double zmax = max(((zi - start_z + List_rad[idx])/voxel_length) ,((zp - start_z + List_rad[parent])/voxel_length) );
/* if (rank()==1){
printf("%i %f %f %f %f\n",label,xi,yi,zi,ri);
printf("parent %i %f %f %f %f\n",parent,xp,yp,zp,rp);
}
*/
double length = sqrt((xi-xp)*(xi-xp) + (yi-yp)*(yi-yp) + (zi-zp)*(zi-zp) );
if (length == 0.0) length = 1.0;
double alpha = (xi - xp)/length;
double beta = (yi - yp)/length;
double gamma = (zi - zp)/length;
int start_idx = int(xmin);
int start_idy = int(ymin);
int start_idz = int(zmin);
int finish_idx = int(xmax);
int finish_idy = int(ymax);
int finish_idz = int(zmax);
/* get the little box to loop over
int start_idx = int((List_cx[idx] - List_rad[idx] - start_x)/voxel_length) + 1;
int start_idy = int((List_cy[idx] - List_rad[idx] - start_y)/voxel_length) + 1;
int start_idz = int((List_cz[idx] - List_rad[idx] - start_z)/voxel_length) + 1;
int finish_idx = int((List_cx[idx] + List_rad[idx] - start_x)/voxel_length) + 1;
int finish_idy = int((List_cy[idx] + List_rad[idx] - start_y)/voxel_length) + 1;
int finish_idz = int((List_cz[idx] + List_rad[idx] - start_z)/voxel_length) + 1;
*/
if (start_idx < 0 ) start_idx = 0;
if (start_idy < 0 ) start_idy = 0;
if (start_idz < 0 ) start_idz = 0;
if (start_idx > Nx-1 ) start_idx = Nx;
if (start_idy > Ny-1 ) start_idy = Ny;
if (start_idz > Nz-1 ) start_idz = Nz;
if (finish_idx < 0 ) finish_idx = 0;
if (finish_idy < 0 ) finish_idy = 0;
if (finish_idz < 0 ) finish_idz = 0;
if (finish_idx > Nx-1 ) finish_idx = Nx;
if (finish_idy > Ny-1 ) finish_idy = Ny;
if (finish_idz > Nz-1 ) finish_idz = Nz;
/* if (rank()==1) printf(" alpha = %f, beta = %f, gamma= %f\n",alpha, beta,gamma);
if (rank()==1) printf(" xi = %f, yi = %f, zi= %f, ri = %f \n",xi, yi, zi, ri);
if (rank()==1) printf(" xp = %f, yp = %f, zp= %f, rp = %f \n",xp, yp, zp, rp);
if (rank()==1) printf( "start: %i, %i, %i \n",start_idx,start_idy,start_idz);
if (rank()==1) printf( "finish: %i, %i, %i \n",finish_idx,finish_idy,finish_idz);
*/
for (int k = start_idz; k<finish_idz; k++){
for (int j = start_idy; j<finish_idy; j++){
for (int i = start_idx; i<finish_idx; i++){
double x = i*voxel_length + start_x;
double y = j*voxel_length + start_y;
double z = k*voxel_length + start_z;
double distance;
double s = ((x-xp)*alpha+(y-yp)*beta+(z-zp)*gamma) / (alpha*alpha + beta*beta + gamma*gamma);
double di = ri - sqrt((x-xi)*(x-xi) + (y-yi)*(y-yi) + (z-zi)*(z-zi));
double dp = rp - sqrt((x-xp)*(x-xp) + (y-yp)*(y-yp) + (z-zp)*(z-zp));
if (s > length ){
distance = di;
}
else if (s < 0.0){
distance = dp;
}
else {
// linear variation for radius
double radius = rp + (ri - rp)*s/length;
distance = radius - sqrt((x-xp-alpha*s)*(x-xp-alpha*s) + (y-yp-beta*s)*(y-yp-beta*s) + (z-zp-gamma*s)*(z-zp-gamma*s));
}
if (distance < di) distance = di;
if (distance < dp) distance = dp;
if ( distance > 0.0 ){
/* label the voxel */
//id[k*Nx*Ny + j*Nx + i] = label;
id[k*Nx*Ny + j*Nx + i] = 2;
}
}
}
}
//if (rank()==0) printf( "next line..\n");
}
delete[] List_cx;
delete[] List_cy;
delete[] List_cz;
delete[] List_rad;
delete[] List_index;
delete[] List_type;
delete[] List_parent;
}
/********************************************************
* Constructors *
********************************************************/
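
The inner loop of `read_swc` above labels a voxel when it lies inside either end sphere of an SWC segment or inside the tube between node and parent, whose radius varies linearly along the axis. A minimal standalone sketch of that inside/outside measure follows. `Node` and `segmentMeasure` are hypothetical names, not LBPM identifiers, and the explicit guard for a zero-length segment (a node that is its own parent) is an assumption of this sketch rather than a copy of the code in the diff.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical type for illustration only; not part of LBPM.
struct Node {
    double x, y, z; // center, in the same units as the SWC file
    double r;       // radius
};

// Positive when (x,y,z) is inside the union of the two end spheres and
// the linearly tapered tube joining parent p and child c; negative when
// outside. This is a radial measure, not an exact Euclidean distance.
inline double segmentMeasure(const Node &p, const Node &c,
                             double x, double y, double z) {
    // Sphere terms: radius minus distance to each center.
    double dc = c.r - std::sqrt((x - c.x) * (x - c.x) +
                                (y - c.y) * (y - c.y) +
                                (z - c.z) * (z - c.z));
    double dp = p.r - std::sqrt((x - p.x) * (x - p.x) +
                                (y - p.y) * (y - p.y) +
                                (z - p.z) * (z - p.z));
    double best = std::max(dc, dp);

    double lx = c.x - p.x, ly = c.y - p.y, lz = c.z - p.z;
    double length = std::sqrt(lx * lx + ly * ly + lz * lz);
    if (length > 0.0) { // assumption: skip the tube for a zero-length segment
        // Unit axis from parent to child.
        double ax = lx / length, ay = ly / length, az = lz / length;
        // Position of the point's projection along the axis.
        double s = (x - p.x) * ax + (y - p.y) * ay + (z - p.z) * az;
        if (s >= 0.0 && s <= length) {
            // Linear variation of the radius between parent and child.
            double radius = p.r + (c.r - p.r) * s / length;
            double qx = x - p.x - ax * s;
            double qy = y - p.y - ay * s;
            double qz = z - p.z - az * s;
            double tube = radius - std::sqrt(qx * qx + qy * qy + qz * qz);
            best = std::max(best, tube);
        }
    }
    return best;
}
```

A voxel center would be labeled when `segmentMeasure(parent, node, x, y, z) > 0.0`, mirroring the `distance > 0.0` test in the hunk.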
@@ -101,6 +316,7 @@ void Domain::initialize(std::shared_ptr<Database> db) {
int nx = n[0];
int ny = n[1];
int nz = n[2];
offset_x = offset_y = offset_z = 0;
if (d_db->keyExists("InletLayers")) {
auto InletCount = d_db->getVector<int>("InletLayers");
@@ -302,6 +518,9 @@ void Domain::Decomp(const std::string &Filename) {
xStart = offset[0];
yStart = offset[1];
zStart = offset[2];
offset_x = xStart;
offset_y = yStart;
offset_z = zStart;
}
if (database->keyExists("InletLayers")) {
auto InletCount = database->getVector<int>("InletLayers");
@@ -333,380 +552,391 @@ void Domain::Decomp(const std::string &Filename) {
if (ReadType == "8bit") {
} else if (ReadType == "16bit") {
} else if (ReadType == "swc") {
} else {
//printf("INPUT ERROR: Valid ReadType are 8bit, 16bit \n");
ReadType = "8bit";
}
nx = size[0];
ny = size[1];
nz = size[2];
nprocx = nproc[0];
nprocy = nproc[1];
nprocz = nproc[2];
global_Nx = SIZE[0];
global_Ny = SIZE[1];
global_Nz = SIZE[2];
nprocs = nprocx * nprocy * nprocz;
char *SegData = NULL;
if (RANK == 0) {
printf("Input media: %s\n", Filename.c_str());
printf("Relabeling %lu values\n", ReadValues.size());
for (size_t idx = 0; idx < ReadValues.size(); idx++) {
int oldvalue = ReadValues[idx];
int newvalue = WriteValues[idx];
printf("oldvalue=%d, newvalue =%d \n", oldvalue, newvalue);
}
// Rank=0 reads the entire segmented data and distributes to worker processes
printf("Dimensions of segmented image: %ld x %ld x %ld \n", global_Nx,
global_Ny, global_Nz);
int64_t SIZE = global_Nx * global_Ny * global_Nz;
SegData = new char[SIZE];
if (ReadType == "8bit") {
printf("Reading 8-bit input data \n");
FILE *SEGDAT = fopen(Filename.c_str(), "rb");
if (SEGDAT == NULL)
ERROR("Domain.cpp: Error reading segmented data");
size_t ReadSeg;
ReadSeg = fread(SegData, 1, SIZE, SEGDAT);
if (ReadSeg != size_t(SIZE))
printf("Domain.cpp: Error reading segmented data \n");
fclose(SEGDAT);
} else if (ReadType == "16bit") {
printf("Reading 16-bit input data \n");
short int *InputData;
InputData = new short int[SIZE];
FILE *SEGDAT = fopen(Filename.c_str(), "rb");
if (SEGDAT == NULL)
ERROR("Domain.cpp: Error reading segmented data");
size_t ReadSeg;
ReadSeg = fread(InputData, 2, SIZE, SEGDAT);
if (ReadSeg != size_t(SIZE))
printf("Domain.cpp: Error reading segmented data \n");
fclose(SEGDAT);
for (int n = 0; n < SIZE; n++) {
SegData[n] = char(InputData[n]);
}
}
printf("Read segmented data from %s \n", Filename.c_str());
// relabel the data
std::vector<long int> LabelCount(ReadValues.size(), 0);
for (int k = 0; k < global_Nz; k++) {
for (int j = 0; j < global_Ny; j++) {
for (int i = 0; i < global_Nx; i++) {
n = k * global_Nx * global_Ny + j * global_Nx + i;
//char locval = loc_id[n];
signed char locval = SegData[n];
for (size_t idx = 0; idx < ReadValues.size(); idx++) {
signed char oldvalue = ReadValues[idx];
signed char newvalue = WriteValues[idx];
if (locval == oldvalue) {
SegData[n] = newvalue;
LabelCount[idx]++;
idx = ReadValues.size();
}
}
}
}
}
for (size_t idx = 0; idx < ReadValues.size(); idx++) {
long int label = ReadValues[idx];
long int count = LabelCount[idx];
printf("Label=%ld, Count=%ld \n", label, count);
}
if (USE_CHECKER) {
if (inlet_layers_x > 0) {
// use checkerboard pattern
printf("Checkerboard pattern at x inlet for %i layers \n",
inlet_layers_x);
for (int k = 0; k < global_Nz; k++) {
for (int j = 0; j < global_Ny; j++) {
for (int i = xStart; i < xStart + inlet_layers_x; i++) {
if ((j / checkerSize + k / checkerSize) % 2 == 0) {
// void checkers
SegData[k * global_Nx * global_Ny +
j * global_Nx + i] = 2;
} else {
// solid checkers
SegData[k * global_Nx * global_Ny +
j * global_Nx + i] = 0;
}
}
}
}
}
if (inlet_layers_y > 0) {
printf("Checkerboard pattern at y inlet for %i layers \n",
inlet_layers_y);
// use checkerboard pattern
for (int k = 0; k < global_Nz; k++) {
for (int j = yStart; j < yStart + inlet_layers_y; j++) {
for (int i = 0; i < global_Nx; i++) {
if ((i / checkerSize + k / checkerSize) % 2 == 0) {
// void checkers
SegData[k * global_Nx * global_Ny +
j * global_Nx + i] = 2;
} else {
// solid checkers
SegData[k * global_Nx * global_Ny +
j * global_Nx + i] = 0;
}
}
}
}
}
if (inlet_layers_z > 0) {
printf("Checkerboard pattern at z inlet for %i layers, "
"saturated with phase label=%i \n",
inlet_layers_z, inlet_layers_phase);
// use checkerboard pattern
for (int k = zStart; k < zStart + inlet_layers_z; k++) {
for (int j = 0; j < global_Ny; j++) {
for (int i = 0; i < global_Nx; i++) {
if ((i / checkerSize + j / checkerSize) % 2 == 0) {
// void checkers
//SegData[k*global_Nx*global_Ny+j*global_Nx+i] = 2;
SegData[k * global_Nx * global_Ny +
j * global_Nx + i] = inlet_layers_phase;
} else {
// solid checkers
SegData[k * global_Nx * global_Ny +
j * global_Nx + i] = 0;
}
}
}
}
}
if (outlet_layers_x > 0) {
// use checkerboard pattern
printf("Checkerboard pattern at x outlet for %i layers \n",
outlet_layers_x);
for (int k = 0; k < global_Nz; k++) {
for (int j = 0; j < global_Ny; j++) {
for (int i = xStart + nx * nprocx - outlet_layers_x;
i < xStart + nx * nprocx; i++) {
if ((j / checkerSize + k / checkerSize) % 2 == 0) {
// void checkers
SegData[k * global_Nx * global_Ny +
j * global_Nx + i] = 2;
} else {
// solid checkers
SegData[k * global_Nx * global_Ny +
j * global_Nx + i] = 0;
}
}
}
}
}
if (outlet_layers_y > 0) {
printf("Checkerboard pattern at y outlet for %i layers \n",
outlet_layers_y);
// use checkerboard pattern
for (int k = 0; k < global_Nz; k++) {
for (int j = yStart + ny * nprocy - outlet_layers_y;
j < yStart + ny * nprocy; j++) {
for (int i = 0; i < global_Nx; i++) {
if ((i / checkerSize + k / checkerSize) % 2 == 0) {
// void checkers
SegData[k * global_Nx * global_Ny +
j * global_Nx + i] = 2;
} else {
// solid checkers
SegData[k * global_Nx * global_Ny +
j * global_Nx + i] = 0;
}
}
}
}
}
if (outlet_layers_z > 0) {
printf("Checkerboard pattern at z outlet for %i layers, "
"saturated with phase label=%i \n",
outlet_layers_z, outlet_layers_phase);
// use checkerboard pattern
for (int k = zStart + nz * nprocz - outlet_layers_z;
k < zStart + nz * nprocz; k++) {
for (int j = 0; j < global_Ny; j++) {
for (int i = 0; i < global_Nx; i++) {
if ((i / checkerSize + j / checkerSize) % 2 == 0) {
// void checkers
//SegData[k*global_Nx*global_Ny+j*global_Nx+i] = 2;
SegData[k * global_Nx * global_Ny +
j * global_Nx + i] =
outlet_layers_phase;
} else {
// solid checkers
SegData[k * global_Nx * global_Ny +
j * global_Nx + i] = 0;
}
}
}
}
}
} else {
if (inlet_layers_z > 0) {
printf("Mixed reflection pattern at z inlet for %i layers, "
"saturated with phase label=%i \n",
inlet_layers_z, inlet_layers_phase);
for (int k = zStart; k < zStart + inlet_layers_z; k++) {
for (int j = 0; j < global_Ny; j++) {
for (int i = 0; i < global_Nx; i++) {
signed char local_id =
SegData[k * global_Nx * global_Ny +
j * global_Nx + i];
signed char reflection_id =
SegData[(zStart + nz * nprocz - 1) * global_Nx *
global_Ny +
j * global_Nx + i];
if (local_id < 1 && reflection_id > 0) {
SegData[k * global_Nx * global_Ny +
j * global_Nx + i] = reflection_id;
}
}
}
}
}
if (outlet_layers_z > 0) {
printf("Mixed reflection pattern at z outlet for %i layers, "
"saturated with phase label=%i \n",
outlet_layers_z, outlet_layers_phase);
for (int k = zStart + nz * nprocz - outlet_layers_z;
k < zStart + nz * nprocz; k++) {
for (int j = 0; j < global_Ny; j++) {
for (int i = 0; i < global_Nx; i++) {
signed char local_id =
SegData[k * global_Nx * global_Ny +
j * global_Nx + i];
signed char reflection_id =
SegData[zStart * global_Nx * global_Ny +
j * global_Nx + i];
if (local_id < 1 && reflection_id > 0) {
SegData[k * global_Nx * global_Ny +
j * global_Nx + i] = reflection_id;
}
}
}
}
}
}
/* swc format for neurons */
if (ReadType == "swc") {
read_swc(Filename);
}
else {
nx = size[0];
ny = size[1];
nz = size[2];
nprocx = nproc[0];
nprocy = nproc[1];
nprocz = nproc[2];
global_Nx = SIZE[0];
global_Ny = SIZE[1];
global_Nz = SIZE[2];
nprocs = nprocx * nprocy * nprocz;
char *SegData = NULL;
// Get the rank info
int64_t N = (nx + 2) * (ny + 2) * (nz + 2);
if (RANK == 0) {
printf("Input media: %s\n", Filename.c_str());
printf("Relabeling %lu values\n", ReadValues.size());
for (size_t idx = 0; idx < ReadValues.size(); idx++) {
int oldvalue = ReadValues[idx];
int newvalue = WriteValues[idx];
printf("oldvalue=%d, newvalue =%d \n", oldvalue, newvalue);
}
// number of sites to use for periodic boundary condition transition zone
int64_t z_transition_size = (nprocz * nz - (global_Nz - zStart)) / 2;
if (z_transition_size < 0)
z_transition_size = 0;
// Rank=0 reads the entire segmented data and distributes to worker processes
printf("Dimensions of segmented image: %ld x %ld x %ld \n", global_Nx,
global_Ny, global_Nz);
int64_t SIZE = global_Nx * global_Ny * global_Nz;
SegData = new char[SIZE];
if (ReadType == "8bit") {
printf("Reading 8-bit input data \n");
FILE *SEGDAT = fopen(Filename.c_str(), "rb");
if (SEGDAT == NULL)
ERROR("Domain.cpp: Error reading segmented data");
size_t ReadSeg;
ReadSeg = fread(SegData, 1, SIZE, SEGDAT);
if (ReadSeg != size_t(SIZE))
printf("Domain.cpp: Error reading segmented data \n");
fclose(SEGDAT);
} else if (ReadType == "16bit") {
printf("Reading 16-bit input data \n");
short int *InputData;
InputData = new short int[SIZE];
FILE *SEGDAT = fopen(Filename.c_str(), "rb");
if (SEGDAT == NULL)
ERROR("Domain.cpp: Error reading segmented data");
size_t ReadSeg;
ReadSeg = fread(InputData, 2, SIZE, SEGDAT);
if (ReadSeg != size_t(SIZE))
printf("Domain.cpp: Error reading segmented data \n");
fclose(SEGDAT);
for (int n = 0; n < SIZE; n++) {
SegData[n] = char(InputData[n]);
}
}
else if (ReadType == "SWC"){
// Set up the sub-domains
if (RANK == 0) {
printf("Distributing subdomains across %i processors \n", nprocs);
printf("Process grid: %i x %i x %i \n", nprocx, nprocy, nprocz);
printf("Subdomain size: %i x %i x %i \n", nx, ny, nz);
printf("Size of transition region: %ld \n", z_transition_size);
auto loc_id = new char[(nx + 2) * (ny + 2) * (nz + 2)];
for (int kp = 0; kp < nprocz; kp++) {
for (int jp = 0; jp < nprocy; jp++) {
for (int ip = 0; ip < nprocx; ip++) {
// rank of the process that gets this subdomain
int rnk = kp * nprocx * nprocy + jp * nprocx + ip;
// Pack and send the subdomain for rnk
for (k = 0; k < nz + 2; k++) {
for (j = 0; j < ny + 2; j++) {
for (i = 0; i < nx + 2; i++) {
int64_t x = xStart + ip * nx + i - 1;
int64_t y = yStart + jp * ny + j - 1;
// int64_t z = zStart + kp*nz + k-1;
int64_t z = zStart + kp * nz + k - 1 -
z_transition_size;
if (x < xStart)
x = xStart;
if (!(x < global_Nx))
x = global_Nx - 1;
if (y < yStart)
y = yStart;
if (!(y < global_Ny))
y = global_Ny - 1;
if (z < zStart)
z = zStart;
if (!(z < global_Nz))
z = global_Nz - 1;
int64_t nlocal =
k * (nx + 2) * (ny + 2) + j * (nx + 2) + i;
int64_t nglobal = z * global_Nx * global_Ny +
y * global_Nx + x;
loc_id[nlocal] = SegData[nglobal];
}
}
}
if (rnk == 0) {
for (k = 0; k < nz + 2; k++) {
for (j = 0; j < ny + 2; j++) {
for (i = 0; i < nx + 2; i++) {
int nlocal = k * (nx + 2) * (ny + 2) +
j * (nx + 2) + i;
id[nlocal] = loc_id[nlocal];
}
}
}
} else {
//printf("Sending data to process %i \n", rnk);
Comm.send(loc_id, N, rnk, 15);
}
// Write the data for this rank
char LocalRankFilename[40];
sprintf(LocalRankFilename, "ID.%05i", rnk + rank_offset);
FILE *ID = fopen(LocalRankFilename, "wb");
fwrite(loc_id, 1, (nx + 2) * (ny + 2) * (nz + 2), ID);
fclose(ID);
}
}
}
delete[] loc_id;
} else {
// Receive the subdomain from rank = 0
//printf("Ready to receive data %i at process %i \n", N,rank);
Comm.recv(id.data(), N, 0, 15);
}
printf("Read segmented data from %s \n", Filename.c_str());
// relabel the data
std::vector<long int> LabelCount(ReadValues.size(), 0);
for (int k = 0; k < global_Nz; k++) {
for (int j = 0; j < global_Ny; j++) {
for (int i = 0; i < global_Nx; i++) {
n = k * global_Nx * global_Ny + j * global_Nx + i;
//char locval = loc_id[n];
signed char locval = SegData[n];
for (size_t idx = 0; idx < ReadValues.size(); idx++) {
signed char oldvalue = ReadValues[idx];
signed char newvalue = WriteValues[idx];
if (locval == oldvalue) {
SegData[n] = newvalue;
LabelCount[idx]++;
idx = ReadValues.size();
}
}
}
}
}
for (size_t idx = 0; idx < ReadValues.size(); idx++) {
long int label = ReadValues[idx];
long int count = LabelCount[idx];
printf("Label=%ld, Count=%ld \n", label, count);
}
if (USE_CHECKER) {
if (inlet_layers_x > 0) {
// use checkerboard pattern
printf("Checkerboard pattern at x inlet for %i layers \n",
inlet_layers_x);
for (int k = 0; k < global_Nz; k++) {
for (int j = 0; j < global_Ny; j++) {
for (int i = xStart; i < xStart + inlet_layers_x; i++) {
if ((j / checkerSize + k / checkerSize) % 2 == 0) {
// void checkers
SegData[k * global_Nx * global_Ny +
j * global_Nx + i] = 2;
} else {
// solid checkers
SegData[k * global_Nx * global_Ny +
j * global_Nx + i] = 0;
}
}
}
}
}
if (inlet_layers_y > 0) {
printf("Checkerboard pattern at y inlet for %i layers \n",
inlet_layers_y);
// use checkerboard pattern
for (int k = 0; k < global_Nz; k++) {
for (int j = yStart; j < yStart + inlet_layers_y; j++) {
for (int i = 0; i < global_Nx; i++) {
if ((i / checkerSize + k / checkerSize) % 2 == 0) {
// void checkers
SegData[k * global_Nx * global_Ny +
j * global_Nx + i] = 2;
} else {
// solid checkers
SegData[k * global_Nx * global_Ny +
j * global_Nx + i] = 0;
}
}
}
}
}
if (inlet_layers_z > 0) {
printf("Checkerboard pattern at z inlet for %i layers, "
"saturated with phase label=%i \n",
inlet_layers_z, inlet_layers_phase);
// use checkerboard pattern
for (int k = zStart; k < zStart + inlet_layers_z; k++) {
for (int j = 0; j < global_Ny; j++) {
for (int i = 0; i < global_Nx; i++) {
if ((i / checkerSize + j / checkerSize) % 2 == 0) {
// void checkers
//SegData[k*global_Nx*global_Ny+j*global_Nx+i] = 2;
SegData[k * global_Nx * global_Ny +
j * global_Nx + i] = inlet_layers_phase;
} else {
// solid checkers
SegData[k * global_Nx * global_Ny +
j * global_Nx + i] = 0;
}
}
}
}
}
if (outlet_layers_x > 0) {
// use checkerboard pattern
printf("Checkerboard pattern at x outlet for %i layers \n",
outlet_layers_x);
for (int k = 0; k < global_Nz; k++) {
for (int j = 0; j < global_Ny; j++) {
for (int i = xStart + nx * nprocx - outlet_layers_x;
i < xStart + nx * nprocx; i++) {
if ((j / checkerSize + k / checkerSize) % 2 == 0) {
// void checkers
SegData[k * global_Nx * global_Ny +
j * global_Nx + i] = 2;
} else {
// solid checkers
SegData[k * global_Nx * global_Ny +
j * global_Nx + i] = 0;
}
}
}
}
}
if (outlet_layers_y > 0) {
printf("Checkerboard pattern at y outlet for %i layers \n",
outlet_layers_y);
// use checkerboard pattern
for (int k = 0; k < global_Nz; k++) {
for (int j = yStart + ny * nprocy - outlet_layers_y;
j < yStart + ny * nprocy; j++) {
for (int i = 0; i < global_Nx; i++) {
if ((i / checkerSize + k / checkerSize) % 2 == 0) {
// void checkers
SegData[k * global_Nx * global_Ny +
j * global_Nx + i] = 2;
} else {
// solid checkers
SegData[k * global_Nx * global_Ny +
j * global_Nx + i] = 0;
}
}
}
}
}
if (outlet_layers_z > 0) {
printf("Checkerboard pattern at z outlet for %i layers, "
"saturated with phase label=%i \n",
outlet_layers_z, outlet_layers_phase);
// use checkerboard pattern
for (int k = zStart + nz * nprocz - outlet_layers_z;
k < zStart + nz * nprocz; k++) {
for (int j = 0; j < global_Ny; j++) {
for (int i = 0; i < global_Nx; i++) {
if ((i / checkerSize + j / checkerSize) % 2 == 0) {
// void checkers
//SegData[k*global_Nx*global_Ny+j*global_Nx+i] = 2;
SegData[k * global_Nx * global_Ny +
j * global_Nx + i] =
outlet_layers_phase;
} else {
// solid checkers
SegData[k * global_Nx * global_Ny +
j * global_Nx + i] = 0;
}
}
}
}
}
} else {
if (inlet_layers_z > 0) {
printf("Mixed reflection pattern at z inlet for %i layers, "
"saturated with phase label=%i \n",
inlet_layers_z, inlet_layers_phase);
for (int k = zStart; k < zStart + inlet_layers_z; k++) {
for (int j = 0; j < global_Ny; j++) {
for (int i = 0; i < global_Nx; i++) {
signed char local_id =
SegData[k * global_Nx * global_Ny +
j * global_Nx + i];
signed char reflection_id =
SegData[(zStart + nz * nprocz - 1) * global_Nx *
global_Ny +
j * global_Nx + i];
if (local_id < 1 && reflection_id > 0) {
SegData[k * global_Nx * global_Ny +
j * global_Nx + i] = reflection_id;
}
}
}
}
}
if (outlet_layers_z > 0) {
printf("Mixed reflection pattern at z outlet for %i layers, "
"saturated with phase label=%i \n",
outlet_layers_z, outlet_layers_phase);
for (int k = zStart + nz * nprocz - outlet_layers_z;
k < zStart + nz * nprocz; k++) {
for (int j = 0; j < global_Ny; j++) {
for (int i = 0; i < global_Nx; i++) {
signed char local_id =
SegData[k * global_Nx * global_Ny +
j * global_Nx + i];
signed char reflection_id =
SegData[zStart * global_Nx * global_Ny +
j * global_Nx + i];
if (local_id < 1 && reflection_id > 0) {
SegData[k * global_Nx * global_Ny +
j * global_Nx + i] = reflection_id;
}
}
}
}
}
}
}
// Get the rank info
int64_t N = (nx + 2) * (ny + 2) * (nz + 2);
// number of sites to use for periodic boundary condition transition zone
int64_t z_transition_size = (nprocz * nz - (global_Nz - zStart)) / 2;
if (z_transition_size < 0)
z_transition_size = 0;
// Set up the sub-domains
if (RANK == 0) {
printf("Distributing subdomains across %i processors \n", nprocs);
printf("Process grid: %i x %i x %i \n", nprocx, nprocy, nprocz);
printf("Subdomain size: %i x %i x %i \n", nx, ny, nz);
printf("Size of transition region: %ld \n",
static_cast<long>(z_transition_size));
auto loc_id = new char[(nx + 2) * (ny + 2) * (nz + 2)];
for (int kp = 0; kp < nprocz; kp++) {
for (int jp = 0; jp < nprocy; jp++) {
for (int ip = 0; ip < nprocx; ip++) {
// rank of the process that gets this subdomain
int rnk = kp * nprocx * nprocy + jp * nprocx + ip;
// Pack and send the subdomain for rnk
for (k = 0; k < nz + 2; k++) {
for (j = 0; j < ny + 2; j++) {
for (i = 0; i < nx + 2; i++) {
int64_t x = xStart + ip * nx + i - 1;
int64_t y = yStart + jp * ny + j - 1;
// int64_t z = zStart + kp*nz + k-1;
int64_t z = zStart + kp * nz + k - 1 -
z_transition_size;
if (x < xStart)
x = xStart;
if (!(x < global_Nx))
x = global_Nx - 1;
if (y < yStart)
y = yStart;
if (!(y < global_Ny))
y = global_Ny - 1;
if (z < zStart)
z = zStart;
if (!(z < global_Nz))
z = global_Nz - 1;
int64_t nlocal =
k * (nx + 2) * (ny + 2) + j * (nx + 2) + i;
int64_t nglobal = z * global_Nx * global_Ny +
y * global_Nx + x;
loc_id[nlocal] = SegData[nglobal];
}
}
}
if (rnk == 0) {
for (k = 0; k < nz + 2; k++) {
for (j = 0; j < ny + 2; j++) {
for (i = 0; i < nx + 2; i++) {
int nlocal = k * (nx + 2) * (ny + 2) +
j * (nx + 2) + i;
id[nlocal] = loc_id[nlocal];
}
}
}
} else {
//printf("Sending data to process %i \n", rnk);
Comm.send(loc_id, N, rnk, 15);
}
// Write the data for this rank
char LocalRankFilename[40];
sprintf(LocalRankFilename, "ID.%05i", rnk + rank_offset);
FILE *ID = fopen(LocalRankFilename, "wb");
fwrite(loc_id, 1, (nx + 2) * (ny + 2) * (nz + 2), ID);
fclose(ID);
}
}
}
delete[] loc_id;
} else {
// Receive the subdomain from rank = 0
//printf("Ready to receive data %i at process %i \n", N,rank);
Comm.recv(id.data(), N, 0, 15);
}
delete[] SegData;
}
Comm.barrier();
ComputePorosity();
delete[] SegData;
}
void Domain::ComputePorosity() {
// Compute the porosity
double sum;
double sum_local = 0.0;
double iVol_global = 1.0 / (1.0 * (Nx - 2) * (Ny - 2) * (Nz - 2) *
nprocx() * nprocy() * nprocz());
if (BoundaryCondition > 0 && BoundaryCondition != 5)
iVol_global =
1.0 / (1.0 * (Nx - 2) * nprocx() * (Ny - 2) * nprocy() *
((Nz - 2) * nprocz() - inlet_layers_z - outlet_layers_z));
//.........................................................
for (int k = inlet_layers_z + 1; k < Nz - outlet_layers_z - 1; k++) {
for (int j = 1; j < Ny - 1; j++) {
for (int i = 1; i < Nx - 1; i++) {
int n = k * Nx * Ny + j * Nx + i;
if (id[n] > 0) {
sum_local += 1.0;
}
}
}
}
sum = Comm.sumReduce(sum_local);
porosity = sum * iVol_global;
if (rank() == 0)
printf("Media porosity = %f \n", porosity);
//.........................................................
}
void Domain::AggregateLabels(const std::string &filename) {
@@ -1543,7 +1773,7 @@ void Domain::ReadFromFile(const std::string &Filename,
} else {
// Receive the subdomain from rank = 0
//printf("Ready to receive data %i at process %i \n", N,rank);
Comm.recv(id.data(), N, 0, 15);
Comm.recv(UserData, N, 0, 15);
}
Comm.barrier();
}
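
For readers following the checkerboard hunks above, here is a minimal single-process sketch of the z-outlet labeling rule, assuming a flat row-major label array. The name `fillCheckerboardZ`, its parameters, and the idea of passing the array as a `std::vector` are hypothetical and not part of the code base; only the index arithmetic and the `(i / checkerSize + j / checkerSize) % 2` parity test are taken from the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the repository): label the last `layers`
// z-slices of a row-major Nx*Ny*Nz array with a checkerboard of tiles.
// Tiles with even parity are "void" and receive `phase`; the others are
// "solid" and receive 0, mirroring the z-outlet branch in the diff.
void fillCheckerboardZ(std::vector<signed char> &labels, int Nx, int Ny,
                       int Nz, int layers, int checkerSize,
                       signed char phase) {
    assert(checkerSize > 0);
    assert(layers >= 0 && layers <= Nz);
    assert(labels.size() == static_cast<std::size_t>(Nx) * Ny * Nz);
    for (int k = Nz - layers; k < Nz; k++) {
        for (int j = 0; j < Ny; j++) {
            for (int i = 0; i < Nx; i++) {
                std::size_t n = static_cast<std::size_t>(k) * Nx * Ny +
                                static_cast<std::size_t>(j) * Nx + i;
                // Integer division groups voxels into checkerSize-wide tiles.
                bool isVoid = (i / checkerSize + j / checkerSize) % 2 == 0;
                labels[n] = isVoid ? phase : 0;
            }
        }
    }
}
```

Slices below `Nz - layers` are left untouched, so the original segmentation is only overwritten inside the outlet region.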

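The porosity loop in `ComputePorosity()` can be illustrated without MPI. The following is a sketch only, assuming a single subdomain with a one-voxel halo on every face and no inlet or outlet layers; `interiorPorosity` is a hypothetical name, and the `Comm.sumReduce` step that combines the counts across ranks is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical single-process analogue of the porosity loop in the diff:
// count interior voxels whose label is > 0 and divide by the interior volume.
// Nx, Ny, Nz include the one-voxel halo on each face, so the interior is
// (Nx - 2) x (Ny - 2) x (Nz - 2).
double interiorPorosity(const std::vector<signed char> &id, int Nx, int Ny,
                        int Nz) {
    assert(Nx > 2 && Ny > 2 && Nz > 2);
    assert(id.size() == static_cast<std::size_t>(Nx) * Ny * Nz);
    double count = 0.0;
    for (int k = 1; k < Nz - 1; k++) {
        for (int j = 1; j < Ny - 1; j++) {
            for (int i = 1; i < Nx - 1; i++) {
                std::size_t n = static_cast<std::size_t>(k) * Nx * Ny +
                                static_cast<std::size_t>(j) * Nx + i;
                if (id[n] > 0)
                    count += 1.0; // labels > 0 are treated as pore space
            }
        }
    }
    double interiorVolume = 1.0 * (Nx - 2) * (Ny - 2) * (Nz - 2);
    return count / interiorVolume;
}
```

Halo voxels never enter the count, which is why the normalization uses `Nx - 2` and so on rather than the full array extents.
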
@@ -134,6 +134,7 @@ public: // Public variables (need to create accessors instead)
int Nx, Ny, Nz, N;
int inlet_layers_x, inlet_layers_y, inlet_layers_z;
int outlet_layers_x, outlet_layers_y, outlet_layers_z;
int offset_x, offset_y, offset_z;
int inlet_layers_phase; //as usual: 1->n, 2->w
int outlet_layers_phase;
double porosity;
@@ -202,6 +203,11 @@ public: // Public variables (need to create accessors instead)
* \brief Read domain IDs from file
*/
void ReadIDs();
/**
* \brief Read domain IDs from SWC file
*/
void read_swc(const std::string &Filename);
/**
* \brief Compute the porosity

@@ -93,12 +93,11 @@ template<> long double genRand<long double>()
* axpy *
********************************************************/
template <>
void call_axpy<float>(size_t N, const float alpha, const float *x, float *y) {
void call_axpy<float>(size_t, const float, const float*, float*) {
ERROR("Not finished");
}
template <>
void call_axpy<double>(size_t N, const double alpha, const double *x,
double *y) {
void call_axpy<double>(size_t, const double, const double*, double*) {
ERROR("Not finished");
}
@@ -106,22 +105,22 @@ void call_axpy<double>(size_t N, const double alpha, const double *x,
* Multiply two arrays *
********************************************************/
template <>
void call_gemv<double>(size_t M, size_t N, double alpha, double beta,
const double *A, const double *x, double *y) {
void call_gemv<double>(size_t, size_t, double, double,
const double*, const double*, double*) {
ERROR("Not finished");
}
template <>
void call_gemv<float>(size_t M, size_t N, float alpha, float beta,
const float *A, const float *x, float *y) {
void call_gemv<float>(size_t, size_t, float, float,
const float*, const float*, float*) {
ERROR("Not finished");
}
template <>
void call_gemm<double>(size_t M, size_t N, size_t K, double alpha, double beta,
const double *A, const double *B, double *C) {
void call_gemm<double>(size_t, size_t, size_t, double, double,
const double*, const double*, double*) {
ERROR("Not finished");
}
template <>
void call_gemm<float>(size_t M, size_t N, size_t K, float alpha, float beta,
const float *A, const float *B, float *C) {
void call_gemm<float>(size_t, size_t, size_t, float, float,
const float*, const float*, float*) {
ERROR("Not finished");
}
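
The hunks above drop the parameter names from specializations that are not implemented yet, a common way to silence unused-parameter warnings. A minimal sketch of the pattern, assuming a hypothetical `axpy_stub` template and using `std::logic_error` in place of the project's `ERROR` macro, whose definition is not shown here:

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical primary template, standing in for call_axpy in the diff.
template <class T>
void axpy_stub(std::size_t N, T alpha, const T *x, T *y);

// Unimplemented specialization: the parameters are left unnamed, so the
// compiler has nothing to flag as unused (e.g. under -Wunused-parameter).
template <>
void axpy_stub<float>(std::size_t, float, const float *, float *) {
    throw std::logic_error("Not finished");
}

// Implemented specialization, for contrast: names are needed once the
// body actually uses the arguments. Computes y[i] += alpha * x[i].
template <>
void axpy_stub<double>(std::size_t N, double alpha, const double *x,
                       double *y) {
    assert(x != nullptr && y != nullptr);
    for (std::size_t i = 0; i < N; i++)
        y[i] += alpha * x[i];
}
```

Removing the names does not change either signature, so existing callers compile unchanged.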

@@ -297,10 +297,10 @@ TYPE FunctionTable::sum(const Array<TYPE, FUN, ALLOC> &A) {
}
template <class TYPE>
inline void FunctionTable::gemmWrapper(char TRANSA, char TRANSB, int M, int N,
int K, TYPE alpha, const TYPE *A,
int LDA, const TYPE *B, int LDB,
TYPE beta, TYPE *C, int LDC) {
inline void FunctionTable::gemmWrapper(char, char, int, int,
int, TYPE, const TYPE*,
int, const TYPE*, int,
TYPE, TYPE*, int) {
ERROR("Not finished");
}

File diff suppressed because it is too large


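The hunks that follow replace `const int n` with `int n` in by-value parameters. In C++ a top-level `const` on a by-value parameter is not part of the function type, so such a change does not affect callers. A small sketch with a hypothetical `sumFirst` function (not taken from this code base) shows the equivalence:

```cpp
#include <cassert>
#include <type_traits>

// Top-level const on a by-value parameter is dropped from the function type.
static_assert(std::is_same<void(int), void(const int)>::value,
              "by-value const does not change the function type");

// Declaration without const on n ...
int sumFirst(const int *x, int n);

// ... and a definition with const on n: both refer to the same function.
// Inside the body, const only prevents n from being modified locally.
int sumFirst(const int *x, const int n) {
    assert(n >= 0);
    int total = 0;
    for (int i = 0; i < n; i++)
        total += x[i];
    return total;
}
```

The `const` in `const int *x` is a different matter: it qualifies the pointed-to data and is part of the signature.
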
@@ -1115,15 +1115,14 @@ bool MPI_CLASS::anyReduce(const bool value) const {
template <>
void MPI_CLASS::call_sumReduce<unsigned char>(const unsigned char *send,
unsigned char *recv,
const int n) const {
int n) const {
PROFILE_START("sumReduce1<unsigned char>", profile_level);
MPI_Allreduce((void *)send, (void *)recv, n, MPI_UNSIGNED_CHAR, MPI_SUM,
communicator);
PROFILE_STOP("sumReduce1<unsigned char>", profile_level);
}
template <>
void MPI_CLASS::call_sumReduce<unsigned char>(unsigned char *x,
const int n) const {
void MPI_CLASS::call_sumReduce<unsigned char>(unsigned char *x, int n) const {
PROFILE_START("sumReduce2<unsigned char>", profile_level);
auto send = x;
auto recv = new unsigned char[n];
@@ -1136,13 +1135,13 @@ void MPI_CLASS::call_sumReduce<unsigned char>(unsigned char *x,
// char
template <>
void MPI_CLASS::call_sumReduce<char>(const char *send, char *recv,
const int n) const {
int n) const {
PROFILE_START("sumReduce1<char>", profile_level);
MPI_Allreduce((void *)send, (void *)recv, n, MPI_SIGNED_CHAR, MPI_SUM,
communicator);
PROFILE_STOP("sumReduce1<char>", profile_level);
}
template <> void MPI_CLASS::call_sumReduce<char>(char *x, const int n) const {
template <> void MPI_CLASS::call_sumReduce<char>(char *x, int n) const {
PROFILE_START("sumReduce2<char>", profile_level);
auto send = x;
auto recv = new char[n];
@@ -1155,16 +1154,14 @@ template <> void MPI_CLASS::call_sumReduce<char>(char *x, const int n) const {
// unsigned int
template <>
void MPI_CLASS::call_sumReduce<unsigned int>(const unsigned int *send,
unsigned int *recv,
const int n) const {
unsigned int *recv, int n) const {
PROFILE_START("sumReduce1<unsigned int>", profile_level);
MPI_Allreduce((void *)send, (void *)recv, n, MPI_UNSIGNED, MPI_SUM,
communicator);
PROFILE_STOP("sumReduce1<unsigned int>", profile_level);
}
template <>
void MPI_CLASS::call_sumReduce<unsigned int>(unsigned int *x,
const int n) const {
void MPI_CLASS::call_sumReduce<unsigned int>(unsigned int *x, int n) const {
PROFILE_START("sumReduce2<unsigned int>", profile_level);
auto send = x;
auto recv = new unsigned int[n];
@@ -1176,14 +1173,13 @@ void MPI_CLASS::call_sumReduce<unsigned int>(unsigned int *x,
}
// int
template <>
void MPI_CLASS::call_sumReduce<int>(const int *send, int *recv,
const int n) const {
void MPI_CLASS::call_sumReduce<int>(const int *send, int *recv, int n) const {
PROFILE_START("sumReduce1<int>", profile_level);
MPI_Allreduce((void *)send, (void *)recv, n, MPI_INT, MPI_SUM,
communicator);
PROFILE_STOP("sumReduce1<int>", profile_level);
}
template <> void MPI_CLASS::call_sumReduce<int>(int *x, const int n) const {
template <> void MPI_CLASS::call_sumReduce<int>(int *x, int n) const {
PROFILE_START("sumReduce2<int>", profile_level);
auto send = x;
auto recv = new int[n];
@@ -1196,14 +1192,13 @@ template <> void MPI_CLASS::call_sumReduce<int>(int *x, const int n) const {
// long int
template <>
void MPI_CLASS::call_sumReduce<long int>(const long int *send, long int *recv,
const int n) const {
int n) const {
PROFILE_START("sumReduce1<long int>", profile_level);
MPI_Allreduce((void *)send, (void *)recv, n, MPI_LONG, MPI_SUM,
communicator);
PROFILE_STOP("sumReduce1<long int>", profile_level);
}
template <>
void MPI_CLASS::call_sumReduce<long int>(long int *x, const int n) const {
template <> void MPI_CLASS::call_sumReduce<long int>(long int *x, int n) const {
PROFILE_START("sumReduce2<long int>", profile_level);
auto send = x;
auto recv = new long int[n];
@@ -1217,15 +1212,14 @@ void MPI_CLASS::call_sumReduce<long int>(long int *x, const int n) const {
template <>
void MPI_CLASS::call_sumReduce<unsigned long>(const unsigned long *send,
unsigned long *recv,
const int n) const {
int n) const {
PROFILE_START("sumReduce1<unsigned long>", profile_level);
MPI_Allreduce((void *)send, (void *)recv, n, MPI_UNSIGNED_LONG, MPI_SUM,
communicator);
PROFILE_STOP("sumReduce1<unsigned long>", profile_level);
}
template <>
void MPI_CLASS::call_sumReduce<unsigned long>(unsigned long *x,
const int n) const {
void MPI_CLASS::call_sumReduce<unsigned long>(unsigned long *x, int n) const {
PROFILE_START("sumReduce2<unsigned long>", profile_level);
auto send = x;
auto recv = new unsigned long int[n];
@@ -1239,15 +1233,14 @@ void MPI_CLASS::call_sumReduce<unsigned long>(unsigned long *x,
#ifdef USE_WINDOWS
template <>
void MPI_CLASS::call_sumReduce<size_t>(const size_t *send, size_t *recv,
const int n) const {
int n) const {
MPI_ASSERT(MPI_SIZE_T != 0);
PROFILE_START("sumReduce1<size_t>", profile_level);
MPI_Allreduce((void *)send, (void *)recv, n, MPI_SIZE_T, MPI_SUM,
communicator);
PROFILE_STOP("sumReduce1<size_t>", profile_level);
}
template <>
void MPI_CLASS::call_sumReduce<size_t>(size_t *x, const int n) const {
template <> void MPI_CLASS::call_sumReduce<size_t>(size_t *x, int n) const {
MPI_ASSERT(MPI_SIZE_T != 0);
PROFILE_START("sumReduce2<size_t>", profile_level);
auto send = x;
@@ -1263,13 +1256,13 @@ void MPI_CLASS::call_sumReduce<size_t>(size_t *x, const int n) const {
// float
template <>
void MPI_CLASS::call_sumReduce<float>(const float *send, float *recv,
const int n) const {
int n) const {
PROFILE_START("sumReduce1<float>", profile_level);
MPI_Allreduce((void *)send, (void *)recv, n, MPI_FLOAT, MPI_SUM,
communicator);
PROFILE_STOP("sumReduce1<float>", profile_level);
}
template <> void MPI_CLASS::call_sumReduce<float>(float *x, const int n) const {
template <> void MPI_CLASS::call_sumReduce<float>(float *x, int n) const {
PROFILE_START("sumReduce2<float>", profile_level);
auto send = x;
auto recv = new float[n];
@@ -1282,14 +1275,13 @@ template <> void MPI_CLASS::call_sumReduce<float>(float *x, const int n) const {
// double
template <>
void MPI_CLASS::call_sumReduce<double>(const double *send, double *recv,
const int n) const {
int n) const {
PROFILE_START("sumReduce1<double>", profile_level);
MPI_Allreduce((void *)send, (void *)recv, n, MPI_DOUBLE, MPI_SUM,
communicator);
PROFILE_STOP("sumReduce1<double>", profile_level);
}
template <>
void MPI_CLASS::call_sumReduce<double>(double *x, const int n) const {
template <> void MPI_CLASS::call_sumReduce<double>(double *x, int n) const {
PROFILE_START("sumReduce2<double>", profile_level);
auto send = x;
auto recv = new double[n];
@@ -1302,7 +1294,7 @@ void MPI_CLASS::call_sumReduce<double>(double *x, const int n) const {
// std::complex<double>
template <>
void MPI_CLASS::call_sumReduce<std::complex<double>>(
const std::complex<double> *x, std::complex<double> *y, const int n) const {
const std::complex<double> *x, std::complex<double> *y, int n) const {
PROFILE_START("sumReduce1<complex double>", profile_level);
auto send = new double[2 * n];
auto recv = new double[2 * n];
@@ -1320,7 +1312,7 @@ void MPI_CLASS::call_sumReduce<std::complex<double>>(
}
template <>
void MPI_CLASS::call_sumReduce<std::complex<double>>(std::complex<double> *x,
const int n) const {
int n) const {
PROFILE_START("sumReduce2<complex double>", profile_level);
auto send = new double[2 * n];
auto recv = new double[2 * n];
@@ -1345,7 +1337,7 @@ void MPI_CLASS::call_sumReduce<std::complex<double>>(std::complex<double> *x,
// unsigned char
template <>
void MPI_CLASS::call_minReduce<unsigned char>(const unsigned char *send,
unsigned char *recv, const int n,
unsigned char *recv, int n,
int *comm_rank_of_min) const {
if (comm_rank_of_min == nullptr) {
PROFILE_START("minReduce1<unsigned char>", profile_level);
@@ -1363,7 +1355,7 @@ void MPI_CLASS::call_minReduce<unsigned char>(const unsigned char *send,
}
}
template <>
void MPI_CLASS::call_minReduce<unsigned char>(unsigned char *x, const int n,
void MPI_CLASS::call_minReduce<unsigned char>(unsigned char *x, int n,
int *comm_rank_of_min) const {
if (comm_rank_of_min == nullptr) {
PROFILE_START("minReduce2<unsigned char>", profile_level);
@@ -1386,7 +1378,7 @@ void MPI_CLASS::call_minReduce<unsigned char>(unsigned char *x, const int n,
}
// char
template <>
void MPI_CLASS::call_minReduce<char>(const char *send, char *recv, const int n,
void MPI_CLASS::call_minReduce<char>(const char *send, char *recv, int n,
int *comm_rank_of_min) const {
if (comm_rank_of_min == nullptr) {
PROFILE_START("minReduce1<char>", profile_level);
@@ -1404,7 +1396,7 @@ void MPI_CLASS::call_minReduce<char>(const char *send, char *recv, const int n,
}
}
template <>
void MPI_CLASS::call_minReduce<char>(char *x, const int n,
void MPI_CLASS::call_minReduce<char>(char *x, int n,
int *comm_rank_of_min) const {
if (comm_rank_of_min == nullptr) {
PROFILE_START("minReduce2<char>", profile_level);
@@ -1428,7 +1420,7 @@ void MPI_CLASS::call_minReduce<char>(char *x, const int n,
// unsigned int
template <>
void MPI_CLASS::call_minReduce<unsigned int>(const unsigned int *send,
unsigned int *recv, const int n,
unsigned int *recv, int n,
int *comm_rank_of_min) const {
if (comm_rank_of_min == nullptr) {
PROFILE_START("minReduce1<unsigned int>", profile_level);
@@ -1446,7 +1438,7 @@ void MPI_CLASS::call_minReduce<unsigned int>(const unsigned int *send,
}
}
template <>
void MPI_CLASS::call_minReduce<unsigned int>(unsigned int *x, const int n,
void MPI_CLASS::call_minReduce<unsigned int>(unsigned int *x, int n,
int *comm_rank_of_min) const {
if (comm_rank_of_min == nullptr) {
PROFILE_START("minReduce2<unsigned int>", profile_level);
@@ -1469,7 +1461,7 @@ void MPI_CLASS::call_minReduce<unsigned int>(unsigned int *x, const int n,
}
// int
template <>
void MPI_CLASS::call_minReduce<int>(const int *x, int *y, const int n,
void MPI_CLASS::call_minReduce<int>(const int *x, int *y, int n,
int *comm_rank_of_min) const {
PROFILE_START("minReduce1<int>", profile_level);
if (comm_rank_of_min == nullptr) {
@@ -1492,7 +1484,7 @@ void MPI_CLASS::call_minReduce<int>(const int *x, int *y, const int n,
PROFILE_STOP("minReduce1<int>", profile_level);
}
template <>
void MPI_CLASS::call_minReduce<int>(int *x, const int n,
void MPI_CLASS::call_minReduce<int>(int *x, int n,
int *comm_rank_of_min) const {
PROFILE_START("minReduce2<int>", profile_level);
if (comm_rank_of_min == nullptr) {
@@ -1523,7 +1515,7 @@ void MPI_CLASS::call_minReduce<int>(int *x, const int n,
template <>
void MPI_CLASS::call_minReduce<unsigned long int>(const unsigned long int *send,
unsigned long int *recv,
const int n,
int n,
int *comm_rank_of_min) const {
if (comm_rank_of_min == nullptr) {
PROFILE_START("minReduce1<unsigned long>", profile_level);
@@ -1541,8 +1533,7 @@ void MPI_CLASS::call_minReduce<unsigned long int>(const unsigned long int *send,
}
}
template <>
void MPI_CLASS::call_minReduce<unsigned long int>(unsigned long int *x,
const int n,
void MPI_CLASS::call_minReduce<unsigned long int>(unsigned long int *x, int n,
int *comm_rank_of_min) const {
if (comm_rank_of_min == nullptr) {
PROFILE_START("minReduce2<unsigned long>", profile_level);
@@ -1565,8 +1556,7 @@ void MPI_CLASS::call_minReduce<unsigned long int>(unsigned long int *x,
}
// long int
template <>
void MPI_CLASS::call_minReduce<long int>(const long int *x, long int *y,
const int n,
void MPI_CLASS::call_minReduce<long int>(const long int *x, long int *y, int n,
int *comm_rank_of_min) const {
PROFILE_START("minReduce1<long int>", profile_level);
if (comm_rank_of_min == nullptr) {
@@ -1589,7 +1579,7 @@ void MPI_CLASS::call_minReduce<long int>(const long int *x, long int *y,
PROFILE_STOP("minReduce1<long int>", profile_level);
}
template <>
void MPI_CLASS::call_minReduce<long int>(long int *x, const int n,
void MPI_CLASS::call_minReduce<long int>(long int *x, int n,
int *comm_rank_of_min) const {
PROFILE_START("minReduce2<long int>", profile_level);
if (comm_rank_of_min == nullptr) {
@@ -1619,8 +1609,8 @@ void MPI_CLASS::call_minReduce<long int>(long int *x, const int n,
// unsigned long long int
template <>
void MPI_CLASS::call_minReduce<unsigned long long int>(
const unsigned long long int *send, unsigned long long int *recv,
const int n, int *comm_rank_of_min) const {
const unsigned long long int *send, unsigned long long int *recv, int n,
int *comm_rank_of_min) const {
PROFILE_START("minReduce1<long int>", profile_level);
if (comm_rank_of_min == nullptr) {
auto x = new long long int[n];
@@ -1647,7 +1637,7 @@ void MPI_CLASS::call_minReduce<unsigned long long int>(
}
template <>
void MPI_CLASS::call_minReduce<unsigned long long int>(
unsigned long long int *x, const int n, int *comm_rank_of_min) const {
unsigned long long int *x, int n, int *comm_rank_of_min) const {
auto recv = new unsigned long long int[n];
call_minReduce<unsigned long long int>(x, recv, n, comm_rank_of_min);
for (int i = 0; i < n; i++)
@@ -1657,7 +1647,7 @@ void MPI_CLASS::call_minReduce<unsigned long long int>(
// long long int
template <>
void MPI_CLASS::call_minReduce<long long int>(const long long int *x,
long long int *y, const int n,
long long int *y, int n,
int *comm_rank_of_min) const {
PROFILE_START("minReduce1<long int>", profile_level);
if (comm_rank_of_min == nullptr) {
@@ -1676,7 +1666,7 @@ void MPI_CLASS::call_minReduce<long long int>(const long long int *x,
PROFILE_STOP("minReduce1<long int>", profile_level);
}
template <>
void MPI_CLASS::call_minReduce<long long int>(long long int *x, const int n,
void MPI_CLASS::call_minReduce<long long int>(long long int *x, int n,
int *comm_rank_of_min) const {
auto recv = new long long int[n];
call_minReduce<long long int>(x, recv, n, comm_rank_of_min);
@@ -1686,7 +1676,7 @@ void MPI_CLASS::call_minReduce<long long int>(long long int *x, const int n,
}
// float
template <>
void MPI_CLASS::call_minReduce<float>(const float *x, float *y, const int n,
void MPI_CLASS::call_minReduce<float>(const float *x, float *y, int n,
int *comm_rank_of_min) const {
PROFILE_START("minReduce1<float>", profile_level);
if (comm_rank_of_min == nullptr) {
@@ -1709,7 +1699,7 @@ void MPI_CLASS::call_minReduce<float>(const float *x, float *y, const int n,
PROFILE_STOP("minReduce1<float>", profile_level);
}
template <>
void MPI_CLASS::call_minReduce<float>(float *x, const int n,
void MPI_CLASS::call_minReduce<float>(float *x, int n,
int *comm_rank_of_min) const {
PROFILE_START("minReduce2<float>", profile_level);
if (comm_rank_of_min == nullptr) {
@@ -1738,7 +1728,7 @@ void MPI_CLASS::call_minReduce<float>(float *x, const int n,
}
// double
template <>
void MPI_CLASS::call_minReduce<double>(const double *x, double *y, const int n,
void MPI_CLASS::call_minReduce<double>(const double *x, double *y, int n,
int *comm_rank_of_min) const {
PROFILE_START("minReduce1<double>", profile_level);
if (comm_rank_of_min == nullptr) {
@@ -1762,7 +1752,7 @@ void MPI_CLASS::call_minReduce<double>(const double *x, double *y, const int n,
PROFILE_STOP("minReduce1<double>", profile_level);
}
template <>
void MPI_CLASS::call_minReduce<double>(double *x, const int n,
void MPI_CLASS::call_minReduce<double>(double *x, int n,
int *comm_rank_of_min) const {
PROFILE_START("minReduce2<double>", profile_level);
if (comm_rank_of_min == nullptr) {
@@ -1799,7 +1789,7 @@ void MPI_CLASS::call_minReduce<double>(double *x, const int n,
// unsigned char
template <>
void MPI_CLASS::call_maxReduce<unsigned char>(const unsigned char *send,
unsigned char *recv, const int n,
unsigned char *recv, int n,
int *comm_rank_of_max) const {
if (comm_rank_of_max == nullptr) {
PROFILE_START("maxReduce1<unsigned char>", profile_level);
@@ -1817,7 +1807,7 @@ void MPI_CLASS::call_maxReduce<unsigned char>(const unsigned char *send,
}
}
template <>
void MPI_CLASS::call_maxReduce<unsigned char>(unsigned char *x, const int n,
void MPI_CLASS::call_maxReduce<unsigned char>(unsigned char *x, int n,
int *comm_rank_of_max) const {
if (comm_rank_of_max == nullptr) {
PROFILE_START("maxReduce2<unsigned char>", profile_level);
@@ -1840,7 +1830,7 @@ void MPI_CLASS::call_maxReduce<unsigned char>(unsigned char *x, const int n,
}
// char
template <>
void MPI_CLASS::call_maxReduce<char>(const char *send, char *recv, const int n,
void MPI_CLASS::call_maxReduce<char>(const char *send, char *recv, int n,
int *comm_rank_of_max) const {
if (comm_rank_of_max == nullptr) {
PROFILE_START("maxReduce1<char>", profile_level);
@@ -1858,7 +1848,7 @@ void MPI_CLASS::call_maxReduce<char>(const char *send, char *recv, const int n,
}
}
template <>
void MPI_CLASS::call_maxReduce<char>(char *x, const int n,
void MPI_CLASS::call_maxReduce<char>(char *x, int n,
int *comm_rank_of_max) const {
if (comm_rank_of_max == nullptr) {
PROFILE_START("maxReduce2<char>", profile_level);
@@ -1882,7 +1872,7 @@ void MPI_CLASS::call_maxReduce<char>(char *x, const int n,
// unsigned int
template <>
void MPI_CLASS::call_maxReduce<unsigned int>(const unsigned int *send,
unsigned int *recv, const int n,
unsigned int *recv, int n,
int *comm_rank_of_max) const {
if (comm_rank_of_max == nullptr) {
PROFILE_START("maxReduce1<unsigned int>", profile_level);
@@ -1900,7 +1890,7 @@ void MPI_CLASS::call_maxReduce<unsigned int>(const unsigned int *send,
}
}
template <>
void MPI_CLASS::call_maxReduce<unsigned int>(unsigned int *x, const int n,
void MPI_CLASS::call_maxReduce<unsigned int>(unsigned int *x, int n,
int *comm_rank_of_max) const {
if (comm_rank_of_max == nullptr) {
PROFILE_START("maxReduce2<unsigned int>", profile_level);
@@ -1923,7 +1913,7 @@ void MPI_CLASS::call_maxReduce<unsigned int>(unsigned int *x, const int n,
}
// int
template <>
void MPI_CLASS::call_maxReduce<int>(const int *x, int *y, const int n,
void MPI_CLASS::call_maxReduce<int>(const int *x, int *y, int n,
int *comm_rank_of_max) const {
PROFILE_START("maxReduce1<int>", profile_level);
if (comm_rank_of_max == nullptr) {
@@ -1946,7 +1936,7 @@ void MPI_CLASS::call_maxReduce<int>(const int *x, int *y, const int n,
PROFILE_STOP("maxReduce1<int>", profile_level);
}
template <>
void MPI_CLASS::call_maxReduce<int>(int *x, const int n,
void MPI_CLASS::call_maxReduce<int>(int *x, int n,
int *comm_rank_of_max) const {
PROFILE_START("maxReduce2<int>", profile_level);
if (comm_rank_of_max == nullptr) {
@@ -1975,8 +1965,7 @@ void MPI_CLASS::call_maxReduce<int>(int *x, const int n,
}
// long int
template <>
void MPI_CLASS::call_maxReduce<long int>(const long int *x, long int *y,
const int n,
void MPI_CLASS::call_maxReduce<long int>(const long int *x, long int *y, int n,
int *comm_rank_of_max) const {
PROFILE_START("maxReduce1<lond int>", profile_level);
if (comm_rank_of_max == nullptr) {
@@ -1999,7 +1988,7 @@ void MPI_CLASS::call_maxReduce<long int>(const long int *x, long int *y,
PROFILE_STOP("maxReduce1<lond int>", profile_level);
}
template <>
void MPI_CLASS::call_maxReduce<long int>(long int *x, const int n,
void MPI_CLASS::call_maxReduce<long int>(long int *x, int n,
int *comm_rank_of_max) const {
PROFILE_START("maxReduce2<lond int>", profile_level);
if (comm_rank_of_max == nullptr) {
@@ -2030,7 +2019,7 @@ void MPI_CLASS::call_maxReduce<long int>(long int *x, const int n,
template <>
void MPI_CLASS::call_maxReduce<unsigned long int>(const unsigned long int *send,
unsigned long int *recv,
const int n,
int n,
int *comm_rank_of_max) const {
if (comm_rank_of_max == nullptr) {
PROFILE_START("maxReduce1<unsigned long>", profile_level);
@@ -2048,8 +2037,7 @@ void MPI_CLASS::call_maxReduce<unsigned long int>(const unsigned long int *send,
}
}
template <>
void MPI_CLASS::call_maxReduce<unsigned long int>(unsigned long int *x,
const int n,
void MPI_CLASS::call_maxReduce<unsigned long int>(unsigned long int *x, int n,
int *comm_rank_of_max) const {
if (comm_rank_of_max == nullptr) {
PROFILE_START("maxReduce2<unsigned long>", profile_level);
@@ -2073,8 +2061,8 @@ void MPI_CLASS::call_maxReduce<unsigned long int>(unsigned long int *x,
// unsigned long long int
template <>
void MPI_CLASS::call_maxReduce<unsigned long long int>(
const unsigned long long int *send, unsigned long long int *recv,
const int n, int *comm_rank_of_max) const {
const unsigned long long int *send, unsigned long long int *recv, int n,
int *comm_rank_of_max) const {
PROFILE_START("maxReduce1<long int>", profile_level);
if (comm_rank_of_max == nullptr) {
auto x = new long long int[n];
@@ -2101,7 +2089,7 @@ void MPI_CLASS::call_maxReduce<unsigned long long int>(
}
template <>
void MPI_CLASS::call_maxReduce<unsigned long long int>(
unsigned long long int *x, const int n, int *comm_rank_of_max) const {
unsigned long long int *x, int n, int *comm_rank_of_max) const {
auto recv = new unsigned long long int[n];
call_maxReduce<unsigned long long int>(x, recv, n, comm_rank_of_max);
for (int i = 0; i < n; i++)
@@ -2111,7 +2099,7 @@ void MPI_CLASS::call_maxReduce<unsigned long long int>(
// long long int
template <>
void MPI_CLASS::call_maxReduce<long long int>(const long long int *x,
long long int *y, const int n,
long long int *y, int n,
int *comm_rank_of_max) const {
PROFILE_START("maxReduce1<long int>", profile_level);
if (comm_rank_of_max == nullptr) {
@@ -2130,7 +2118,7 @@ void MPI_CLASS::call_maxReduce<long long int>(const long long int *x,
PROFILE_STOP("maxReduce1<long int>", profile_level);
}
template <>
void MPI_CLASS::call_maxReduce<long long int>(long long int *x, const int n,
void MPI_CLASS::call_maxReduce<long long int>(long long int *x, int n,
int *comm_rank_of_max) const {
auto recv = new long long int[n];
call_maxReduce<long long int>(x, recv, n, comm_rank_of_max);
@@ -2140,7 +2128,7 @@ void MPI_CLASS::call_maxReduce<long long int>(long long int *x, const int n,
}
// float
template <>
void MPI_CLASS::call_maxReduce<float>(const float *x, float *y, const int n,
void MPI_CLASS::call_maxReduce<float>(const float *x, float *y, int n,
int *comm_rank_of_max) const {
PROFILE_START("maxReduce1<float>", profile_level);
if (comm_rank_of_max == nullptr) {
@@ -2164,7 +2152,7 @@ void MPI_CLASS::call_maxReduce<float>(const float *x, float *y, const int n,
PROFILE_STOP("maxReduce1<float>", profile_level);
}
template <>
void MPI_CLASS::call_maxReduce<float>(float *x, const int n,
void MPI_CLASS::call_maxReduce<float>(float *x, int n,
int *comm_rank_of_max) const {
PROFILE_START("maxReduce2<float>", profile_level);
if (comm_rank_of_max == nullptr) {
@@ -2193,7 +2181,7 @@ void MPI_CLASS::call_maxReduce<float>(float *x, const int n,
}
// double
template <>
void MPI_CLASS::call_maxReduce<double>(const double *x, double *y, const int n,
void MPI_CLASS::call_maxReduce<double>(const double *x, double *y, int n,
int *comm_rank_of_max) const {
PROFILE_START("maxReduce1<double>", profile_level);
if (comm_rank_of_max == nullptr) {
@@ -2217,7 +2205,7 @@ void MPI_CLASS::call_maxReduce<double>(const double *x, double *y, const int n,
PROFILE_STOP("maxReduce1<double>", profile_level);
}
template <>
void MPI_CLASS::call_maxReduce<double>(double *x, const int n,
void MPI_CLASS::call_maxReduce<double>(double *x, int n,
int *comm_rank_of_max) const {
PROFILE_START("maxReduce2<double>", profile_level);
if (comm_rank_of_max == nullptr) {
@@ -2253,51 +2241,46 @@ void MPI_CLASS::call_maxReduce<double>(double *x, const int n,
#ifdef USE_MPI
// char
template <>
void MPI_CLASS::call_bcast<unsigned char>(unsigned char *x, const int n,
const int root) const {
void MPI_CLASS::call_bcast<unsigned char>(unsigned char *x, int n,
int root) const {
PROFILE_START("bcast<unsigned char>", profile_level);
MPI_Bcast(x, n, MPI_UNSIGNED_CHAR, root, communicator);
PROFILE_STOP("bcast<unsigned char>", profile_level);
}
template <>
void MPI_CLASS::call_bcast<char>(char *x, const int n, const int root) const {
template <> void MPI_CLASS::call_bcast<char>(char *x, int n, int root) const {
PROFILE_START("bcast<char>", profile_level);
MPI_Bcast(x, n, MPI_CHAR, root, communicator);
PROFILE_STOP("bcast<char>", profile_level);
}
// int
template <>
void MPI_CLASS::call_bcast<unsigned int>(unsigned int *x, const int n,
const int root) const {
void MPI_CLASS::call_bcast<unsigned int>(unsigned int *x, int n,
int root) const {
PROFILE_START("bcast<unsigned int>", profile_level);
MPI_Bcast(x, n, MPI_UNSIGNED, root, communicator);
PROFILE_STOP("bcast<unsigned int>", profile_level);
}
template <>
void MPI_CLASS::call_bcast<int>(int *x, const int n, const int root) const {
template <> void MPI_CLASS::call_bcast<int>(int *x, int n, int root) const {
PROFILE_START("bcast<int>", profile_level);
MPI_Bcast(x, n, MPI_INT, root, communicator);
PROFILE_STOP("bcast<int>", profile_level);
}
// float
template <>
void MPI_CLASS::call_bcast<float>(float *x, const int n, const int root) const {
template <> void MPI_CLASS::call_bcast<float>(float *x, int n, int root) const {
PROFILE_START("bcast<float>", profile_level);
MPI_Bcast(x, n, MPI_FLOAT, root, communicator);
PROFILE_STOP("bcast<float>", profile_level);
}
// double
template <>
void MPI_CLASS::call_bcast<double>(double *x, const int n,
const int root) const {
void MPI_CLASS::call_bcast<double>(double *x, int n, int root) const {
PROFILE_START("bcast<double>", profile_level);
MPI_Bcast(x, n, MPI_DOUBLE, root, communicator);
PROFILE_STOP("bcast<double>", profile_level);
}
#else
// We need a concrete instantiation of bcast<char>(x,n,root);
template <>
void MPI_CLASS::call_bcast<char>(char *, const int, const int) const {}
template <> void MPI_CLASS::call_bcast<char>(char *, int, int) const {}
#endif
/************************************************************************
@@ -2316,8 +2299,8 @@ void MPI_CLASS::barrier() const {
#ifdef USE_MPI
// char
template <>
void MPI_CLASS::send<char>(const char *buf, const int length,
const int recv_proc_number, int tag) const {
void MPI_CLASS::send<char>(const char *buf, int length, int recv_proc_number,
int tag) const {
// Set the tag to 0 if it is < 0
tag = (tag >= 0) ? tag : 0;
MPI_INSIST(tag <= d_maxTag, "Maximum tag value exceeded");
@@ -2329,8 +2312,8 @@ void MPI_CLASS::send<char>(const char *buf, const int length,
}
// int
template <>
void MPI_CLASS::send<int>(const int *buf, const int length,
const int recv_proc_number, int tag) const {
void MPI_CLASS::send<int>(const int *buf, int length, int recv_proc_number,
int tag) const {
// Set the tag to 0 if it is < 0
tag = (tag >= 0) ? tag : 0;
MPI_INSIST(tag <= d_maxTag, "Maximum tag value exceeded");
@@ -2341,8 +2324,8 @@ void MPI_CLASS::send<int>(const int *buf, const int length,
}
// float
template <>
void MPI_CLASS::send<float>(const float *buf, const int length,
const int recv_proc_number, int tag) const {
void MPI_CLASS::send<float>(const float *buf, int length, int recv_proc_number,
int tag) const {
// Set the tag to 0 if it is < 0
tag = (tag >= 0) ? tag : 0;
MPI_INSIST(tag <= d_maxTag, "Maximum tag value exceeded");
@@ -2354,8 +2337,8 @@ void MPI_CLASS::send<float>(const float *buf, const int length,
}
// double
template <>
void MPI_CLASS::send<double>(const double *buf, const int length,
const int recv_proc_number, int tag) const {
void MPI_CLASS::send<double>(const double *buf, int length,
int recv_proc_number, int tag) const {
// Set the tag to 0 if it is < 0
tag = (tag >= 0) ? tag : 0;
MPI_INSIST(tag <= d_maxTag, "Maximum tag value exceeded");
@@ -2368,8 +2351,7 @@ void MPI_CLASS::send<double>(const double *buf, const int length,
#else
// We need a concrete instantiation of send for use without MPI
template <>
void MPI_CLASS::send<char>(const char *buf, const int length, const int,
int tag) const {
void MPI_CLASS::send<char>(const char *buf, int length, int, int tag) const {
MPI_INSIST(tag <= d_maxTag, "Maximum tag value exceeded");
MPI_INSIST(tag >= 0, "tag must be >= 0");
PROFILE_START("send<char>", profile_level);
@@ -2391,8 +2373,8 @@ void MPI_CLASS::send<char>(const char *buf, const int length, const int,
#ifdef USE_MPI
// char
template <>
MPI_Request MPI_CLASS::Isend<char>(const char *buf, const int length,
const int recv_proc, const int tag) const {
MPI_Request MPI_CLASS::Isend<char>(const char *buf, int length, int recv_proc,
int tag) const {
MPI_INSIST(tag <= d_maxTag, "Maximum tag value exceeded");
MPI_INSIST(tag >= 0, "tag must be >= 0");
MPI_Request request;
@@ -2404,8 +2386,8 @@ MPI_Request MPI_CLASS::Isend<char>(const char *buf, const int length,
}
// int
template <>
MPI_Request MPI_CLASS::Isend<int>(const int *buf, const int length,
const int recv_proc, const int tag) const {
MPI_Request MPI_CLASS::Isend<int>(const int *buf, int length, int recv_proc,
int tag) const {
MPI_INSIST(tag <= d_maxTag, "Maximum tag value exceeded");
MPI_INSIST(tag >= 0, "tag must be >= 0");
MPI_Request request;
@@ -2417,8 +2399,8 @@ MPI_Request MPI_CLASS::Isend<int>(const int *buf, const int length,
}
// float
template <>
MPI_Request MPI_CLASS::Isend<float>(const float *buf, const int length,
const int recv_proc, const int tag) const {
MPI_Request MPI_CLASS::Isend<float>(const float *buf, int length, int recv_proc,
int tag) const {
MPI_INSIST(tag <= d_maxTag, "Maximum tag value exceeded");
MPI_INSIST(tag >= 0, "tag must be >= 0");
MPI_Request request;
@@ -2430,8 +2412,8 @@ MPI_Request MPI_CLASS::Isend<float>(const float *buf, const int length,
}
// double
template <>
MPI_Request MPI_CLASS::Isend<double>(const double *buf, const int length,
const int recv_proc, const int tag) const {
MPI_Request MPI_CLASS::Isend<double>(const double *buf, int length,
int recv_proc, int tag) const {
MPI_INSIST(tag <= d_maxTag, "Maximum tag value exceeded");
MPI_INSIST(tag >= 0, "tag must be >= 0");
MPI_Request request;
@@ -2444,8 +2426,8 @@ MPI_Request MPI_CLASS::Isend<double>(const double *buf, const int length,
#else
// We need a concrete instantiation of send for use without mpi
template <>
MPI_Request MPI_CLASS::Isend<char>(const char *buf, const int length, const int,
const int tag) const {
MPI_Request MPI_CLASS::Isend<char>(const char *buf, int length, int,
int tag) const {
MPI_INSIST(tag <= d_maxTag, "Maximum tag value exceeded");
MPI_INSIST(tag >= 0, "tag must be >= 0");
PROFILE_START("Isend<char>", profile_level);
@@ -2472,8 +2454,8 @@ MPI_Request MPI_CLASS::Isend<char>(const char *buf, const int length, const int,
/************************************************************************
* Send byte array to another processor. *
************************************************************************/
void MPI_CLASS::sendBytes(const void *buf, const int number_bytes,
const int recv_proc_number, int tag) const {
void MPI_CLASS::sendBytes(const void *buf, int number_bytes,
int recv_proc_number, int tag) const {
MPI_INSIST(tag <= d_maxTag, "Maximum tag value exceeded");
MPI_INSIST(tag >= 0, "tag must be >= 0");
send<char>((const char *)buf, number_bytes, recv_proc_number, tag);
@@ -2482,7 +2464,7 @@ void MPI_CLASS::sendBytes(const void *buf, const int number_bytes,
/************************************************************************
* Non-blocking send byte array to another processor. *
************************************************************************/
MPI_Request MPI_CLASS::IsendBytes(const void *buf, const int number_bytes,
MPI_Request MPI_CLASS::IsendBytes(const void *buf, int number_bytes,
const int recv_proc, const int tag) const {
MPI_INSIST(tag <= d_maxTag, "Maximum tag value exceeded");
MPI_INSIST(tag >= 0, "tag must be >= 0");
@@ -2496,7 +2478,7 @@ MPI_Request MPI_CLASS::IsendBytes(const void *buf, const int number_bytes,
#ifdef USE_MPI
// char
template <>
void MPI_CLASS::recv<char>(char *buf, int &length, const int send_proc_number,
void MPI_CLASS::recv<char>(char *buf, int &length, int send_proc_number,
const bool get_length, int tag) const {
// Set the tag to 0 if it is < 0
tag = (tag >= 0) ? tag : 0;
@@ -2518,7 +2500,7 @@ void MPI_CLASS::recv<char>(char *buf, int &length, const int send_proc_number,
}
// int
template <>
void MPI_CLASS::recv<int>(int *buf, int &length, const int send_proc_number,
void MPI_CLASS::recv<int>(int *buf, int &length, int send_proc_number,
const bool get_length, int tag) const {
// Set the tag to 0 if it is < 0
tag = (tag >= 0) ? tag : 0;
@@ -2540,7 +2522,7 @@ void MPI_CLASS::recv<int>(int *buf, int &length, const int send_proc_number,
}
// float
template <>
void MPI_CLASS::recv<float>(float *buf, int &length, const int send_proc_number,
void MPI_CLASS::recv<float>(float *buf, int &length, int send_proc_number,
const bool get_length, int tag) const {
// Set the tag to 0 if it is < 0
tag = (tag >= 0) ? tag : 0;
@@ -2562,9 +2544,8 @@ void MPI_CLASS::recv<float>(float *buf, int &length, const int send_proc_number,
}
// double
template <>
void MPI_CLASS::recv<double>(double *buf, int &length,
const int send_proc_number, const bool get_length,
int tag) const {
void MPI_CLASS::recv<double>(double *buf, int &length, int send_proc_number,
const bool get_length, int tag) const {
// Set the tag to 0 if it is < 0
tag = (tag >= 0) ? tag : 0;
MPI_INSIST(tag <= d_maxTag, "Maximum tag value exceeded");
@@ -2586,7 +2567,7 @@ void MPI_CLASS::recv<double>(double *buf, int &length,
#else
// We need a concrete instantiation of recv for use without mpi
template <>
void MPI_CLASS::recv<char>(char *buf, int &length, const int, const bool,
void MPI_CLASS::recv<char>(char *buf, int &length, int, const bool,
int tag) const {
MPI_INSIST(tag <= d_maxTag, "Maximum tag value exceeded");
MPI_INSIST(tag >= 0, "tag must be >= 0");
@@ -2609,8 +2590,8 @@ void MPI_CLASS::recv<char>(char *buf, int &length, const int, const bool,
#ifdef USE_MPI
// char
template <>
MPI_Request MPI_CLASS::Irecv<char>(char *buf, const int length,
const int send_proc, const int tag) const {
MPI_Request MPI_CLASS::Irecv<char>(char *buf, int length, int send_proc,
int tag) const {
MPI_INSIST(tag <= d_maxTag, "Maximum tag value exceeded");
MPI_INSIST(tag >= 0, "tag must be >= 0");
MPI_Request request;
@@ -2622,8 +2603,8 @@ MPI_Request MPI_CLASS::Irecv<char>(char *buf, const int length,
}
// int
template <>
MPI_Request MPI_CLASS::Irecv<int>(int *buf, const int length,
const int send_proc, const int tag) const {
MPI_Request MPI_CLASS::Irecv<int>(int *buf, int length, int send_proc,
int tag) const {
MPI_INSIST(tag <= d_maxTag, "Maximum tag value exceeded");
MPI_INSIST(tag >= 0, "tag must be >= 0");
MPI_Request request;
@@ -2635,8 +2616,8 @@ MPI_Request MPI_CLASS::Irecv<int>(int *buf, const int length,
}
// float
template <>
MPI_Request MPI_CLASS::Irecv<float>(float *buf, const int length,
const int send_proc, const int tag) const {
MPI_Request MPI_CLASS::Irecv<float>(float *buf, int length, int send_proc,
int tag) const {
MPI_INSIST(tag <= d_maxTag, "Maximum tag value exceeded");
MPI_INSIST(tag >= 0, "tag must be >= 0");
MPI_Request request;
@@ -2648,8 +2629,8 @@ MPI_Request MPI_CLASS::Irecv<float>(float *buf, const int length,
}
// double
template <>
MPI_Request MPI_CLASS::Irecv<double>(double *buf, const int length,
const int send_proc, const int tag) const {
MPI_Request MPI_CLASS::Irecv<double>(double *buf, int length, int send_proc,
int tag) const {
MPI_INSIST(tag <= d_maxTag, "Maximum tag value exceeded");
MPI_INSIST(tag >= 0, "tag must be >= 0");
MPI_Request request;
@@ -2662,8 +2643,7 @@ MPI_Request MPI_CLASS::Irecv<double>(double *buf, const int length,
#else
// We need a concrete instantiation of irecv for use without mpi
template <>
MPI_Request MPI_CLASS::Irecv<char>(char *buf, const int length, const int,
const int tag) const {
MPI_Request MPI_CLASS::Irecv<char>(char *buf, int length, int, int tag) const {
MPI_INSIST(tag <= d_maxTag, "Maximum tag value exceeded");
MPI_INSIST(tag >= 0, "tag must be >= 0");
PROFILE_START("Irecv<char>", profile_level);
@@ -2690,7 +2670,7 @@ MPI_Request MPI_CLASS::Irecv<char>(char *buf, const int length, const int,
/************************************************************************
* Receive byte array from another processor. *
************************************************************************/
void MPI_CLASS::recvBytes(void *buf, int &number_bytes, const int send_proc,
void MPI_CLASS::recvBytes(void *buf, int &number_bytes, int send_proc,
int tag) const {
recv<char>((char *)buf, number_bytes, send_proc, false, tag);
}
@@ -2698,8 +2678,8 @@ void MPI_CLASS::recvBytes(void *buf, int &number_bytes, const int send_proc,
/************************************************************************
* Non-blocking receive byte array from another processor. *
************************************************************************/
MPI_Request MPI_CLASS::IrecvBytes(void *buf, const int number_bytes,
const int send_proc, const int tag) const {
MPI_Request MPI_CLASS::IrecvBytes(void *buf, int number_bytes, int send_proc,
int tag) const {
MPI_INSIST(tag <= d_maxTag, "Maximum tag value exceeded");
MPI_INSIST(tag >= 0, "tag must be >= 0");
return Irecv<char>((char *)buf, number_bytes, send_proc, tag);
@@ -2913,7 +2893,7 @@ void MPI_CLASS::call_allGather<char>(const char *, int, char *, int *,
************************************************************************/
#ifdef USE_MPI
template <>
void MPI_CLASS::allToAll<unsigned char>(const int n, const unsigned char *send,
void MPI_CLASS::allToAll<unsigned char>(int n, const unsigned char *send,
unsigned char *recv) const {
PROFILE_START("allToAll<unsigned char>", profile_level);
MPI_Alltoall((void *)send, n, MPI_UNSIGNED_CHAR, (void *)recv, n,
@@ -2921,15 +2901,14 @@ void MPI_CLASS::allToAll<unsigned char>(const int n, const unsigned char *send,
PROFILE_STOP("allToAll<unsigned char>", profile_level);
}
template <>
void MPI_CLASS::allToAll<char>(const int n, const char *send,
char *recv) const {
void MPI_CLASS::allToAll<char>(int n, const char *send, char *recv) const {
PROFILE_START("allToAll<char>", profile_level);
MPI_Alltoall((void *)send, n, MPI_CHAR, (void *)recv, n, MPI_CHAR,
communicator);
PROFILE_STOP("allToAll<char>", profile_level);
}
template <>
void MPI_CLASS::allToAll<unsigned int>(const int n, const unsigned int *send,
void MPI_CLASS::allToAll<unsigned int>(int n, const unsigned int *send,
unsigned int *recv) const {
PROFILE_START("allToAll<unsigned int>", profile_level);
MPI_Alltoall((void *)send, n, MPI_UNSIGNED, (void *)recv, n, MPI_UNSIGNED,
@@ -2937,14 +2916,14 @@ void MPI_CLASS::allToAll<unsigned int>(const int n, const unsigned int *send,
PROFILE_STOP("allToAll<unsigned int>", profile_level);
}
template <>
void MPI_CLASS::allToAll<int>(const int n, const int *send, int *recv) const {
void MPI_CLASS::allToAll<int>(int n, const int *send, int *recv) const {
PROFILE_START("allToAll<int>", profile_level);
MPI_Alltoall((void *)send, n, MPI_INT, (void *)recv, n, MPI_INT,
communicator);
PROFILE_STOP("allToAll<int>", profile_level);
}
template <>
void MPI_CLASS::allToAll<unsigned long int>(const int n,
void MPI_CLASS::allToAll<unsigned long int>(int n,
const unsigned long int *send,
unsigned long int *recv) const {
PROFILE_START("allToAll<unsigned long>", profile_level);
@@ -2953,7 +2932,7 @@ void MPI_CLASS::allToAll<unsigned long int>(const int n,
PROFILE_STOP("allToAll<unsigned long>", profile_level);
}
template <>
void MPI_CLASS::allToAll<long int>(const int n, const long int *send,
void MPI_CLASS::allToAll<long int>(int n, const long int *send,
long int *recv) const {
PROFILE_START("allToAll<long int>", profile_level);
MPI_Alltoall((void *)send, n, MPI_LONG, (void *)recv, n, MPI_LONG,
@@ -2961,15 +2940,14 @@ void MPI_CLASS::allToAll<long int>(const int n, const long int *send,
PROFILE_STOP("allToAll<long int>", profile_level);
}
template <>
void MPI_CLASS::allToAll<float>(const int n, const float *send,
float *recv) const {
void MPI_CLASS::allToAll<float>(int n, const float *send, float *recv) const {
PROFILE_START("allToAll<float>", profile_level);
MPI_Alltoall((void *)send, n, MPI_FLOAT, (void *)recv, n, MPI_FLOAT,
communicator);
PROFILE_STOP("allToAll<float>", profile_level);
}
template <>
void MPI_CLASS::allToAll<double>(const int n, const double *send,
void MPI_CLASS::allToAll<double>(int n, const double *send,
double *recv) const {
PROFILE_START("allToAll<double>", profile_level);
MPI_Alltoall((void *)send, n, MPI_DOUBLE, (void *)recv, n, MPI_DOUBLE,
@@ -3713,4 +3691,28 @@ MPI MPI::loadBalance(double local, std::vector<double> work) {
return split(0, key[getRank()]);
}
/****************************************************************************
* Function Persistent Communication *
****************************************************************************/
template <>
std::shared_ptr<MPI_Request> MPI::Isend_init<double>(const double *buf, int N, int proc, int tag) const
{
std::shared_ptr<MPI_Request> obj( new MPI_Request, []( MPI_Request *req ) { MPI_Request_free( req ); delete req; } );
MPI_Send_init( buf, N, MPI_DOUBLE, proc, tag, communicator, obj.get() );
return obj;
}
template<>
std::shared_ptr<MPI_Request> MPI::Irecv_init<double>(double *buf, int N, int proc, int tag) const
{
std::shared_ptr<MPI_Request> obj( new MPI_Request, []( MPI_Request *req ) { MPI_Request_free( req ); delete req; } );
MPI_Recv_init( buf, N, MPI_DOUBLE, proc, tag, communicator, obj.get() );
return obj;
}
void MPI::Start( MPI_Request &request )
{
MPI_Start( &request );
}
} // namespace Utilities
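
Reviewer note: the persistent-communication helpers above hand back a `std::shared_ptr<MPI_Request>` whose deleter calls `MPI_Request_free` before deleting the heap object. The sketch below isolates that ownership pattern without requiring MPI. `FakeRequest`, `fakeRequestFree`, `makeRequest`, and `demoSharedRequest` are hypothetical names invented for illustration; they are not part of MPI or LBPM.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for MPI_Request (illustration only).
struct FakeRequest {
    int id = 0;
};

// Counts how many times the mock "free" routine has run.
static int g_freed = 0;

// Hypothetical stand-in for MPI_Request_free.
inline void fakeRequestFree(FakeRequest *req) {
    (void)req;
    ++g_freed;
}

// Same shape as Isend_init/Irecv_init in the diff: the custom deleter
// releases the library-side handle first, then deletes the heap object.
inline std::shared_ptr<FakeRequest> makeRequest(int id) {
    std::shared_ptr<FakeRequest> obj(new FakeRequest, [](FakeRequest *req) {
        fakeRequestFree(req);
        delete req;
    });
    obj->id = id;
    return obj;
}

// Returns how many frees were observed once every owner went out of scope.
inline int demoSharedRequest() {
    g_freed = 0;
    {
        auto a = makeRequest(7);
        auto b = a; // second owner shares the same request
        assert(a.use_count() == 2);
        (void)b;
    }
    return g_freed;
}
```

With shared ownership the deleter runs once, when the last copy is destroyed, so a caller that keeps copies of the handle in several places does not need to track which copy is responsible for freeing it.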


@@ -26,6 +26,7 @@ redistribution is prohibited.
#include <atomic>
#include <complex>
#include <map>
#include <memory>
#include <set>
#include <string>
#include <vector>
@@ -173,10 +174,9 @@ public: // Member functions
*
*/
static void
balanceProcesses(const MPI &comm = MPI(MPI_COMM_WORLD),
const int method = 1,
balanceProcesses(const MPI &comm = MPI(MPI_COMM_WORLD), int method = 1,
const std::vector<int> &procs = std::vector<int>(),
const int N_min = 1, const int N_max = -1);
int N_min = 1, int N_max = -1);
//! Query the level of thread support
static ThreadSupport queryThreadSupport();
@@ -420,7 +420,7 @@ public: // Member functions
* \param x The input/output array for the reduce
* \param n The number of values in the array (must match on all nodes)
*/
template <class type> void sumReduce(type *x, const int n = 1) const;
template <class type> void sumReduce(type *x, int n = 1) const;
/**
* \brief Sum Reduce
@@ -432,7 +432,7 @@ public: // Member functions
* \param n The number of values in the array (must match on all nodes)
*/
template <class type>
void sumReduce(const type *x, type *y, const int n = 1) const;
void sumReduce(const type *x, type *y, int n = 1) const;
/**
* \brief Min Reduce
@@ -457,7 +457,7 @@ public: // Member functions
* minimum value
*/
template <class type>
void minReduce(type *x, const int n = 1, int *rank_of_min = nullptr) const;
void minReduce(type *x, int n = 1, int *rank_of_min = nullptr) const;
/**
* \brief Min Reduce
@@ -475,7 +475,7 @@ public: // Member functions
* minimum value
*/
template <class type>
void minReduce(const type *x, type *y, const int n = 1,
void minReduce(const type *x, type *y, int n = 1,
int *rank_of_min = nullptr) const;
/**
@@ -501,7 +501,7 @@ public: // Member functions
* maximum value
*/
template <class type>
void maxReduce(type *x, const int n = 1, int *rank_of_max = nullptr) const;
void maxReduce(type *x, int n = 1, int *rank_of_max = nullptr) const;
/**
* \brief Max Reduce
@@ -519,7 +519,7 @@ public: // Member functions
* maximum value
*/
template <class type>
void maxReduce(const type *x, type *y, const int n = 1,
void maxReduce(const type *x, type *y, int n = 1,
int *rank_of_max = nullptr) const;
/**
@@ -530,8 +530,7 @@ public: // Member functions
* \param y The output array for the scan
* \param n The number of values in the array (must match on all nodes)
*/
template <class type>
void sumScan(const type *x, type *y, const int n = 1) const;
template <class type> void sumScan(const type *x, type *y, int n = 1) const;
/**
* \brief Scan Min Reduce
@@ -541,8 +540,7 @@ public: // Member functions
* \param y The output array for the scan
* \param n The number of values in the array (must match on all nodes)
*/
template <class type>
void minScan(const type *x, type *y, const int n = 1) const;
template <class type> void minScan(const type *x, type *y, int n = 1) const;
/**
* \brief Scan Max Reduce
@@ -552,8 +550,7 @@ public: // Member functions
* \param y The output array for the scan
* \param n The number of values in the array (must match on all nodes)
*/
template <class type>
void maxScan(const type *x, type *y, const int n = 1) const;
template <class type> void maxScan(const type *x, type *y, int n = 1) const;
/**
* \brief Broadcast
@@ -561,7 +558,7 @@ public: // Member functions
* \param value The input value for the broadcast.
* \param root The processor performing the broadcast
*/
template <class type> type bcast(const type &value, const int root) const;
template <class type> type bcast(const type &value, int root) const;
/**
* \brief Broadcast
@@ -570,8 +567,7 @@ public: // Member functions
* \param n The number of values in the array (must match on all nodes)
* \param root The processor performing the broadcast
*/
template <class type>
void bcast(type *value, const int n, const int root) const;
template <class type> void bcast(type *value, int n, int root) const;
/**
* Perform a global barrier across all processors.
@@ -595,8 +591,7 @@ public: // Member functions
* The matching recv must share this tag.
*/
template <class type>
void send(const type *buf, const int length, const int recv,
int tag = 0) const;
void send(const type *buf, int length, int recv, int tag = 0) const;
/*!
* @brief This function sends an MPI message with an array of bytes
@@ -611,8 +606,7 @@ public: // Member functions
* to be sent with this message. Default tag is 0.
* The matching recv must share this tag.
*/
void sendBytes(const void *buf, const int N_bytes, const int recv,
int tag = 0) const;
void sendBytes(const void *buf, int N_bytes, int recv, int tag = 0) const;
/*!
* @brief This function sends an MPI message with an array
@@ -627,8 +621,8 @@ public: // Member functions
* to be sent with this message.
*/
template <class type>
MPI_Request Isend(const type *buf, const int length, const int recv_proc,
const int tag) const;
MPI_Request Isend(const type *buf, int length, int recv_proc,
int tag) const;
/*!
* @brief This function sends an MPI message with an array of bytes
@@ -642,8 +636,8 @@ public: // Member functions
* @param tag Integer argument specifying an integer tag
* to be sent with this message.
*/
MPI_Request IsendBytes(const void *buf, const int N_bytes,
const int recv_proc, const int tag) const;
MPI_Request IsendBytes(const void *buf, int N_bytes, int recv_proc,
int tag) const;
/*!
* @brief This function receives an MPI message with a data
@@ -662,7 +656,7 @@ public: // Member functions
* by the tag of the incoming message. Default tag is 0.
*/
template <class type>
inline void recv(type *buf, int length, const int send, int tag) const {
inline void recv(type *buf, int length, int send, int tag) const {
int length2 = length;
recv(buf, length2, send, false, tag);
}
@@ -687,7 +681,7 @@ public: // Member functions
* by the tag of the incoming message. Default tag is 0.
*/
template <class type>
void recv(type *buf, int &length, const int send, const bool get_length,
void recv(type *buf, int &length, int send, const bool get_length,
int tag) const;
/*!
@@ -703,7 +697,7 @@ public: // Member functions
* must be matched by the tag of the incoming message. Default
* tag is 0.
*/
void recvBytes(void *buf, int &N_bytes, const int send, int tag = 0) const;
void recvBytes(void *buf, int &N_bytes, int send, int tag = 0) const;
/*!
* @brief This function receives an MPI message with a data
@@ -716,8 +710,7 @@ public: // Member functions
* be matched by the tag of the incoming message.
*/
template <class type>
MPI_Request Irecv(type *buf, const int length, const int send_proc,
const int tag) const;
MPI_Request Irecv(type *buf, int length, int send_proc, int tag) const;
/*!
* @brief This function receives an MPI message with an array of
@@ -731,8 +724,8 @@ public: // Member functions
* @param tag Integer argument specifying a tag which must
* be matched by the tag of the incoming message.
*/
MPI_Request IrecvBytes(void *buf, const int N_bytes, const int send_proc,
const int tag) const;
MPI_Request IrecvBytes(void *buf, int N_bytes, int send_proc,
int tag) const;
/*!
* @brief This function sends and receives data using a blocking call
@@ -741,6 +734,39 @@ public: // Member functions
void sendrecv(const type *sendbuf, int sendcount, int dest, int sendtag,
type *recvbuf, int recvcount, int source, int recvtag) const;
/*!
* @brief This function sets up an Isend call (see MPI_Send_init)
* @param buf Pointer to array buffer with length integers.
* @param length Number of integers in buf that we want to send.
* @param recv_proc Receiving processor number.
* @param tag Tag to send
* @return Returns a std::shared_ptr<MPI_Request>.
* Note this returns a shared pointer whose deleter frees the
* request, so the user does not need to free it manually
*/
template <class type>
std::shared_ptr<MPI_Request> Isend_init(const type *buf, int length, int recv_proc,
int tag) const;
/*!
* @brief This function sets up an Irecv call (see MPI_Recv_init)
* @param buf Pointer to integer array buffer with capacity of length integers.
* @param length Maximum number of values that can be stored in buf.
* @param send_proc Processor number of sender.
* @param tag Tag to match
* @return Returns a std::shared_ptr<MPI_Request>.
* Note this returns a shared pointer whose deleter frees the
* request, so the user does not need to free it manually
*/
template <class type>
std::shared_ptr<MPI_Request> Irecv_init(type *buf, int length, int send_proc, int tag) const;
/*!
* @brief Start the MPI communication
* @param request Request to start
*/
void Start( MPI_Request &request );
/*!
* Each processor sends every other processor a single value.
* @param[in] x Input value for allGather
@@ -792,7 +818,7 @@ public: // Member functions
* and the sizes and displacements will be returned (if desired).
*/
template <class type>
int allGather(const type *send_data, const int send_cnt, type *recv_data,
int allGather(const type *send_data, int send_cnt, type *recv_data,
int *recv_cnt = nullptr, int *recv_disp = nullptr,
bool known_recv = false) const;
@@ -822,7 +848,7 @@ public: // Member functions
* @param recv_data Output array of received values (nxN)
*/
template <class type>
void allToAll(const int n, const type *send_data, type *recv_data) const;
void allToAll(int n, const type *send_data, type *recv_data) const;
/*!
* Each processor sends an array of data to the different processors.
@@ -995,23 +1021,20 @@ public: // Member functions
MPI loadBalance(double localPerformance, std::vector<double> work);
private: // Private helper functions for templated MPI operations;
template <class type> void call_sumReduce(type *x, const int n = 1) const;
template <class type> void call_sumReduce(type *x, int n = 1) const;
template <class type>
void call_sumReduce(const type *x, type *y, const int n = 1) const;
void call_sumReduce(const type *x, type *y, int n = 1) const;
template <class type>
void call_minReduce(type *x, const int n = 1,
void call_minReduce(type *x, int n = 1, int *rank_of_min = nullptr) const;
template <class type>
void call_minReduce(const type *x, type *y, int n = 1,
int *rank_of_min = nullptr) const;
template <class type>
void call_minReduce(const type *x, type *y, const int n = 1,
int *rank_of_min = nullptr) const;
void call_maxReduce(type *x, int n = 1, int *rank_of_max = nullptr) const;
template <class type>
void call_maxReduce(type *x, const int n = 1,
void call_maxReduce(const type *x, type *y, int n = 1,
int *rank_of_max = nullptr) const;
template <class type>
void call_maxReduce(const type *x, type *y, const int n = 1,
int *rank_of_max = nullptr) const;
template <class type>
void call_bcast(type *x, const int n, const int root) const;
template <class type> void call_bcast(type *x, int n, int root) const;
template <class type>
void call_allGather(const type &x_in, type *x_out) const;
template <class type>
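
Reviewer note: most of the signature changes in this header drop `const` from parameters passed by value (`const int n` becomes `int n`). In C++ a top-level `const` on a by-value parameter is not part of the function type, so callers should see no difference. The sketch below illustrates that rule; `twice` and `callTwice` are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <type_traits>

int twice(const int n); // declaration with top-level const
int twice(int n);       // redeclares the same function; not an overload

// The definition may keep const so the body cannot modify its own copy.
int twice(const int n) { return 2 * n; }

// Top-level const on a by-value parameter is dropped from the function type.
static_assert(std::is_same<int(const int), int(int)>::value,
              "top-level const does not change the function type");

// const behind a pointer is not top-level, so it does change the type.
static_assert(!std::is_same<void(const int *), void(int *)>::value,
              "pointee const is part of the function type");

// Calls twice through a pointer typed without const.
inline int callTwice(int n) {
    int (*fp)(int) = &twice;
    assert(fp != nullptr);
    return fp(n);
}
```

This is why the pointer parameters in the diff (`const type *x`, `const void *buf`) keep their `const`: there it qualifies the pointed-to data and remains part of the interface.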

918
common/Membrane.cpp Normal file

@@ -0,0 +1,918 @@
/* Membrane class for lattice Boltzmann models */
#include "common/Membrane.h"
#include "analysis/distance.h"
Membrane::Membrane(std::shared_ptr <ScaLBL_Communicator> sComm, int *dvcNeighborList, int Nsites) {
Np = Nsites;
initialNeighborList = new int[18*Np];
ScaLBL_AllocateDeviceMemory((void **)&NeighborList, 18*Np*sizeof(int));
Lock=false; // unlock the communicator
//......................................................................................
// Create a separate copy of the communicator for the device
MPI_COMM_SCALBL = sComm->MPI_COMM_SCALBL.dup();
ScaLBL_CopyToHost(initialNeighborList, dvcNeighborList, 18*Np*sizeof(int));
sComm->MPI_COMM_SCALBL.barrier();
ScaLBL_CopyToDevice(NeighborList, initialNeighborList, 18*Np*sizeof(int));
/* Copy communication lists */
//......................................................................................
//Lock=false; // unlock the communicator
//......................................................................................
// Create a separate copy of the communicator for the device
//MPI_COMM_SCALBL = sComm->Comm.dup();
//......................................................................................
// Copy the domain size and communication information directly from sComm
Nx = sComm->Nx;
Ny = sComm->Ny;
Nz = sComm->Nz;
N = Nx*Ny*Nz;
//next=0;
rank=sComm->rank;
rank_x=sComm->rank_x;
rank_y=sComm->rank_y;
rank_z=sComm->rank_z;
rank_X=sComm->rank_X;
rank_Y=sComm->rank_Y;
rank_Z=sComm->rank_Z;
if (rank == 0){
printf("**** Creating membrane data structure ****** \n");
printf(" Number of active lattice sites (rank = %i): %i \n",rank, Np);
}
sendCount_x=sComm->sendCount_x;
sendCount_y=sComm->sendCount_y;
sendCount_z=sComm->sendCount_z;
sendCount_X=sComm->sendCount_X;
sendCount_Y=sComm->sendCount_Y;
sendCount_Z=sComm->sendCount_Z;
recvCount_x=sComm->recvCount_x;
recvCount_y=sComm->recvCount_y;
recvCount_z=sComm->recvCount_z;
recvCount_X=sComm->recvCount_X;
recvCount_Y=sComm->recvCount_Y;
recvCount_Z=sComm->recvCount_Z;
ScaLBL_AllocateZeroCopy((void **) &dvcSendList_x, recvCount_x*sizeof(int));
ScaLBL_AllocateZeroCopy((void **) &dvcSendList_y, recvCount_y*sizeof(int));
ScaLBL_AllocateZeroCopy((void **) &dvcSendList_z, recvCount_z*sizeof(int));
ScaLBL_AllocateZeroCopy((void **) &dvcSendList_X, recvCount_X*sizeof(int));
ScaLBL_AllocateZeroCopy((void **) &dvcSendList_Y, recvCount_Y*sizeof(int));
ScaLBL_AllocateZeroCopy((void **) &dvcSendList_Z, recvCount_Z*sizeof(int));
ScaLBL_AllocateZeroCopy((void **) &dvcRecvLinks_x, recvCount_x*sizeof(int));
ScaLBL_AllocateZeroCopy((void **) &dvcRecvLinks_y, recvCount_y*sizeof(int));
ScaLBL_AllocateZeroCopy((void **) &dvcRecvLinks_z, recvCount_z*sizeof(int));
ScaLBL_AllocateZeroCopy((void **) &dvcRecvLinks_X, recvCount_X*sizeof(int));
ScaLBL_AllocateZeroCopy((void **) &dvcRecvLinks_Y, recvCount_Y*sizeof(int));
ScaLBL_AllocateZeroCopy((void **) &dvcRecvLinks_Z, recvCount_Z*sizeof(int));
ScaLBL_AllocateZeroCopy((void **) &dvcRecvDist_x, recvCount_x*sizeof(int));
ScaLBL_AllocateZeroCopy((void **) &dvcRecvDist_y, recvCount_y*sizeof(int));
ScaLBL_AllocateZeroCopy((void **) &dvcRecvDist_z, recvCount_z*sizeof(int));
ScaLBL_AllocateZeroCopy((void **) &dvcRecvDist_X, recvCount_X*sizeof(int));
ScaLBL_AllocateZeroCopy((void **) &dvcRecvDist_Y, recvCount_Y*sizeof(int));
ScaLBL_AllocateZeroCopy((void **) &dvcRecvDist_Z, recvCount_Z*sizeof(int));
ScaLBL_AllocateZeroCopy((void **) &sendbuf_x, sendCount_x*sizeof(double));
ScaLBL_AllocateZeroCopy((void **) &sendbuf_y, sendCount_y*sizeof(double));
ScaLBL_AllocateZeroCopy((void **) &sendbuf_z, sendCount_z*sizeof(double));
ScaLBL_AllocateZeroCopy((void **) &sendbuf_X, sendCount_X*sizeof(double));
ScaLBL_AllocateZeroCopy((void **) &sendbuf_Y, sendCount_Y*sizeof(double));
ScaLBL_AllocateZeroCopy((void **) &sendbuf_Z, sendCount_Z*sizeof(double));
ScaLBL_AllocateZeroCopy((void **) &recvbuf_x, recvCount_x*sizeof(double));
ScaLBL_AllocateZeroCopy((void **) &recvbuf_y, recvCount_y*sizeof(double));
ScaLBL_AllocateZeroCopy((void **) &recvbuf_z, recvCount_z*sizeof(double));
ScaLBL_AllocateZeroCopy((void **) &recvbuf_X, recvCount_X*sizeof(double));
ScaLBL_AllocateZeroCopy((void **) &recvbuf_Y, recvCount_Y*sizeof(double));
ScaLBL_AllocateZeroCopy((void **) &recvbuf_Z, recvCount_Z*sizeof(double));
sendCount_x=sComm->copySendList("x", dvcSendList_x);
sendCount_y=sComm->copySendList("y", dvcSendList_y);
sendCount_z=sComm->copySendList("z", dvcSendList_z);
sendCount_X=sComm->copySendList("X", dvcSendList_X);
sendCount_Y=sComm->copySendList("Y", dvcSendList_Y);
sendCount_Z=sComm->copySendList("Z", dvcSendList_Z);
recvCount_x=sComm->copyRecvList("x", dvcRecvDist_x);
recvCount_y=sComm->copyRecvList("y", dvcRecvDist_y);
recvCount_z=sComm->copyRecvList("z", dvcRecvDist_z);
recvCount_X=sComm->copyRecvList("X", dvcRecvDist_X);
recvCount_Y=sComm->copyRecvList("Y", dvcRecvDist_Y);
recvCount_Z=sComm->copyRecvList("Z", dvcRecvDist_Z);
}
Membrane::~Membrane() {
delete [] initialNeighborList;
delete [] membraneLinks;
delete [] membraneTag;
delete [] membraneDist;
ScaLBL_FreeDeviceMemory( coefficient_x );
ScaLBL_FreeDeviceMemory( coefficient_X );
ScaLBL_FreeDeviceMemory( coefficient_y );
ScaLBL_FreeDeviceMemory( coefficient_Y );
ScaLBL_FreeDeviceMemory( coefficient_z );
ScaLBL_FreeDeviceMemory( coefficient_Z );
ScaLBL_FreeDeviceMemory( NeighborList );
ScaLBL_FreeDeviceMemory( MembraneLinks );
ScaLBL_FreeDeviceMemory( MembraneCoef );
ScaLBL_FreeDeviceMemory( MembraneDistance );
ScaLBL_FreeDeviceMemory( sendbuf_x );
ScaLBL_FreeDeviceMemory( sendbuf_X );
ScaLBL_FreeDeviceMemory( sendbuf_y );
ScaLBL_FreeDeviceMemory( sendbuf_Y );
ScaLBL_FreeDeviceMemory( sendbuf_z );
ScaLBL_FreeDeviceMemory( sendbuf_Z );
/* ScaLBL_FreeDeviceMemory( sendbuf_xy );
ScaLBL_FreeDeviceMemory( sendbuf_xY );
ScaLBL_FreeDeviceMemory( sendbuf_Xy );
ScaLBL_FreeDeviceMemory( sendbuf_XY );
ScaLBL_FreeDeviceMemory( sendbuf_xz );
ScaLBL_FreeDeviceMemory( sendbuf_xZ );
ScaLBL_FreeDeviceMemory( sendbuf_Xz );
ScaLBL_FreeDeviceMemory( sendbuf_XZ );
ScaLBL_FreeDeviceMemory( sendbuf_yz );
ScaLBL_FreeDeviceMemory( sendbuf_yZ );
ScaLBL_FreeDeviceMemory( sendbuf_Yz );
ScaLBL_FreeDeviceMemory( sendbuf_YZ );
*/
ScaLBL_FreeDeviceMemory( recvbuf_x );
ScaLBL_FreeDeviceMemory( recvbuf_X );
ScaLBL_FreeDeviceMemory( recvbuf_y );
ScaLBL_FreeDeviceMemory( recvbuf_Y );
ScaLBL_FreeDeviceMemory( recvbuf_z );
ScaLBL_FreeDeviceMemory( recvbuf_Z );
/*
ScaLBL_FreeDeviceMemory( recvbuf_xy );
ScaLBL_FreeDeviceMemory( recvbuf_xY );
ScaLBL_FreeDeviceMemory( recvbuf_Xy );
ScaLBL_FreeDeviceMemory( recvbuf_XY );
ScaLBL_FreeDeviceMemory( recvbuf_xz );
ScaLBL_FreeDeviceMemory( recvbuf_xZ );
ScaLBL_FreeDeviceMemory( recvbuf_Xz );
ScaLBL_FreeDeviceMemory( recvbuf_XZ );
ScaLBL_FreeDeviceMemory( recvbuf_yz );
ScaLBL_FreeDeviceMemory( recvbuf_yZ );
ScaLBL_FreeDeviceMemory( recvbuf_Yz );
ScaLBL_FreeDeviceMemory( recvbuf_YZ );
*/
ScaLBL_FreeDeviceMemory( dvcSendList_x );
ScaLBL_FreeDeviceMemory( dvcSendList_X );
ScaLBL_FreeDeviceMemory( dvcSendList_y );
ScaLBL_FreeDeviceMemory( dvcSendList_Y );
ScaLBL_FreeDeviceMemory( dvcSendList_z );
ScaLBL_FreeDeviceMemory( dvcSendList_Z );
/*
ScaLBL_FreeDeviceMemory( dvcSendList_xy );
ScaLBL_FreeDeviceMemory( dvcSendList_xY );
ScaLBL_FreeDeviceMemory( dvcSendList_Xy );
ScaLBL_FreeDeviceMemory( dvcSendList_XY );
ScaLBL_FreeDeviceMemory( dvcSendList_xz );
ScaLBL_FreeDeviceMemory( dvcSendList_xZ );
ScaLBL_FreeDeviceMemory( dvcSendList_Xz );
ScaLBL_FreeDeviceMemory( dvcSendList_XZ );
ScaLBL_FreeDeviceMemory( dvcSendList_yz );
ScaLBL_FreeDeviceMemory( dvcSendList_yZ );
ScaLBL_FreeDeviceMemory( dvcSendList_Yz );
ScaLBL_FreeDeviceMemory( dvcSendList_YZ );
ScaLBL_FreeDeviceMemory( dvcRecvList_x );
ScaLBL_FreeDeviceMemory( dvcRecvList_X );
ScaLBL_FreeDeviceMemory( dvcRecvList_y );
ScaLBL_FreeDeviceMemory( dvcRecvList_Y );
ScaLBL_FreeDeviceMemory( dvcRecvList_z );
ScaLBL_FreeDeviceMemory( dvcRecvList_Z );
ScaLBL_FreeDeviceMemory( dvcRecvList_xy );
ScaLBL_FreeDeviceMemory( dvcRecvList_xY );
ScaLBL_FreeDeviceMemory( dvcRecvList_Xy );
ScaLBL_FreeDeviceMemory( dvcRecvList_XY );
ScaLBL_FreeDeviceMemory( dvcRecvList_xz );
ScaLBL_FreeDeviceMemory( dvcRecvList_xZ );
ScaLBL_FreeDeviceMemory( dvcRecvList_Xz );
ScaLBL_FreeDeviceMemory( dvcRecvList_XZ );
ScaLBL_FreeDeviceMemory( dvcRecvList_yz );
ScaLBL_FreeDeviceMemory( dvcRecvList_yZ );
ScaLBL_FreeDeviceMemory( dvcRecvList_Yz );
ScaLBL_FreeDeviceMemory( dvcRecvList_YZ );
*/
ScaLBL_FreeDeviceMemory( dvcRecvLinks_x );
ScaLBL_FreeDeviceMemory( dvcRecvLinks_X );
ScaLBL_FreeDeviceMemory( dvcRecvLinks_y );
ScaLBL_FreeDeviceMemory( dvcRecvLinks_Y );
ScaLBL_FreeDeviceMemory( dvcRecvLinks_z );
ScaLBL_FreeDeviceMemory( dvcRecvLinks_Z );
/*
ScaLBL_FreeDeviceMemory( dvcRecvLinks_xy );
ScaLBL_FreeDeviceMemory( dvcRecvLinks_xY );
ScaLBL_FreeDeviceMemory( dvcRecvLinks_Xy );
ScaLBL_FreeDeviceMemory( dvcRecvLinks_XY );
ScaLBL_FreeDeviceMemory( dvcRecvLinks_xz );
ScaLBL_FreeDeviceMemory( dvcRecvLinks_xZ );
ScaLBL_FreeDeviceMemory( dvcRecvLinks_Xz );
ScaLBL_FreeDeviceMemory( dvcRecvLinks_XZ );
ScaLBL_FreeDeviceMemory( dvcRecvLinks_yz );
ScaLBL_FreeDeviceMemory( dvcRecvLinks_yZ );
ScaLBL_FreeDeviceMemory( dvcRecvLinks_Yz );
ScaLBL_FreeDeviceMemory( dvcRecvLinks_YZ );
*/
ScaLBL_FreeDeviceMemory( dvcRecvDist_x );
ScaLBL_FreeDeviceMemory( dvcRecvDist_X );
ScaLBL_FreeDeviceMemory( dvcRecvDist_y );
ScaLBL_FreeDeviceMemory( dvcRecvDist_Y );
ScaLBL_FreeDeviceMemory( dvcRecvDist_z );
ScaLBL_FreeDeviceMemory( dvcRecvDist_Z );
/*
ScaLBL_FreeDeviceMemory( dvcRecvDist_xy );
ScaLBL_FreeDeviceMemory( dvcRecvDist_xY );
ScaLBL_FreeDeviceMemory( dvcRecvDist_Xy );
ScaLBL_FreeDeviceMemory( dvcRecvDist_XY );
ScaLBL_FreeDeviceMemory( dvcRecvDist_xz );
ScaLBL_FreeDeviceMemory( dvcRecvDist_xZ );
ScaLBL_FreeDeviceMemory( dvcRecvDist_Xz );
ScaLBL_FreeDeviceMemory( dvcRecvDist_XZ );
ScaLBL_FreeDeviceMemory( dvcRecvDist_yz );
ScaLBL_FreeDeviceMemory( dvcRecvDist_yZ );
ScaLBL_FreeDeviceMemory( dvcRecvDist_Yz );
ScaLBL_FreeDeviceMemory( dvcRecvDist_YZ );
*/
}
int Membrane::Create(DoubleArray &Distance, IntArray &Map){
int mlink = 0;
int i,j,k;
int idx, neighbor;
double dist, locdist;
if (rank == 0) printf(" Copy initial neighborlist... \n");
int * neighborList = new int[18*Np];
/* Copy neighborList */
for (int idx=0; idx<Np; idx++){
for (int q = 0; q<18; q++){
neighborList[q*Np+idx] = initialNeighborList[q*Np+idx];
}
}
int Q = 7; // for D3Q7 model
/* go through the neighborlist structure */
/* count & cut the links */
if (rank == 0) printf(" Cut membrane links... \n");
for (k=1;k<Nz-1;k++){
for (j=1;j<Ny-1;j++){
for (i=1;i<Nx-1;i++){
idx=Map(i,j,k);
locdist=Distance(i,j,k);
if (!(idx<0)){
neighbor=Map(i-1,j,k);
dist=Distance(i-1,j,k);
if (dist*locdist < 0.0 && !(neighbor<0)){
neighborList[idx]=idx + 2*Np;
}
neighbor=Map(i+1,j,k);
dist=Distance(i+1,j,k);
if (dist*locdist < 0.0 && !(neighbor<0)){
neighborList[Np+idx] = idx + 1*Np;
mlink++;
}
neighbor=Map(i,j-1,k);
dist=Distance(i,j-1,k);
if (dist*locdist < 0.0 && !(neighbor<0)){
neighborList[2*Np+idx]=idx + 4*Np;
}
neighbor=Map(i,j+1,k);
dist=Distance(i,j+1,k);
if (dist*locdist < 0.0 && !(neighbor<0)){
neighborList[3*Np+idx]=idx + 3*Np;
mlink++;
}
neighbor=Map(i,j,k-1);
dist=Distance(i,j,k-1);
if (dist*locdist < 0.0 && !(neighbor<0)){
neighborList[4*Np+idx]=idx + 6*Np;
}
neighbor=Map(i,j,k+1);
dist=Distance(i,j,k+1);
if (dist*locdist < 0.0 && !(neighbor<0)){
neighborList[5*Np+idx]=idx + 5*Np;
mlink++;
}
if (Q > 7){
neighbor=Map(i-1,j-1,k);
dist=Distance(i-1,j-1,k);
if (dist*locdist < 0.0 && !(neighbor<0)){
neighborList[6*Np+idx]=idx + 8*Np;
}
neighbor=Map(i+1,j+1,k);
dist=Distance(i+1,j+1,k);
if (dist*locdist < 0.0 && !(neighbor<0)){
neighborList[7*Np+idx]=idx + 7*Np;
mlink++;
}
neighbor=Map(i-1,j+1,k);
dist=Distance(i-1,j+1,k);
if (dist*locdist < 0.0 && !(neighbor<0)){
neighborList[8*Np+idx]=idx + 10*Np;
}
neighbor=Map(i+1,j-1,k);
dist=Distance(i+1,j-1,k);
if (dist*locdist < 0.0 && !(neighbor<0)){
neighborList[9*Np+idx]=idx + 9*Np;
mlink++;
}
neighbor=Map(i-1,j,k-1);
dist=Distance(i-1,j,k-1);
if (dist*locdist < 0.0 && !(neighbor<0)){
neighborList[10*Np+idx]=idx + 12*Np;
}
neighbor=Map(i+1,j,k+1);
dist=Distance(i+1,j,k+1);
if (dist*locdist < 0.0 && !(neighbor<0)){
neighborList[11*Np+idx]=idx + 11*Np;
mlink++;
}
neighbor=Map(i-1,j,k+1);
dist=Distance(i-1,j,k+1);
if (dist*locdist < 0.0 && !(neighbor<0)){
neighborList[12*Np+idx]=idx + 14*Np;
}
neighbor=Map(i+1,j,k-1);
dist=Distance(i+1,j,k-1);
if (dist*locdist < 0.0 && !(neighbor<0)){
neighborList[13*Np+idx]=idx + 13*Np;
mlink++;
}
neighbor=Map(i,j-1,k-1);
dist=Distance(i,j-1,k-1);
if (dist*locdist < 0.0 && !(neighbor<0)){
neighborList[14*Np+idx]=idx + 16*Np;
}
neighbor=Map(i,j+1,k+1);
dist=Distance(i,j+1,k+1);
if (dist*locdist < 0.0 && !(neighbor<0)){
neighborList[15*Np+idx]=idx + 15*Np;
mlink++;
}
neighbor=Map(i,j-1,k+1);
dist=Distance(i,j-1,k+1);
if (dist*locdist < 0.0 && !(neighbor<0)){
neighborList[16*Np+idx]=idx + 18*Np;
}
neighbor=Map(i,j+1,k-1);
dist=Distance(i,j+1,k-1);
if (dist*locdist < 0.0 && !(neighbor<0)){
neighborList[17*Np+idx]=idx + 17*Np;
mlink++;
}
}
}
}
}
}
/* allocate memory */
membraneTag = new int [mlink];
membraneLinks = new int [2*mlink];
membraneDist = new double [2*mlink];
membraneLinkCount = mlink;
if (rank == 0) printf(" (cut %i links crossing membrane) \n",mlink);
/* construct the membrane*/
/* *
* Sites inside the membrane (negative distance) -- store at 2*mlink
* Sites outside the membrane (positive distance) -- store at 2*mlink+1
*/
if (rank == 0) printf(" Construct membrane data structures... \n");
mlink = 0;
int localSite = 0; int neighborSite = 0;
for (k=1;k<Nz-1;k++){
for (j=1;j<Ny-1;j++){
for (i=1;i<Nx-1;i++){
idx=Map(i,j,k);
locdist=Distance(i,j,k);
if (!(idx<0)){
neighbor=Map(i+1,j,k);
dist=Distance(i+1,j,k);
if (dist*locdist < 0.0 && !(neighbor<0)){
if (locdist < 0.0 ){
localSite = 2*mlink;
neighborSite = 2*mlink+1;
}
else{
localSite = 2*mlink+1;
neighborSite = 2*mlink;
}
membraneLinks[localSite] = idx + 1*Np;
membraneLinks[neighborSite] = neighbor + 2*Np;
membraneDist[localSite] = locdist;
membraneDist[neighborSite] = dist;
mlink++;
}
neighbor=Map(i,j+1,k);
dist=Distance(i,j+1,k);
if (dist*locdist < 0.0 && !(neighbor<0)){
if (locdist < 0.0){
localSite = 2*mlink;
neighborSite = 2*mlink+1;
}
else{
localSite = 2*mlink+1;
neighborSite = 2*mlink;
}
membraneLinks[localSite] = idx + 3*Np;
membraneLinks[neighborSite] = neighbor + 4*Np;
membraneDist[localSite] = locdist;
membraneDist[neighborSite] = dist;
mlink++;
}
neighbor=Map(i,j,k+1);
dist=Distance(i,j,k+1);
if (dist*locdist < 0.0 && !(neighbor<0)){
if (locdist < 0.0){
localSite = 2*mlink;
neighborSite = 2*mlink+1;
}
else{
localSite = 2*mlink+1;
neighborSite = 2*mlink;
}
membraneLinks[localSite] = idx + 5*Np;
membraneLinks[neighborSite] = neighbor + 6*Np;
membraneDist[localSite] = locdist;
membraneDist[neighborSite] = dist;
mlink++;
}
if (Q > 7){
neighbor=Map(i+1,j+1,k);
dist=Distance(i+1,j+1,k);
if (dist*locdist < 0.0 && !(neighbor<0)){
if (locdist < 0.0){
localSite = 2*mlink;
neighborSite = 2*mlink+1;
}
else{
localSite = 2*mlink+1;
neighborSite = 2*mlink;
}
membraneLinks[localSite] = idx + 7*Np;
membraneLinks[neighborSite] = neighbor+8*Np;
membraneDist[localSite] = locdist;
membraneDist[neighborSite] = dist;
mlink++;
}
neighbor=Map(i+1,j-1,k);
dist=Distance(i+1,j-1,k);
if (dist*locdist < 0.0 && !(neighbor<0)){
if (locdist < 0.0){
localSite = 2*mlink;
neighborSite = 2*mlink+1;
}
else{
localSite = 2*mlink+1;
neighborSite = 2*mlink;
}
membraneLinks[localSite] = idx + 9*Np;
membraneLinks[neighborSite] = neighbor + 10*Np;
membraneDist[localSite] = locdist;
membraneDist[neighborSite] = dist;
mlink++;
}
neighbor=Map(i+1,j,k+1);
dist=Distance(i+1,j,k+1);
if (dist*locdist < 0.0 && !(neighbor<0)){
if (locdist < 0.0){
localSite = 2*mlink;
neighborSite = 2*mlink+1;
}
else{
localSite = 2*mlink+1;
neighborSite = 2*mlink;
}
membraneLinks[localSite] = idx + 11*Np;
membraneLinks[neighborSite] = neighbor + 12*Np;
membraneDist[localSite] = locdist;
membraneDist[neighborSite] = dist;
mlink++;
}
neighbor=Map(i+1,j,k-1);
dist=Distance(i+1,j,k-1);
if (dist*locdist < 0.0 && !(neighbor<0)){
if (locdist < 0.0){
localSite = 2*mlink;
neighborSite = 2*mlink+1;
}
else{
localSite = 2*mlink+1;
neighborSite = 2*mlink;
}
membraneLinks[localSite] = idx + 13*Np;
membraneLinks[neighborSite] = neighbor + 14*Np;
membraneDist[localSite] = locdist;
membraneDist[neighborSite] = dist;
mlink++;
}
neighbor=Map(i,j+1,k+1);
dist=Distance(i,j+1,k+1);
if (dist*locdist < 0.0 && !(neighbor<0)){
if (locdist < 0.0){
localSite = 2*mlink;
neighborSite = 2*mlink+1;
}
else{
localSite = 2*mlink+1;
neighborSite = 2*mlink;
}
membraneLinks[localSite] = idx + 15*Np;
membraneLinks[neighborSite] = neighbor + 16*Np;
membraneDist[localSite] = locdist;
membraneDist[neighborSite] = dist;
mlink++;
}
neighbor=Map(i,j+1,k-1);
dist=Distance(i,j+1,k-1);
if (dist*locdist < 0.0 && !(neighbor<0)){
if (locdist < 0.0){
localSite = 2*mlink;
neighborSite = 2*mlink+1;
}
else{
localSite = 2*mlink+1;
neighborSite = 2*mlink;
}
membraneLinks[localSite] = idx + 17*Np;
membraneLinks[neighborSite] = neighbor + 18*Np;
membraneDist[localSite] = locdist;
membraneDist[neighborSite] = dist;
mlink++;
}
}
}
}
}
}
if (rank == 0) printf(" Create device data structures... \n");
/* Create device copies of data structures */
ScaLBL_AllocateDeviceMemory((void **)&MembraneLinks, 2*mlink*sizeof(int));
ScaLBL_AllocateDeviceMemory((void **)&MembraneCoef, 2*mlink*sizeof(double));
//ScaLBL_AllocateDeviceMemory((void **)&MembraneDistance, 2*mlink*sizeof(double));
ScaLBL_AllocateDeviceMemory((void **)&MembraneDistance, Nx*Ny*Nz*sizeof(double));
ScaLBL_CopyToDevice(NeighborList, neighborList, 18*Np*sizeof(int));
ScaLBL_CopyToDevice(MembraneLinks, membraneLinks, 2*mlink*sizeof(int));
//ScaLBL_CopyToDevice(MembraneDistance, membraneDist, 2*mlink*sizeof(double));
ScaLBL_CopyToDevice(MembraneDistance, Distance.data(), Nx*Ny*Nz*sizeof(double));
int *dvcTmpMap;
ScaLBL_AllocateDeviceMemory((void **)&dvcTmpMap, sizeof(int) * Np);
int *TmpMap;
TmpMap = new int[Np];
for (int k = 1; k < Nz - 1; k++) {
for (int j = 1; j < Ny - 1; j++) {
for (int i = 1; i < Nx - 1; i++) {
int idx = Map(i, j, k);
if (!(idx < 0))
TmpMap[idx] = k * Nx * Ny + j * Nx + i;
}
}
}
ScaLBL_CopyToDevice(dvcTmpMap, TmpMap, sizeof(int) * Np);
//int Membrane::D3Q7_MapRecv(int Cqx, int Cqy, int Cqz, int *d3q19_recvlist,
// int count, int *membraneRecvLabels, DoubleArray &Distance, int *dvcMap){
if (rank == 0) printf(" Construct communication data structures... \n");
/* Re-organize communication based on membrane structure*/
//...dvcMap receive list for the X face: q=2,8,10,12,14 .................................
linkCount_X[0] = D3Q7_MapRecv(-1,0,0, dvcRecvDist_X,recvCount_X,dvcRecvLinks_X,Distance,dvcTmpMap);
//...................................................................................
//...dvcMap receive list for the x face: q=1,7,9,11,13..................................
linkCount_x[0] = D3Q7_MapRecv(1,0,0, dvcRecvDist_x,recvCount_x,dvcRecvLinks_x,Distance,dvcTmpMap);
//...................................................................................
//...dvcMap receive list for the Y face: q=4,8,9,16,18 ...................................
linkCount_Y[0] = D3Q7_MapRecv(0,-1,0, dvcRecvDist_Y,recvCount_Y,dvcRecvLinks_Y,Distance,dvcTmpMap);
//...................................................................................
//...dvcMap receive list for the y face: q=3,7,10,15,17 ..................................
linkCount_y[0] = D3Q7_MapRecv(0,1,0, dvcRecvDist_y,recvCount_y,dvcRecvLinks_y,Distance,dvcTmpMap);
//...................................................................................
//...dvcMap receive list for the Z face: q=6,12,13,16,17 ..............................................
linkCount_Z[0] = D3Q7_MapRecv(0,0,-1, dvcRecvDist_Z,recvCount_Z,dvcRecvLinks_Z,Distance,dvcTmpMap);
//...dvcMap receive list for the z face: q=5,11,14,15,18 ..............................................
linkCount_z[0] = D3Q7_MapRecv(0,0,1, dvcRecvDist_z,recvCount_z,dvcRecvLinks_z,Distance,dvcTmpMap);
//..................................................................................
//......................................................................................
MPI_COMM_SCALBL.barrier();
ScaLBL_DeviceBarrier();
//.......................................................................
SendCount = sendCount_x+sendCount_X+sendCount_y+sendCount_Y+sendCount_z+sendCount_Z;
RecvCount = recvCount_x+recvCount_X+recvCount_y+recvCount_Y+recvCount_z+recvCount_Z;
CommunicationCount = SendCount+RecvCount;
//......................................................................................
//......................................................................................
// Allocate membrane coefficient buffers (for d3q7 recv)
ScaLBL_AllocateZeroCopy((void **) &coefficient_x, 2*(recvCount_x )*sizeof(double));
ScaLBL_AllocateZeroCopy((void **) &coefficient_X, 2*(recvCount_X)*sizeof(double));
ScaLBL_AllocateZeroCopy((void **) &coefficient_y, 2*(recvCount_y)*sizeof(double));
ScaLBL_AllocateZeroCopy((void **) &coefficient_Y, 2*(recvCount_Y)*sizeof(double));
ScaLBL_AllocateZeroCopy((void **) &coefficient_z, 2*(recvCount_z)*sizeof(double));
ScaLBL_AllocateZeroCopy((void **) &coefficient_Z, 2*(recvCount_Z)*sizeof(double));
//......................................................................................
ScaLBL_FreeDeviceMemory (dvcTmpMap);
delete [] neighborList;
delete [] TmpMap;
return mlink;
}
void Membrane::Write(string filename){
int mlink = membraneLinkCount;
std::ofstream ofs (filename, std::ofstream::out);
/* Create local copies of membrane data structures */
double *tmpMembraneCoef; // mass transport coefficient for the membrane
tmpMembraneCoef = new double [2*mlink]; // element count, not byte count
ScaLBL_CopyToHost(tmpMembraneCoef, MembraneCoef, 2*mlink*sizeof(double));
int i,j,k;
for (int m=0; m<mlink; m++){
double a1 = tmpMembraneCoef[2*m];
double a2 = tmpMembraneCoef[2*m+1];
int m1 = membraneLinks[2*m]%Np;
int m2 = membraneLinks[2*m+1]%Np;
// map index to global i,j,k
k = m1/(Nx*Ny); j = (m1-Nx*Ny*k)/Nx; i = m1-Nx*Ny*k-Nx*j;
ofs << i << " " << j << " " << k << " "<< a1 << " "; // separate a1 from the next index
k = m2/(Nx*Ny); j = (m2-Nx*Ny*k)/Nx; i = m2-Nx*Ny*k-Nx*j;
ofs << i << " " << j << " " << k << " "<< a2 << endl;
}
ofs.close();
/*FILE *VELX_FILE;
sprintf(LocalRankFilename, "Velocity_X.%05i.raw", rank);
VELX_FILE = fopen(LocalRankFilename, "wb");
fwrite(PhaseField.data(), 8, N, VELX_FILE);
fclose(VELX_FILE);
*/
delete [] tmpMembraneCoef;
}
void Membrane::Read(string filename){
int mlink = membraneLinkCount;
/* Create local copies of membrane data structures */
double *tmpMembraneCoef; // mass transport coefficient for the membrane
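tmpMembraneCoef = new double [2*mlink](); // element count, zero-initialized
FILE *fid = fopen(filename.c_str(), "r");
INSIST(fid != NULL, "Error opening membrane file \n");
//........read the membrane coefficients..................
// We will read until a blank line or end-of-file is reached
// Each line is expected to be comma-separated: i,j,k,a1,ii,jj,kk,a2,
int count = 0;
int i,j,k;
int ii,jj,kk;
double a1, a2;
while (fscanf(fid, "%i,%i,%i,%lf,%i,%i,%i,%lf,\n", &i, &j, &k, &a1, &ii, &jj, &kk, &a2) == 8){
printf("%i, %i, %i, %lf \n", i,j,k, a2);
// store the coefficients (assumes file lines are in the same order as the links)
if (count < mlink){
tmpMembraneCoef[2*count] = a1;
tmpMembraneCoef[2*count+1] = a2;
}
count++;
}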
if (count != mlink){
printf("WARNING (Membrane::Read): number of file lines does not match number of links \n");
}
fclose(fid);
ScaLBL_CopyToDevice(MembraneCoef, tmpMembraneCoef, 2*mlink*sizeof(double));
/*FILE *VELX_FILE;
sprintf(LocalRankFilename, "Velocity_X.%05i.raw", rank);
VELX_FILE = fopen(LocalRankFilename, "wb");
fwrite(PhaseField.data(), 8, N, VELX_FILE);
fclose(VELX_FILE);
*/
delete [] tmpMembraneCoef;
}
int Membrane::D3Q7_MapRecv(int Cqx, int Cqy, int Cqz, int *d3q19_recvlist,
int count, int *membraneRecvLabels, DoubleArray &Distance, int *dvcMap){
int i,j,k,n,nn,idx;
double distanceNonLocal,distanceLocal;
int * ReturnLabels;
ReturnLabels=new int [count];
int * list;
list=new int [count];
ScaLBL_CopyToHost(list, d3q19_recvlist, count*sizeof(int));
int *TmpMap;
TmpMap=new int [Np];
ScaLBL_CopyToHost(TmpMap, dvcMap, Np*sizeof(int));
int countMembraneLinks=0;
for (idx=0; idx<count; idx++){
//printf(" Read 1 \n");
// Get the value from the list -- note that the index is from the send (non-local) process
nn = list[idx]; // if (rank == 0) printf("@ rank:%d n=%d\n",rank,n);
//printf(" Read 2 \n");
n = TmpMap[nn];
//printf(" idx= %i(%i), nn=%i, n= %i \n",idx,count,nn,n);
// Get the 3-D indices from the send process
k = n/(Nx*Ny); j = (n-Nx*Ny*k)/Nx; i = n-Nx*Ny*k-Nx*j;
// if (rank ==0) printf("@ Get 3D indices from the send process: i=%d, j=%d, k=%d\n",i,j,k);
distanceLocal = Distance(i,j,k); // this site should be in the halo
//printf(" Local value %i, %i, %i \n",i,j,k);
// Streaming for the non-local distribution
i -= Cqx; j -= Cqy; k -= Cqz;
distanceNonLocal = Distance(i,j,k);
//printf(" Nonlocal value %i, %i, %i \n",i,j,k);
ReturnLabels[idx] = 0;
if (distanceLocal*distanceNonLocal < 0.0){
if (distanceLocal > 0.0) ReturnLabels[idx] = 1;
else ReturnLabels[idx] = 2;
countMembraneLinks++;
}
}
// Return updated version to the device
ScaLBL_CopyToDevice(membraneRecvLabels, ReturnLabels, count*sizeof(int));
// clean up the work arrays
delete [] ReturnLabels;
delete [] TmpMap;
delete [] list;
return countMembraneLinks;
}
void Membrane::SendD3Q7AA(double *dist){
if (Lock==true){
ERROR("Membrane Error (SendD3Q7AA): Membrane communicator is locked -- did you forget to match Send/Recv calls?");
}
else{
Lock=true;
}
// assign tag of 37 to D3Q7 communication
sendtag = recvtag = 37;
ScaLBL_DeviceBarrier();
// Pack the distributions
//...Packing for x face(q=2)................................
ScaLBL_D3Q19_Pack(2,dvcSendList_x,0,sendCount_x,sendbuf_x,dist,Np);
req2[0] = MPI_COMM_SCALBL.Irecv(recvbuf_X, recvCount_X,rank_X,recvtag);
req1[0] = MPI_COMM_SCALBL.Isend(sendbuf_x, sendCount_x,rank_x,sendtag);
//...Packing for X face(q=1)................................
ScaLBL_D3Q19_Pack(1,dvcSendList_X,0,sendCount_X,sendbuf_X,dist,Np);
req2[1] = MPI_COMM_SCALBL.Irecv(recvbuf_x, recvCount_x,rank_x,recvtag);
req1[1] = MPI_COMM_SCALBL.Isend(sendbuf_X, sendCount_X,rank_X,sendtag);
//for (int idx=0; idx<sendCount_X; idx++) printf(" SendX(%i)=%e \n",idx,sendbuf_X[idx]);
//...Packing for y face(q=4).................................
ScaLBL_D3Q19_Pack(4,dvcSendList_y,0,sendCount_y,sendbuf_y,dist,Np);
req2[2] = MPI_COMM_SCALBL.Irecv(recvbuf_Y, recvCount_Y,rank_Y,recvtag);
req1[2] = MPI_COMM_SCALBL.Isend(sendbuf_y, sendCount_y,rank_y,sendtag);
//...Packing for Y face(q=3).................................
ScaLBL_D3Q19_Pack(3,dvcSendList_Y,0,sendCount_Y,sendbuf_Y,dist,Np);
req2[3] = MPI_COMM_SCALBL.Irecv(recvbuf_y, recvCount_y,rank_y,recvtag);
req1[3] = MPI_COMM_SCALBL.Isend(sendbuf_Y, sendCount_Y,rank_Y,sendtag);
//for (int idx=0; idx<sendCount_Y; idx++) printf(" SendY(%i)=%e \n",idx,sendbuf_Y[idx]);
//...Packing for z face(q=6)................................
ScaLBL_D3Q19_Pack(6,dvcSendList_z,0,sendCount_z,sendbuf_z,dist,Np);
req2[4] = MPI_COMM_SCALBL.Irecv(recvbuf_Z, recvCount_Z,rank_Z,recvtag);
req1[4] = MPI_COMM_SCALBL.Isend(sendbuf_z, sendCount_z,rank_z,sendtag);
//...Packing for Z face(q=5)................................
ScaLBL_D3Q19_Pack(5,dvcSendList_Z,0,sendCount_Z,sendbuf_Z,dist,Np);
req2[5] = MPI_COMM_SCALBL.Irecv(recvbuf_z, recvCount_z,rank_z,recvtag);
req1[5] = MPI_COMM_SCALBL.Isend(sendbuf_Z, sendCount_Z,rank_Z,sendtag);
}
void Membrane::RecvD3Q7AA(double *dist){
//...................................................................................
// Wait for completion of D3Q7 communication
MPI_COMM_SCALBL.waitAll(6,req1);
MPI_COMM_SCALBL.waitAll(6,req2);
ScaLBL_DeviceBarrier();
//...................................................................................
// NOTE: AA Routine writes to opposite
// Unpack the distributions on the device
//...................................................................................
//...Unpacking for x face(q=2)................................
ScaLBL_D3Q7_Membrane_Unpack(2,dvcRecvDist_x, recvbuf_x,recvCount_x,dist,Np,coefficient_x);
//...................................................................................
//...Unpacking for X face(q=1)................................
ScaLBL_D3Q7_Membrane_Unpack(1,dvcRecvDist_X, recvbuf_X,recvCount_X,dist,Np,coefficient_X);
//...................................................................................
//...Unpacking for y face(q=4).................................
ScaLBL_D3Q7_Membrane_Unpack(4,dvcRecvDist_y, recvbuf_y,recvCount_y,dist,Np,coefficient_y);
//...................................................................................
//...Unpacking for Y face(q=3).................................
ScaLBL_D3Q7_Membrane_Unpack(3,dvcRecvDist_Y, recvbuf_Y,recvCount_Y,dist,Np,coefficient_Y);
//...................................................................................
//...Unpacking for z face(q=6)................................
ScaLBL_D3Q7_Membrane_Unpack(6,dvcRecvDist_z, recvbuf_z, recvCount_z,dist,Np,coefficient_z);
//...Unpacking for Z face(q=5)................................
ScaLBL_D3Q7_Membrane_Unpack(5,dvcRecvDist_Z, recvbuf_Z,recvCount_Z,dist,Np,coefficient_Z);
//..................................................................................
MPI_COMM_SCALBL.barrier();
//...................................................................................
Lock=false; // unlock the communicator after communications complete
//...................................................................................
}
void Membrane::IonTransport(double *dist, double *den){
ScaLBL_D3Q7_Membrane_IonTransport(MembraneLinks,MembraneCoef, dist, den, membraneLinkCount, Np);
}
// std::shared_ptr<Database> db){
void Membrane::AssignCoefficients(int *Map, double *Psi, double Threshold,
double MassFractionIn, double MassFractionOut, double ThresholdMassFractionIn,
double ThresholdMassFractionOut){
/* Assign mass transfer coefficients to the membrane data structure */
if (membraneLinkCount > 0)
ScaLBL_D3Q7_Membrane_AssignLinkCoef(MembraneLinks, Map, MembraneDistance, Psi, MembraneCoef,
Threshold, MassFractionIn, MassFractionOut, ThresholdMassFractionIn, ThresholdMassFractionOut,
membraneLinkCount, Nx, Ny, Nz, Np);
if (linkCount_X[0] < recvCount_X)
ScaLBL_D3Q7_Membrane_AssignLinkCoef_halo(-1,0,0,Map,MembraneDistance,Psi,Threshold,
MassFractionIn,MassFractionOut,ThresholdMassFractionIn,ThresholdMassFractionOut,
dvcRecvDist_X,dvcRecvLinks_X,coefficient_X,0,linkCount_X[0],recvCount_X,
Np,Nx,Ny,Nz);
if (linkCount_x[0] < recvCount_x)
ScaLBL_D3Q7_Membrane_AssignLinkCoef_halo(1,0,0,Map,MembraneDistance,Psi,Threshold,
MassFractionIn,MassFractionOut,ThresholdMassFractionIn,ThresholdMassFractionOut,
dvcRecvDist_x,dvcRecvLinks_x,coefficient_x,0,linkCount_x[0],recvCount_x,
Np,Nx,Ny,Nz);
if (linkCount_Y[0] < recvCount_Y)
ScaLBL_D3Q7_Membrane_AssignLinkCoef_halo(0,-1,0,Map,MembraneDistance,Psi,Threshold,
MassFractionIn,MassFractionOut,ThresholdMassFractionIn,ThresholdMassFractionOut,
dvcRecvDist_Y,dvcRecvLinks_Y,coefficient_Y,0,linkCount_Y[0],recvCount_Y,
Np,Nx,Ny,Nz);
if (linkCount_y[0]<recvCount_y)
ScaLBL_D3Q7_Membrane_AssignLinkCoef_halo(0,1,0,Map,MembraneDistance,Psi,Threshold,
MassFractionIn,MassFractionOut,ThresholdMassFractionIn,ThresholdMassFractionOut,
dvcRecvDist_y,dvcRecvLinks_y,coefficient_y,0,linkCount_y[0],recvCount_y,
Np,Nx,Ny,Nz);
if (linkCount_Z[0]<recvCount_Z)
ScaLBL_D3Q7_Membrane_AssignLinkCoef_halo(0,0,-1,Map,MembraneDistance,Psi,Threshold,
MassFractionIn,MassFractionOut,ThresholdMassFractionIn,ThresholdMassFractionOut,
dvcRecvDist_Z,dvcRecvLinks_Z,coefficient_Z,0,linkCount_Z[0],recvCount_Z,
Np,Nx,Ny,Nz);
if (linkCount_z[0]<recvCount_z)
ScaLBL_D3Q7_Membrane_AssignLinkCoef_halo(0,0,1,Map,MembraneDistance,Psi,Threshold,
MassFractionIn,MassFractionOut,ThresholdMassFractionIn,ThresholdMassFractionOut,
dvcRecvDist_z,dvcRecvLinks_z,coefficient_z,0,linkCount_z[0],recvCount_z,
Np,Nx,Ny,Nz);
}
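
The two idioms that recur throughout Membrane.cpp can be sketched in isolation. One is the signed-distance sign test used to decide that a link crosses the membrane. The other is the decomposition of a linear index n = k*Nx*Ny + j*Nx + i back into (i,j,k). The names Index3, decompose, crossesMembrane and countCutLinks below are hypothetical helpers written for illustration only; they are not part of LBPM, and the 1-D row is a simplification of the 3-D loops in Membrane::Create.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not in LBPM): 3-D indices recovered from a linear index.
struct Index3 { int i, j, k; };

// Invert n = k*Nx*Ny + j*Nx + i, using the same arithmetic as Membrane.cpp.
Index3 decompose(int n, int Nx, int Ny) {
    int k = n / (Nx * Ny);
    int j = (n - Nx * Ny * k) / Nx;
    int i = n - Nx * Ny * k - Nx * j;
    return {i, j, k};
}

// A link crosses the membrane when the signed distances at its two
// end sites have opposite signs, i.e. their product is negative.
bool crossesMembrane(double distLocal, double distNeighbor) {
    return distLocal * distNeighbor < 0.0;
}

// Count the +x links that are cut along a 1-D row of signed distances.
// Only the forward direction is counted, so each link is counted once.
int countCutLinks(const std::vector<double> &dist) {
    int mlink = 0;
    for (std::size_t i = 0; i + 1 < dist.size(); ++i) {
        if (crossesMembrane(dist[i], dist[i + 1])) mlink++;
    }
    return mlink;
}
```

Counting only the forward neighbor mirrors how Membrane::Create increments mlink for the +x, +y and +z directions while still redirecting both directions of each cut link in the neighbor list.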

190
common/Membrane.h Normal file

@@ -0,0 +1,190 @@
/* Membrane class for lattice Boltzmann models built on ScaLBL data structures */
#ifndef ScaLBL_Membrane_INC
#define ScaLBL_Membrane_INC
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <iostream>
#include <exception>
#include <stdexcept>
#include <fstream>
#include "common/ScaLBL.h"
/**
* \brief Unpack D3Q7 distributions after communication using links determined based on membrane location
* @param q - index for distribution based on D3Q7 discrete velocity structure
* @param d3q7_recvlist - list of distributions to communicate
* @param recvbuf - memory buffer where received values have been stored
* @param count - number of values to unpack
* @param dist - memory buffer to hold the distributions
* @param N - size of the distributions (derived from Domain structure)
* @param coef - coefficient to determine the local mass transport for each membrane link
*/
extern "C" void ScaLBL_D3Q7_Membrane_Unpack(int q,
int *d3q7_recvlist, double *recvbuf, int count,
double *dist, int N, double *coef);
/**
* \brief Set custom link rules for D3Q19 distribution based on membrane location
* @param q - index for distribution based on D3Q19 discrete velocity structure
* @param list - list of distributions to communicate
* @param links - list of active links based on the membrane location
* @param coef - coefficient to determine the local mass transport for each membrane link
* @param start - index to start parsing the list
* @param offset - offset to start reading membrane links
* @param linkCount - number of values to unpack
* @param recvbuf - memory buffer where received values have been stored
* @param dist - memory buffer to hold the distributions
* @param N - size of the distributions (derived from Domain structure)
*/
extern "C" void Membrane_D3Q19_Transport(int q, int *list, int *links, double *coef, int start, int offset,
int linkCount, double *recvbuf, double *dist, int N);
/**
* \class Membrane
* @brief
* The Membrane class operates on ScaLBL data structures to insert membrane
*
*/
class Membrane {
public:
int Np;
int Nx,Ny,Nz,N;
int membraneLinkCount;
int *initialNeighborList; // original neighborlist
int *NeighborList; // modified neighborlist
/* host data structures */
int *membraneLinks; // D3Q7 links that cross membrane
int *membraneTag; // label each link in the membrane
double *membraneDist; // distance to membrane for each linked site
double *membraneOrientation; // orientation of membrane for each linked site
/*
* Device data structures
*/
int *MembraneLinks;
double *MembraneCoef; // mass transport coefficient for the membrane
double *MembraneDistance;
double *MembraneOrientation;
/**
* \brief Create a membrane to operate on the LB model
* @param sComm - ScaLBL communicator (originating data structures)
* @param dvcNeighborList - list of neighbors for each site
* @param Nsites - number of sites
*/
//Membrane(std::shared_ptr <Domain> Dm, int *initialNeighborList, int Nsites);
Membrane(std::shared_ptr <ScaLBL_Communicator> sComm, int *dvcNeighborList, int Nsites);
/**
* \brief Destructor
*/
~Membrane();
/**
* \brief Create membrane
* \details Create membrane structure from signed distance function
* @param Distance - signed distance to membrane
* @param Map - mapping between regular layout and compact layout
*/
int Create(DoubleArray &Distance, IntArray &Map);
/**
* \brief Write membrane data to output file
* @param filename - name of file to save
*/
void Write(string filename);
/**
* \brief Read membrane data from input file
* @param filename - name of file to read
*/
void Read(string filename);
void SendD3Q7AA(double *dist);
void RecvD3Q7AA(double *dist);
void AssignCoefficients(int *Map, double *Psi, double Threshold,
double MassFractionIn, double MassFractionOut, double ThresholdMassFractionIn,
double ThresholdMassFractionOut);
void IonTransport(double *dist, double *den);
//......................................................................................
// Buffers to store data sent and received by this MPI process
double *sendbuf_x, *sendbuf_y, *sendbuf_z, *sendbuf_X, *sendbuf_Y, *sendbuf_Z;
double *recvbuf_x, *recvbuf_y, *recvbuf_z, *recvbuf_X, *recvbuf_Y, *recvbuf_Z;
//......................................................................................
private:
bool Lock; // use Lock to make sure only one call at a time to protect data in transit
int sendtag, recvtag;
int iproc,jproc,kproc;
int nprocx,nprocy,nprocz;
// Give the object its own MPI communicator
RankInfoStruct rank_info;
Utilities::MPI MPI_COMM_SCALBL; // MPI Communicator for this domain
MPI_Request req1[18],req2[18];
/**
* \brief Set up membrane communication
* \details associate p2p communication links to membrane where necessary
* returns the number of membrane links
* regular communications are stored in the first part of the list
* membrane communications are stored in the last part of the list
* @param Cqx - discrete velocity (x)
* @param Cqy - discrete velocity (y)
* @param Cqz - discrete velocity (z)
* @param d3q19_recvlist - device array with the saved list
* @param count - number of received values
* @param membraneRecvLabels - sorted list with regular and membrane links
* @param Distance - signed distance to membrane
* @param dvcMap - data structure used to define mapping between dense and sparse representation
* */
int D3Q7_MapRecv(int Cqx, int Cqy, int Cqz, int *d3q19_recvlist,
int count, int *membraneRecvLabels, DoubleArray &Distance, int *dvcMap);
//......................................................................................
// MPI ranks for all 18 neighbors
//......................................................................................
// These variables are all private to prevent external things from modifying them!!
//......................................................................................
int rank;
int rank_x,rank_y,rank_z,rank_X,rank_Y,rank_Z;
int rank_xy,rank_XY,rank_xY,rank_Xy;
int rank_xz,rank_XZ,rank_xZ,rank_Xz;
int rank_yz,rank_YZ,rank_yZ,rank_Yz;
//......................................................................................
int SendCount, RecvCount, CommunicationCount;
//......................................................................................
int sendCount_x, sendCount_y, sendCount_z, sendCount_X, sendCount_Y, sendCount_Z;
//......................................................................................
int recvCount_x, recvCount_y, recvCount_z, recvCount_X, recvCount_Y, recvCount_Z;
//......................................................................................
int linkCount_x[5], linkCount_y[5], linkCount_z[5], linkCount_X[5], linkCount_Y[5], linkCount_Z[5];
int linkCount_xy, linkCount_yz, linkCount_xz, linkCount_Xy, linkCount_Yz, linkCount_xZ;
int linkCount_xY, linkCount_yZ, linkCount_Xz, linkCount_XY, linkCount_YZ, linkCount_XZ;
//......................................................................................
// Send buffers that reside on the compute device
int *dvcSendList_x, *dvcSendList_y, *dvcSendList_z, *dvcSendList_X, *dvcSendList_Y, *dvcSendList_Z;
//int *dvcSendList_xy, *dvcSendList_yz, *dvcSendList_xz, *dvcSendList_Xy, *dvcSendList_Yz, *dvcSendList_xZ;
//int *dvcSendList_xY, *dvcSendList_yZ, *dvcSendList_Xz, *dvcSendList_XY, *dvcSendList_YZ, *dvcSendList_XZ;
// Receive buffers that reside on the compute device
int *dvcRecvList_x, *dvcRecvList_y, *dvcRecvList_z, *dvcRecvList_X, *dvcRecvList_Y, *dvcRecvList_Z;
//int *dvcRecvList_xy, *dvcRecvList_yz, *dvcRecvList_xz, *dvcRecvList_Xy, *dvcRecvList_Yz, *dvcRecvList_xZ;
//int *dvcRecvList_xY, *dvcRecvList_yZ, *dvcRecvList_Xz, *dvcRecvList_XY, *dvcRecvList_YZ, *dvcRecvList_XZ;
// Link lists that reside on the compute device
int *dvcRecvLinks_x, *dvcRecvLinks_y, *dvcRecvLinks_z, *dvcRecvLinks_X, *dvcRecvLinks_Y, *dvcRecvLinks_Z;
//int *dvcRecvLinks_xy, *dvcRecvLinks_yz, *dvcRecvLinks_xz, *dvcRecvLinks_Xy, *dvcRecvLinks_Yz, *dvcRecvLinks_xZ;
//int *dvcRecvLinks_xY, *dvcRecvLinks_yZ, *dvcRecvLinks_Xz, *dvcRecvLinks_XY, *dvcRecvLinks_YZ, *dvcRecvLinks_XZ;
// Receive buffers for the distributions
int *dvcRecvDist_x, *dvcRecvDist_y, *dvcRecvDist_z, *dvcRecvDist_X, *dvcRecvDist_Y, *dvcRecvDist_Z;
//int *dvcRecvDist_xy, *dvcRecvDist_yz, *dvcRecvDist_xz, *dvcRecvDist_Xy, *dvcRecvDist_Yz, *dvcRecvDist_xZ;
//int *dvcRecvDist_xY, *dvcRecvDist_yZ, *dvcRecvDist_Xz, *dvcRecvDist_XY, *dvcRecvDist_YZ, *dvcRecvDist_XZ;
//......................................................................................
// mass transfer coefficient arrays
double *coefficient_x, *coefficient_X, *coefficient_y, *coefficient_Y, *coefficient_z, *coefficient_Z;
//......................................................................................
};
#endif

File diff suppressed because it is too large

File diff suppressed because it is too large

@@ -69,7 +69,7 @@ void Utilities::startup(int argc, char **argv, bool multiple) {
"thread support, thread support will be disabled"
<< std::endl;
}
-StackTrace::globalCallStackInitialize(MPI_COMM_WORLD);
+//StackTrace::globalCallStackInitialize(MPI_COMM_WORLD);
} else {
MPI_Init(&argc, &argv);
}
@@ -86,7 +86,7 @@ void Utilities::shutdown() {
int rank = 0;
#ifdef USE_MPI
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
-StackTrace::globalCallStackFinalize();
+//StackTrace::globalCallStackFinalize();
MPI_Barrier(MPI_COMM_WORLD);
MPI_Finalize();
#endif

@@ -173,8 +173,7 @@
_Pragma( "GCC diagnostic ignored \"-Wunused-local-typedefs\"" ) \
_Pragma( "GCC diagnostic ignored \"-Woverloaded-virtual\"" ) \
_Pragma( "GCC diagnostic ignored \"-Wunused-parameter\"" ) \
-_Pragma( "GCC diagnostic ignored \"-Warray-bounds\"" ) \
-_Pragma( "GCC diagnostic ignored \"-Wterminate\"" )
+_Pragma( "GCC diagnostic ignored \"-Warray-bounds\"" )
#define ENABLE_WARNINGS _Pragma( "GCC diagnostic pop" )
#else
#define DISABLE_WARNINGS

@@ -48,6 +48,7 @@ extern "C" void ScaLBL_D3Q19_Unpack(int q, int *list, int start, int count,
}
}
extern "C" void ScaLBL_D3Q19_AA_Init(double *f_even, double *f_odd, int Np) {
int n;
for (n = 0; n < Np; n++) {
@@ -1883,7 +1884,7 @@ extern "C" void ScaLBL_D3Q19_AAodd_MRT(int *neighborList, double *dist,
}
}
-extern "C" void ScaLBL_D3Q19_AAeven_Compact(char *ID, double *dist, int Np) {
+extern "C" void ScaLBL_D3Q19_AAeven_Compact(double *dist, int Np) {
for (int n = 0; n < Np; n++) {
@@ -1941,7 +1942,7 @@ extern "C" void ScaLBL_D3Q19_AAeven_Compact(char *ID, double *dist, int Np) {
}
}
-extern "C" void ScaLBL_D3Q19_AAodd_Compact(char *ID, int *neighborList,
+extern "C" void ScaLBL_D3Q19_AAodd_Compact(int *neighborList,
double *dist, int Np) {
int nread;

@@ -35,13 +35,31 @@ extern "C" void ScaLBL_Solid_Neumann_D3Q7(double *dist, double *BoundaryValue,
}
}
-extern "C" void ScaLBL_Solid_SlippingVelocityBC_D3Q19(
-double *dist, double *zeta_potential, double *ElectricField,
-double *SolidGrad, double epsilon_LB, double tau, double rho0,
-double den_scale, double h, double time_conv, int *BounceBackDist_list,
-int *BounceBackSolid_list, int *FluidBoundary_list, double *lattice_weight,
-float *lattice_cx, float *lattice_cy, float *lattice_cz, int count,
-int Np) {
extern "C" void ScaLBL_Solid_DirichletAndNeumann_D3Q7(double *dist,double *BoundaryValue,int* BoundaryLabel,int *BounceBackDist_list,int *BounceBackSolid_list,int N){
int idx;
int iq,ib;
double value_b,value_b_label,value_q;
for (idx=0; idx<N; idx++){
iq = BounceBackDist_list[idx];
ib = BounceBackSolid_list[idx];
value_b = BoundaryValue[ib];//get boundary value from a solid site
value_b_label = BoundaryLabel[ib];//get boundary label (i.e. type of BC) from a solid site
value_q = dist[iq];
if (value_b_label==1){//Dirichlet BC
dist[iq] = -1.0*value_q + value_b*0.25;//NOTE 0.25 is the squared speed of sound (cs^2) for the D3Q7 lattice
}
if (value_b_label==2){//Neumann BC
dist[iq] = value_q + value_b;
}
}
}
+extern "C" void ScaLBL_Solid_SlippingVelocityBC_D3Q19(double *dist, double *zeta_potential, double *ElectricField, double *SolidGrad,
+double epsilon_LB, double tau, double rho0,double den_scale, double h, double time_conv,
+int *BounceBackDist_list, int *BounceBackSolid_list, int *FluidBoundary_list,
+double *lattice_weight, float *lattice_cx, float *lattice_cy, float *lattice_cz,
+int count, int Np){
int idx;
int iq, ib, ifluidBC;
@@ -163,6 +181,7 @@ extern "C" void ScaLBL_D3Q7_AAodd_Poisson_Potential_BC_z(int *d_neighborList,
}
}
extern "C" void ScaLBL_D3Q7_AAodd_Poisson_Potential_BC_Z(int *d_neighborList,
int *list,
double *dist,
@@ -195,6 +214,8 @@ extern "C" void ScaLBL_D3Q7_AAodd_Poisson_Potential_BC_Z(int *d_neighborList,
}
}
extern "C" void ScaLBL_Poisson_D3Q7_BC_z(int *list, int *Map, double *Psi,
double Vin, int count) {
int idx, n, nm;
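
The mixed boundary kernel `ScaLBL_Solid_DirichletAndNeumann_D3Q7` added in this file can be tried on the host without the rest of ScaLBL. What follows is a minimal sketch, not the project's code: the name `apply_mixed_bc` and the `std::vector` arguments are hypothetical stand-ins for the raw pointers, while the label convention (1 = Dirichlet, 2 = Neumann) and the two update rules are copied from the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side version of the mixed Dirichlet/Neumann update.
// distList[idx]  : index of the distribution that bounces back
// solidList[idx] : index of the solid site that carries the boundary data
void apply_mixed_bc(std::vector<double> &dist,
                    const std::vector<double> &boundaryValue,
                    const std::vector<int> &boundaryLabel,
                    const std::vector<int> &distList,
                    const std::vector<int> &solidList) {
    assert(distList.size() == solidList.size());
    for (std::size_t idx = 0; idx < distList.size(); ++idx) {
        const int iq = distList[idx];
        const int ib = solidList[idx];
        const double fq = dist[iq];
        const double vb = boundaryValue[ib];
        if (boundaryLabel[ib] == 1) {
            // Dirichlet: anti-bounce-back with the 0.25 factor used in the kernel
            dist[iq] = -fq + 0.25 * vb;
        } else if (boundaryLabel[ib] == 2) {
            // Neumann: the prescribed value is added to the reflected population
            dist[iq] = fq + vb;
        }
        // any other label leaves the distribution untouched
    }
}
```

Sites whose label is neither 1 nor 2 are left unchanged, which matches the two independent `if` tests in the kernel.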

@@ -1,4 +1,144 @@
#include <stdio.h>
#include <math.h>
extern "C" void ScaLBL_D3Q7_Membrane_AssignLinkCoef(int *membrane, int *Map, double *Distance, double *Psi, double *coef,
double Threshold, double MassFractionIn, double MassFractionOut, double ThresholdMassFractionIn, double ThresholdMassFractionOut,
int memLinks, int Nx, int Ny, int Nz, int Np){
int link,iq,ip,nq,np,nqm,npm;
double aq, ap, membranePotential;
//double dq, dp, dist, orientation;
/* Interior Links */
for (link=0; link<memLinks; link++){
// inside //outside
aq = MassFractionIn; ap = MassFractionOut;
iq = membrane[2*link]; ip = membrane[2*link+1];
nq = iq%Np; np = ip%Np;
nqm = Map[nq]; npm = Map[np]; // strided layout
//dq = Distance[nqm]; dp = Distance[npm];
/* orientation for link to distance gradient*/
//orientation = 1.0/fabs(dq - dp);
/* membrane potential for this link */
membranePotential = Psi[nqm] - Psi[npm];
if (membranePotential > Threshold){
aq = ThresholdMassFractionIn; ap = ThresholdMassFractionOut;
}
/* Save the mass transfer coefficients */
//coef[2*link] = aq*orientation; coef[2*link+1] = ap*orientation;
coef[2*link] = aq; coef[2*link+1] = ap;
}
}
extern "C" void ScaLBL_D3Q7_Membrane_AssignLinkCoef_halo(
const int Cqx, const int Cqy, int const Cqz,
int *Map, double *Distance, double *Psi, double Threshold,
double MassFractionIn, double MassFractionOut, double ThresholdMassFractionIn, double ThresholdMassFractionOut,
int *d3q7_recvlist, int *d3q7_linkList, double *coef, int start, int nlinks, int count,
const int N, const int Nx, const int Ny, const int Nz) {
//....................................................................................
// Unpack distribution from the recv buffer
// Distribution q matches Cqx, Cqy, Cqz
// swap rule means that the distributions in recvbuf are OPPOSITE of q
// dist may be even or odd distributions stored by stream layout
//....................................................................................
int n, idx, label, nqm, npm, i, j, k;
double distanceLocal;//, distanceNonlocal;
double psiLocal, psiNonlocal, membranePotential;
double ap,aq; // coefficient
for (idx = 0; idx < count; idx++) {
n = d3q7_recvlist[idx];
label = d3q7_linkList[idx];
ap = 1.0; // regular streaming rule
aq = 1.0;
if (label > 0 && !(n < 0)){
nqm = Map[n];
distanceLocal = Distance[nqm];
psiLocal = Psi[nqm];
// Get the 3-D indices from the send process
k = nqm/(Nx*Ny); j = (nqm-Nx*Ny*k)/Nx; i = nqm-Nx*Ny*k-Nx*j;
// Streaming link the non-local distribution
i -= Cqx; j -= Cqy; k -= Cqz;
npm = k*Nx*Ny + j*Nx + i;
//distanceNonlocal = Distance[npm];
psiNonlocal = Psi[npm];
membranePotential = psiLocal - psiNonlocal;
aq = MassFractionIn;
ap = MassFractionOut;
/* link is inside membrane */
if (distanceLocal > 0.0){
if (membranePotential < Threshold*(-1.0)){
ap = MassFractionIn;
aq = MassFractionOut;
}
else {
ap = ThresholdMassFractionIn;
aq = ThresholdMassFractionOut;
}
}
else if (membranePotential > Threshold){
aq = ThresholdMassFractionIn;
ap = ThresholdMassFractionOut;
}
}
coef[2*idx]=aq;
coef[2*idx+1]=ap;
}
}
extern "C" void ScaLBL_D3Q7_Membrane_Unpack(int q,
int *d3q7_recvlist, double *recvbuf, int count,
double *dist, int N, double *coef) {
//....................................................................................
// Unpack distribution from the recv buffer
// Distribution q matches Cqx, Cqy, Cqz
// swap rule means that the distributions in recvbuf are OPPOSITE of q
// dist may be even or odd distributions stored by stream layout
//....................................................................................
int n, idx;
double fq,fp,fqq,ap,aq; // coefficient
/* First unpack the regular links */
for (idx = 0; idx < count; idx++) {
n = d3q7_recvlist[idx];
// update link based on mass transfer coefficients
if (!(n < 0)){
aq = coef[2*idx];
ap = coef[2*idx+1];
fq = dist[q * N + n];
fp = recvbuf[idx];
fqq = (1-aq)*fq+ap*fp;
dist[q * N + n] = fqq;
}
//printf(" LINK: site=%i, index=%i \n", n, idx);
}
}
extern "C" void ScaLBL_D3Q7_Membrane_IonTransport(int *membrane, double *coef,
double *dist, double *Den, int memLinks, int Np){
int link,iq,ip,nq,np;
double aq, ap, fq, fp, fqq, fpp, Cq, Cp;
for (link=0; link<memLinks; link++){
// inside //outside
aq = coef[2*link]; ap = coef[2*link+1];
iq = membrane[2*link]; ip = membrane[2*link+1];
nq = iq%Np; np = ip%Np;
fq = dist[iq]; fp = dist[ip];
fqq = (1-aq)*fq+ap*fp; fpp = (1-ap)*fp+aq*fq;
Cq = Den[nq]; Cp = Den[np];
Cq += fqq - fq; Cp += fpp - fp;
Den[nq] = Cq; Den[np] = Cp;
dist[iq] = fqq; dist[ip] = fpp;
}
}
extern "C" void ScaLBL_D3Q7_AAodd_IonConcentration(int *neighborList,
double *dist, double *Den,
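
The exchange rule in `ScaLBL_D3Q7_Membrane_IonTransport` moves mass across a membrane link without creating or destroying any: the two updated populations always sum to the two original ones. The sketch below isolates that rule for a single link. `LinkPair` and `membrane_exchange` are hypothetical names, and the restriction of both coefficients to [0, 1] is an assumption of this sketch rather than something the diff enforces.

```cpp
#include <cassert>

// Hypothetical container for the two populations that share one membrane link.
struct LinkPair {
    double fq; // population on the "inside" site
    double fp; // population on the "outside" site
};

// Single-link version of the exchange rule from the kernel:
// side q gives up the fraction aq of its population and receives ap * fp;
// side p gives up the fraction ap and receives aq * fq.
LinkPair membrane_exchange(LinkPair in, double aq, double ap) {
    assert(aq >= 0.0 && aq <= 1.0 && ap >= 0.0 && ap <= 1.0); // sketch assumption
    LinkPair out;
    out.fq = (1.0 - aq) * in.fq + ap * in.fp;
    out.fp = (1.0 - ap) * in.fp + aq * in.fq;
    return out;
}
```

With `aq = ap = 1` the rule reduces to a plain swap of the two populations, which is why the halo kernel initialises both coefficients to 1.0 for regular (non-membrane) links.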
@@ -85,7 +225,7 @@ extern "C" void ScaLBL_D3Q7_AAeven_IonConcentration(double *dist, double *Den,
}
}
-extern "C" void ScaLBL_D3Q7_AAodd_Ion(int *neighborList, double *dist,
+extern "C" void ScaLBL_D3Q7_AAodd_Ion_v0(int *neighborList, double *dist,
double *Den, double *FluxDiffusive,
double *FluxAdvective,
double *FluxElectrical, double *Velocity,
@@ -99,6 +239,7 @@ extern "C" void ScaLBL_D3Q7_AAodd_Ion(int *neighborList, double *dist,
double Ex, Ey, Ez; //electrical field
double flux_diffusive_x, flux_diffusive_y, flux_diffusive_z;
double f0, f1, f2, f3, f4, f5, f6;
//double X,Y,Z,factor_x, factor_y, factor_z;
int nr1, nr2, nr3, nr4, nr5, nr6;
for (n = start; n < finish; n++) {
@@ -137,6 +278,7 @@ extern "C" void ScaLBL_D3Q7_AAodd_Ion(int *neighborList, double *dist,
f6 = dist[nr6];
// compute diffusive flux
//Ci = f0 + f1 + f2 + f3 + f4 + f5 + f6;
flux_diffusive_x = (1.0 - 0.5 * rlx) * ((f1 - f2) - ux * Ci);
flux_diffusive_y = (1.0 - 0.5 * rlx) * ((f3 - f4) - uy * Ci);
flux_diffusive_z = (1.0 - 0.5 * rlx) * ((f5 - f6) - uz * Ci);
@@ -149,37 +291,55 @@ extern "C" void ScaLBL_D3Q7_AAodd_Ion(int *neighborList, double *dist,
FluxElectrical[n + 0 * Np] = uEPx * Ci;
FluxElectrical[n + 1 * Np] = uEPy * Ci;
FluxElectrical[n + 2 * Np] = uEPz * Ci;
//Den[n] = Ci;
/* use logistic function to prevent negative distributions*/
//X = 4.0 * (ux + uEPx);
//Y = 4.0 * (uy + uEPy);
//Z = 4.0 * (uz + uEPz);
//factor_x = X / sqrt(1 + X*X);
//factor_y = Y / sqrt(1 + Y*Y);
//factor_z = Z / sqrt(1 + Z*Z);
// q=0
dist[n] = f0 * (1.0 - rlx) + rlx * 0.25 * Ci;
// q = 1
dist[nr2] =
-f1 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (ux + uEPx));
+f1 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (ux + uEPx));
// f1 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + factor_x);
// q=2
dist[nr1] =
-f2 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (ux + uEPx));
+f2 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (ux + uEPx));
// f2 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - factor_x);
// q = 3
dist[nr4] =
-f3 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (uy + uEPy));
+f3 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (uy + uEPy));
// f3 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + factor_y );
// q = 4
dist[nr3] =
-f4 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (uy + uEPy));
+f4 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (uy + uEPy));
// f4 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - factor_y);
// q = 5
dist[nr6] =
-f5 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (uz + uEPz));
+f5 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (uz + uEPz));
// f5 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + factor_z);
// q = 6
dist[nr5] =
-f6 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (uz + uEPz));
+f6 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (uz + uEPz));
// f6 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - factor_z);
}
}
-extern "C" void ScaLBL_D3Q7_AAeven_Ion(
+extern "C" void ScaLBL_D3Q7_AAeven_Ion_v0(
double *dist, double *Den, double *FluxDiffusive, double *FluxAdvective,
double *FluxElectrical, double *Velocity, double *ElectricField, double Di,
int zi, double rlx, double Vt, int start, int finish, int Np) {
@@ -190,6 +350,7 @@ extern "C" void ScaLBL_D3Q7_AAeven_Ion(
double Ex, Ey, Ez; //electrical field
double flux_diffusive_x, flux_diffusive_y, flux_diffusive_z;
double f0, f1, f2, f3, f4, f5, f6;
//double X,Y,Z, factor_x, factor_y, factor_z;
for (n = start; n < finish; n++) {
@@ -214,6 +375,7 @@ extern "C" void ScaLBL_D3Q7_AAeven_Ion(
f6 = dist[5 * Np + n];
// compute diffusive flux
//Ci = f0 + f1 + f2 + f3 + f4 + f5 + f6;
flux_diffusive_x = (1.0 - 0.5 * rlx) * ((f1 - f2) - ux * Ci);
flux_diffusive_y = (1.0 - 0.5 * rlx) * ((f3 - f4) - uy * Ci);
flux_diffusive_z = (1.0 - 0.5 * rlx) * ((f5 - f6) - uz * Ci);
@@ -226,33 +388,258 @@ extern "C" void ScaLBL_D3Q7_AAeven_Ion(
FluxElectrical[n + 0 * Np] = uEPx * Ci;
FluxElectrical[n + 1 * Np] = uEPy * Ci;
FluxElectrical[n + 2 * Np] = uEPz * Ci;
//Den[n] = Ci;
/* use logistic function to prevent negative distributions*/
//X = 4.0 * (ux + uEPx);
//Y = 4.0 * (uy + uEPy);
//Z = 4.0 * (uz + uEPz);
//factor_x = X / sqrt(1 + X*X);
//factor_y = Y / sqrt(1 + Y*Y);
//factor_z = Z / sqrt(1 + Z*Z);
// q=0
dist[n] = f0 * (1.0 - rlx) + rlx * 0.25 * Ci;
// q = 1
dist[1 * Np + n] =
-f1 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (ux + uEPx));
+f1 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (ux + uEPx));
// f1 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + factor_x);
// q=2
dist[2 * Np + n] =
-f2 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (ux + uEPx));
+f2 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (ux + uEPx));
// f2 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - factor_x);
// q = 3
dist[3 * Np + n] =
-f3 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (uy + uEPy));
+f3 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (uy + uEPy));
// f3 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + factor_y);
// q = 4
dist[4 * Np + n] =
-f4 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (uy + uEPy));
+f4 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (uy + uEPy));
// f4 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - factor_y);
// q = 5
dist[5 * Np + n] =
-f5 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (uz + uEPz));
+f5 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (uz + uEPz));
// f5 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + factor_z);
// q = 6
dist[6 * Np + n] =
-f6 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (uz + uEPz));
+f6 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (uz + uEPz));
// f6 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - factor_z);
}
}
extern "C" void ScaLBL_D3Q7_AAodd_Ion(int *neighborList, double *dist,
double *Den, double *FluxDiffusive,
double *FluxAdvective,
double *FluxElectrical, double *Velocity,
double *ElectricField, double Di, int zi,
double rlx, double Vt, int start,
int finish, int Np) {
int n;
double Ci;
double ux, uy, uz;
double uEPx, uEPy, uEPz; //electrochemical induced velocity
double Ex, Ey, Ez; //electrical field
double flux_diffusive_x, flux_diffusive_y, flux_diffusive_z;
double f0, f1, f2, f3, f4, f5, f6;
//double X,Y,Z,factor_x, factor_y, factor_z;
int nr1, nr2, nr3, nr4, nr5, nr6;
for (n = start; n < finish; n++) {
//Load data
//Ci = Den[n];
Ex = ElectricField[n + 0 * Np];
Ey = ElectricField[n + 1 * Np];
Ez = ElectricField[n + 2 * Np];
ux = Velocity[n + 0 * Np];
uy = Velocity[n + 1 * Np];
uz = Velocity[n + 2 * Np];
uEPx = zi * Di / Vt * Ex;
uEPy = zi * Di / Vt * Ey;
uEPz = zi * Di / Vt * Ez;
// q=0
f0 = dist[n];
// q=1
nr1 = neighborList[n]; // neighbor 2 ( > 10Np => odd part of dist)
f1 = dist[nr1]; // reading the f1 data into register fq
// q=2
nr2 = neighborList[n + Np]; // neighbor 1 ( < 10Np => even part of dist)
f2 = dist[nr2]; // reading the f2 data into register fq
// q=3
nr3 = neighborList[n + 2 * Np]; // neighbor 4
f3 = dist[nr3];
// q=4
nr4 = neighborList[n + 3 * Np]; // neighbor 3
f4 = dist[nr4];
// q=5
nr5 = neighborList[n + 4 * Np];
f5 = dist[nr5];
// q=6
nr6 = neighborList[n + 5 * Np];
f6 = dist[nr6];
// compute diffusive flux
Ci = f0 + f1 + f2 + f3 + f4 + f5 + f6;
flux_diffusive_x = (1.0 - 0.5 * rlx) * ((f1 - f2) - ux * Ci);
flux_diffusive_y = (1.0 - 0.5 * rlx) * ((f3 - f4) - uy * Ci);
flux_diffusive_z = (1.0 - 0.5 * rlx) * ((f5 - f6) - uz * Ci);
FluxDiffusive[n + 0 * Np] = flux_diffusive_x;
FluxDiffusive[n + 1 * Np] = flux_diffusive_y;
FluxDiffusive[n + 2 * Np] = flux_diffusive_z;
FluxAdvective[n + 0 * Np] = ux * Ci;
FluxAdvective[n + 1 * Np] = uy * Ci;
FluxAdvective[n + 2 * Np] = uz * Ci;
FluxElectrical[n + 0 * Np] = uEPx * Ci;
FluxElectrical[n + 1 * Np] = uEPy * Ci;
FluxElectrical[n + 2 * Np] = uEPz * Ci;
Den[n] = Ci;
/* use logistic function to prevent negative distributions*/
//X = 4.0 * (ux + uEPx);
//Y = 4.0 * (uy + uEPy);
//Z = 4.0 * (uz + uEPz);
//factor_x = X / sqrt(1 + X*X);
//factor_y = Y / sqrt(1 + Y*Y);
//factor_z = Z / sqrt(1 + Z*Z);
// q=0
dist[n] = f0 * (1.0 - rlx) + rlx * 0.25 * Ci;
// q = 1
dist[nr2] =
f1 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (ux + uEPx));
// f1 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + factor_x);
// q=2
dist[nr1] =
f2 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (ux + uEPx));
// f2 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - factor_x);
// q = 3
dist[nr4] =
f3 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (uy + uEPy));
// f3 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + factor_y );
// q = 4
dist[nr3] =
f4 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (uy + uEPy));
// f4 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - factor_y);
// q = 5
dist[nr6] =
f5 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (uz + uEPz));
// f5 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + factor_z);
// q = 6
dist[nr5] =
f6 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (uz + uEPz));
// f6 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - factor_z);
}
}
extern "C" void ScaLBL_D3Q7_AAeven_Ion(
double *dist, double *Den, double *FluxDiffusive, double *FluxAdvective,
double *FluxElectrical, double *Velocity, double *ElectricField, double Di,
int zi, double rlx, double Vt, int start, int finish, int Np) {
int n;
double Ci;
double ux, uy, uz;
double uEPx, uEPy, uEPz; //electrochemical induced velocity
double Ex, Ey, Ez; //electrical field
double flux_diffusive_x, flux_diffusive_y, flux_diffusive_z;
double f0, f1, f2, f3, f4, f5, f6;
//double X,Y,Z, factor_x, factor_y, factor_z;
for (n = start; n < finish; n++) {
//Load data
//Ci = Den[n];
Ex = ElectricField[n + 0 * Np];
Ey = ElectricField[n + 1 * Np];
Ez = ElectricField[n + 2 * Np];
ux = Velocity[n + 0 * Np];
uy = Velocity[n + 1 * Np];
uz = Velocity[n + 2 * Np];
uEPx = zi * Di / Vt * Ex;
uEPy = zi * Di / Vt * Ey;
uEPz = zi * Di / Vt * Ez;
f0 = dist[n];
f1 = dist[2 * Np + n];
f2 = dist[1 * Np + n];
f3 = dist[4 * Np + n];
f4 = dist[3 * Np + n];
f5 = dist[6 * Np + n];
f6 = dist[5 * Np + n];
// compute diffusive flux
Ci = f0 + f1 + f2 + f3 + f4 + f5 + f6;
flux_diffusive_x = (1.0 - 0.5 * rlx) * ((f1 - f2) - ux * Ci);
flux_diffusive_y = (1.0 - 0.5 * rlx) * ((f3 - f4) - uy * Ci);
flux_diffusive_z = (1.0 - 0.5 * rlx) * ((f5 - f6) - uz * Ci);
FluxDiffusive[n + 0 * Np] = flux_diffusive_x;
FluxDiffusive[n + 1 * Np] = flux_diffusive_y;
FluxDiffusive[n + 2 * Np] = flux_diffusive_z;
FluxAdvective[n + 0 * Np] = ux * Ci;
FluxAdvective[n + 1 * Np] = uy * Ci;
FluxAdvective[n + 2 * Np] = uz * Ci;
FluxElectrical[n + 0 * Np] = uEPx * Ci;
FluxElectrical[n + 1 * Np] = uEPy * Ci;
FluxElectrical[n + 2 * Np] = uEPz * Ci;
Den[n] = Ci;
/* use logistic function to prevent negative distributions*/
//X = 4.0 * (ux + uEPx);
//Y = 4.0 * (uy + uEPy);
//Z = 4.0 * (uz + uEPz);
//factor_x = X / sqrt(1 + X*X);
//factor_y = Y / sqrt(1 + Y*Y);
//factor_z = Z / sqrt(1 + Z*Z);
// q=0
dist[n] = f0 * (1.0 - rlx) + rlx * 0.25 * Ci;
// q = 1
dist[1 * Np + n] =
f1 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (ux + uEPx));
// f1 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + factor_x);
// q=2
dist[2 * Np + n] =
f2 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (ux + uEPx));
// f2 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - factor_x);
// q = 3
dist[3 * Np + n] =
f3 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (uy + uEPy));
// f3 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + factor_y);
// q = 4
dist[4 * Np + n] =
f4 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (uy + uEPy));
// f4 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - factor_y);
// q = 5
dist[5 * Np + n] =
f5 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (uz + uEPz));
// f5 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + factor_z);
// q = 6
dist[6 * Np + n] =
f6 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (uz + uEPz));
// f6 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - factor_z);
}
}
@@ -289,7 +676,7 @@ extern "C" void ScaLBL_D3Q7_Ion_Init_FromFile(double *dist, double *Den,
extern "C" void ScaLBL_D3Q7_Ion_ChargeDensity(double *Den,
double *ChargeDensity,
-int IonValence, int ion_component,
+double IonValence, int ion_component,
int start, int finish, int Np) {
int n;
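
The rewritten `ScaLBL_D3Q7_AAodd_Ion` and `ScaLBL_D3Q7_AAeven_Ion` kernels now compute the concentration `Ci` as the sum of the seven populations and store it in `Den[n]`. The relaxation step that follows preserves that sum, because the equilibrium weights (0.25 for the rest direction, 0.125 for each of the six others) add up to one and the velocity terms cancel in pairs. The single-site sketch below illustrates this. `relax_d3q7` is a hypothetical helper, `u` lumps the fluid and electrophoretic velocities together, and the bound on `rlx` is an assumption of the sketch; streaming and the AA index swap are deliberately left out.

```cpp
#include <array>
#include <cassert>

// Hypothetical single-site BGK relaxation for the D3Q7 ion model.
// f[0] is the rest population; f[1..6] pair up as (+x, -x, +y, -y, +z, -z).
// u[d] stands for the total drift velocity (ux + uEPx, ...) in direction d.
std::array<double, 7> relax_d3q7(const std::array<double, 7> &f,
                                 const std::array<double, 3> &u, double rlx) {
    assert(rlx > 0.0 && rlx < 2.0); // sketch assumption, not checked in the kernel
    double Ci = 0.0;
    for (double fq : f) Ci += fq;   // zeroth moment = ion concentration
    std::array<double, 7> out{};
    out[0] = f[0] * (1.0 - rlx) + rlx * 0.25 * Ci;
    for (int d = 0; d < 3; ++d) {
        const int qp = 2 * d + 1;   // population moving in +d
        const int qm = 2 * d + 2;   // population moving in -d
        out[qp] = f[qp] * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * u[d]);
        out[qm] = f[qm] * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * u[d]);
    }
    return out;
}
```

Summing `out` gives `(1 - rlx) * Ci + rlx * Ci = Ci`, so the value written to `Den[n]` before the collision remains valid after it.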

cpu/MembraneHelper.cpp (new file, 42 lines)

@@ -0,0 +1,42 @@
extern "C" void Membrane_D3Q19_Unpack(int q, int *list, int *links, int start, int linkCount,
double *recvbuf, double *dist, int N) {
//....................................................................................
// Unpack distribution from the recv buffer
// Distribution q matches Cqx, Cqy, Cqz
// swap rule means that the distributions in recvbuf are OPPOSITE of q
// dist may be even or odd distributions stored by stream layout
//....................................................................................
int n, idx, link;
for (link=0; link<linkCount; link++){
idx = links[start+link];
// Get the value from the list -- note that n is the index from the send (non-local) process
n = list[start + idx];
// unpack the distribution to the proper location
if (!(n < 0))
dist[q * N + n] = recvbuf[start + idx];
}
}
extern "C" void Membrane_D3Q19_Transport(int q, int *list, int *links, double *coef, int start, int offset,
int linkCount, double *recvbuf, double *dist, int N){
//....................................................................................
// Unpack distribution from the recv buffer
// Distribution q matches Cqx, Cqy, Cqz
// swap rule means that the distributions in recvbuf are OPPOSITE of q
// dist may be even or odd distributions stored by stream layout
//....................................................................................
int n, idx, link;
double alpha;
for (link=offset; link<linkCount; link++){
idx = list[start+link];
// Get the value from the list -- note that n is the index from the send (non-local) process
n = list[start + idx];
alpha = coef[start + idx];
// unpack the distribution to the proper location
if (!(n < 0))
dist[q * N + n] = alpha*recvbuf[start + idx];
}
}
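
`Membrane_D3Q19_Unpack` reaches its data through two levels of indirection: `links` selects an entry within the receive segment, and `list` maps that entry to a destination site, with negative site indices skipped. The host-side sketch below mirrors that access pattern so it can be checked in isolation. `unpack_links` and the `std::vector` arguments are hypothetical; only the indexing scheme is taken from the diff.

```cpp
#include <cassert>
#include <vector>

// Hypothetical host-side version of a link-indexed unpack.
// links[start + link] : position of the entry inside this direction's segment
// list[start + idx]   : destination site for that entry (negative = skip)
void unpack_links(int q, const std::vector<int> &list,
                  const std::vector<int> &links, int start, int linkCount,
                  const std::vector<double> &recvbuf,
                  std::vector<double> &dist, int N) {
    assert(start >= 0 && linkCount >= 0);
    for (int link = 0; link < linkCount; ++link) {
        const int idx = links[start + link];
        const int n = list[start + idx];
        if (n >= 0) {
            // write the received value into distribution q at site n
            dist[q * N + n] = recvbuf[start + idx];
        }
    }
}
```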

@@ -1,3 +1,4 @@
#include <math.h>
extern "C" void
ScaLBL_D3Q7_AAodd_Poisson_ElectricPotential(int *neighborList, int *Map,
@@ -95,7 +96,7 @@ extern "C" void ScaLBL_D3Q7_AAeven_Poisson_ElectricPotential(
extern "C" void ScaLBL_D3Q7_AAodd_Poisson(int *neighborList, int *Map,
double *dist, double *Den_charge,
double *Psi, double *ElectricField,
-double tau, double epsilon_LB,
+double tau, double epsilon_LB, bool UseSlippingVelBC,
int start, int finish, int Np) {
int n;
double psi; //electric potential
@@ -109,8 +110,9 @@ extern "C" void ScaLBL_D3Q7_AAodd_Poisson(int *neighborList, int *Map,
for (n = start; n < finish; n++) {
//Load data
-rho_e = Den_charge[n];
-rho_e = rho_e / epsilon_LB;
+//When Helmholtz-Smoluchowski slipping velocity BC is used, the bulk fluid is considered as electroneutral
+//and thus the net space charge density is zero.
+rho_e = (UseSlippingVelBC==1) ? 0.0 : Den_charge[n] / epsilon_LB;
idx = Map[n];
psi = Psi[idx];
@@ -175,8 +177,8 @@ extern "C" void ScaLBL_D3Q7_AAodd_Poisson(int *neighborList, int *Map,
extern "C" void ScaLBL_D3Q7_AAeven_Poisson(int *Map, double *dist,
double *Den_charge, double *Psi,
double *ElectricField, double tau,
-double epsilon_LB, int start,
-int finish, int Np) {
+double epsilon_LB, bool UseSlippingVelBC,
+int start, int finish, int Np) {
int n;
double psi; //electric potential
double Ex, Ey, Ez; //electric field
@@ -188,8 +190,9 @@ extern "C" void ScaLBL_D3Q7_AAeven_Poisson(int *Map, double *dist,
for (n = start; n < finish; n++) {
//Load data
-rho_e = Den_charge[n];
-rho_e = rho_e / epsilon_LB;
+//When Helmholtz-Smoluchowski slipping velocity BC is used, the bulk fluid is considered as electroneutral
+//and thus the net space charge density is zero.
+rho_e = (UseSlippingVelBC==1) ? 0.0 : Den_charge[n] / epsilon_LB;
idx = Map[n];
psi = Psi[idx];
@@ -442,34 +445,591 @@ extern "C" void ScaLBL_D3Q7_PoissonResidualError(
// }
//}
//extern "C" void ScaLBL_D3Q7_Poisson_getElectricField(double *dist, double *ElectricField, double tau, int Np){
// int n;
// // distributions
// double f1,f2,f3,f4,f5,f6;
// double Ex,Ey,Ez;
// double rlx=1.0/tau;
//
// for (n=0; n<Np; n++){
// //........................................................................
// // Registers to store the distributions
// //........................................................................
// f1 = dist[Np+n];
// f2 = dist[2*Np+n];
// f3 = dist[3*Np+n];
// f4 = dist[4*Np+n];
// f5 = dist[5*Np+n];
// f6 = dist[6*Np+n];
// //.................Compute the Electric Field...................................
// //Ex = (f1-f2)*rlx*4.5;//NOTE the unit of electric field here is V/lu
// //Ey = (f3-f4)*rlx*4.5;
// //Ez = (f5-f6)*rlx*4.5;
// Ex = (f1-f2)*rlx*4.0;//NOTE the unit of electric field here is V/lu
// Ey = (f3-f4)*rlx*4.0;
// Ez = (f5-f6)*rlx*4.0;
// //..................Write the Electric Field.....................................
// ElectricField[0*Np+n] = Ex;
// ElectricField[1*Np+n] = Ey;
// ElectricField[2*Np+n] = Ez;
// //........................................................................
// }
//}
extern "C" void ScaLBL_D3Q19_Poisson_getElectricField(double *dist, double *ElectricField, double tau, int Np){
int n;
double f1, f2, f3, f4, f5, f6, f7, f8, f9, f10, f11, f12, f13, f14, f15,
f16, f17, f18;
double Ex,Ey,Ez;
double rlx=1.0/tau;
for (n=0; n<Np; n++){
//........................................................................
// Registers to store the distributions
//........................................................................
f1 = dist[2 * Np + n];
f2 = dist[1 * Np + n];
f3 = dist[4 * Np + n];
f4 = dist[3 * Np + n];
f5 = dist[6 * Np + n];
f6 = dist[5 * Np + n];
f7 = dist[8 * Np + n];
f8 = dist[7 * Np + n];
f9 = dist[10 * Np + n];
f10 = dist[9 * Np + n];
f11 = dist[12 * Np + n];
f12 = dist[11 * Np + n];
f13 = dist[14 * Np + n];
f14 = dist[13 * Np + n];
f15 = dist[16 * Np + n];
f16 = dist[15 * Np + n];
f17 = dist[18 * Np + n];
f18 = dist[17 * Np + n];
//.................Compute the Electric Field...................................
Ex = (f1 - f2 + f7 - f8 + f9 - f10 + f11 - f12 + f13 - f14)*rlx*3.0;//NOTE the unit of electric field here is V/lu
Ey = (f3 - f4 + f7 - f8 - f9 + f10 + f15 - f16 + f17 - f18)*rlx*3.0;
Ez = (f5 - f6 + f11 - f12 - f13 + f14 + f15 - f16 - f17 + f18)*rlx*3.0;
//..................Write the Electric Field.....................................
ElectricField[0*Np+n] = Ex;
ElectricField[1*Np+n] = Ey;
ElectricField[2*Np+n] = Ez;
//........................................................................
}
}
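
The three hand-written sums in `ScaLBL_D3Q19_Poisson_getElectricField` are the first moment of the distributions, scaled by `3.0 / tau`. A table-driven sketch of the same calculation for one site is shown below. The velocity table `C` is inferred from the signs in the `Ex`, `Ey` and `Ez` expressions above, `electric_field` is a hypothetical helper, and `f[q]` is taken to be the population for logical direction `q` (the pairwise index swap the kernel performs when loading its registers is not reproduced).

```cpp
#include <array>
#include <cassert>

// D3Q19 discrete velocities, ordered to reproduce the sign pattern of the
// Ex/Ey/Ez sums in the kernel. Index 0 is the rest direction.
static const int C[19][3] = {
    {0, 0, 0},
    {1, 0, 0},  {-1, 0, 0},  {0, 1, 0},  {0, -1, 0}, {0, 0, 1},  {0, 0, -1},
    {1, 1, 0},  {-1, -1, 0}, {1, -1, 0}, {-1, 1, 0},
    {1, 0, 1},  {-1, 0, -1}, {1, 0, -1}, {-1, 0, 1},
    {0, 1, 1},  {0, -1, -1}, {0, 1, -1}, {0, -1, 1}};

// Hypothetical helper: E = (3 / tau) * sum_q c_q * f_q for a single site.
std::array<double, 3> electric_field(const std::array<double, 19> &f,
                                     double tau) {
    assert(tau > 0.0);
    const double rlx = 1.0 / tau;
    std::array<double, 3> E = {0.0, 0.0, 0.0};
    for (int q = 1; q < 19; ++q)
        for (int d = 0; d < 3; ++d)
            E[d] += C[q][d] * f[q];      // accumulate the first moment
    for (int d = 0; d < 3; ++d)
        E[d] *= rlx * 3.0;               // same scaling as the kernel
    return E;
}
```

Writing the moment as a loop over a velocity table makes it easier to confirm that every diagonal direction contributes to exactly two components.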
extern "C" void
ScaLBL_D3Q19_AAodd_Poisson_ElectricPotential(int *neighborList, int *Map,
double *dist, double *Den_charge, double *Psi,
double epsilon_LB, bool UseSlippingVelBC,
int start, int finish, int Np) {
int n;
double psi; //electric potential
double rho_e; //local charge density
//double Gs;
double f0, f1, f2, f3, f4, f5, f6, f7, f8, f9, f10, f11, f12, f13, f14, f15,
f16, f17, f18;
int nr1, nr2, nr3, nr4, nr5, nr6, nr7, nr8, nr9, nr10, nr11, nr12, nr13,
nr14, nr15, nr16, nr17, nr18;
int idx;
for (n = start; n < finish; n++) {
rho_e = (UseSlippingVelBC==1) ? 0.0 : Den_charge[n] / epsilon_LB;
// q=0
f0 = dist[n];
// q=1
nr1 = neighborList[n]; // neighbor 2 ( > 10Np => odd part of dist)
f1 = dist[nr1]; // reading the f1 data into register fq
nr2 = neighborList[n + Np]; // neighbor 1 ( < 10Np => even part of dist)
f2 = dist[nr2]; // reading the f2 data into register fq
// q=3
nr3 = neighborList[n + 2 * Np]; // neighbor 4
f3 = dist[nr3];
// q = 4
nr4 = neighborList[n + 3 * Np]; // neighbor 3
f4 = dist[nr4];
// q=5
nr5 = neighborList[n + 4 * Np];
f5 = dist[nr5];
// q = 6
nr6 = neighborList[n + 5 * Np];
f6 = dist[nr6];
// q=7
nr7 = neighborList[n + 6 * Np];
f7 = dist[nr7];
// q = 8
nr8 = neighborList[n + 7 * Np];
f8 = dist[nr8];
// q=9
nr9 = neighborList[n + 8 * Np];
f9 = dist[nr9];
// q = 10
nr10 = neighborList[n + 9 * Np];
f10 = dist[nr10];
// q=11
nr11 = neighborList[n + 10 * Np];
f11 = dist[nr11];
// q=12
nr12 = neighborList[n + 11 * Np];
f12 = dist[nr12];
// q=13
nr13 = neighborList[n + 12 * Np];
f13 = dist[nr13];
// q=14
nr14 = neighborList[n + 13 * Np];
f14 = dist[nr14];
// q=15
nr15 = neighborList[n + 14 * Np];
f15 = dist[nr15];
// q=16
nr16 = neighborList[n + 15 * Np];
f16 = dist[nr16];
// q=17
//fq = dist[18*Np+n];
nr17 = neighborList[n + 16 * Np];
f17 = dist[nr17];
// q=18
nr18 = neighborList[n + 17 * Np];
f18 = dist[nr18];
psi = f0 + f2 + f1 + f4 + f3 + f6 + f5 + f8 + f7 + f10 + f9 + f12 +
f11 + f14 + f13 + f16 + f15 + f18 + f17;
idx = Map[n];
Psi[idx] = psi - 0.5*rho_e;
}
}
extern "C" void ScaLBL_D3Q19_AAeven_Poisson_ElectricPotential(
int *Map, double *dist, double *Den_charge, double *Psi, double epsilon_LB, bool UseSlippingVelBC, int start, int finish, int Np) {
int n;
double psi; //electric potential
double rho_e; //local charge density
double f0, f1, f2, f3, f4, f5, f6, f7, f8, f9, f10, f11, f12, f13, f14, f15,
f16, f17, f18;
//double Gs;
int idx;
for (n = start; n < finish; n++) {
rho_e = (UseSlippingVelBC==1) ? 0.0 : Den_charge[n] / epsilon_LB;
//........................................................................
// q=0
f0 = dist[n];
f1 = dist[2 * Np + n];
f2 = dist[1 * Np + n];
f3 = dist[4 * Np + n];
f4 = dist[3 * Np + n];
f5 = dist[6 * Np + n];
f6 = dist[5 * Np + n];
f7 = dist[8 * Np + n];
f8 = dist[7 * Np + n];
f9 = dist[10 * Np + n];
f10 = dist[9 * Np + n];
f11 = dist[12 * Np + n];
f12 = dist[11 * Np + n];
f13 = dist[14 * Np + n];
f14 = dist[13 * Np + n];
f15 = dist[16 * Np + n];
f16 = dist[15 * Np + n];
f17 = dist[18 * Np + n];
f18 = dist[17 * Np + n];
psi = f0 + f2 + f1 + f4 + f3 + f6 + f5 + f8 + f7 + f10 + f9 + f12 +
f11 + f14 + f13 + f16 + f15 + f18 + f17;
idx = Map[n];
Psi[idx] = psi - 0.5*rho_e;
}
}
extern "C" void ScaLBL_D3Q19_AAodd_Poisson(int *neighborList, int *Map,
double *dist, double *Den_charge,
double *Psi, double *ElectricField,
double tau, double epsilon_LB, bool UseSlippingVelBC,
int start, int finish, int Np) {
int n;
double psi; //electric potential
double Ex, Ey, Ez; //electric field
double rho_e; //local charge density
double f0, f1, f2, f3, f4, f5, f6, f7, f8, f9, f10, f11, f12, f13, f14, f15,
f16, f17, f18;
int nr1, nr2, nr3, nr4, nr5, nr6, nr7, nr8, nr9, nr10, nr11, nr12, nr13,
nr14, nr15, nr16, nr17, nr18;
double sum_q;
double rlx = 1.0 / tau;
int idx;
double W0 = 0.5;
double W1 = 1.0/24.0;
double W2 = 1.0/48.0;
for (n = start; n < finish; n++) {
//Load data
//When Helmholtz-Smoluchowski slipping velocity BC is used, the bulk fluid is considered as electroneutral
//and thus the net space charge density is zero.
rho_e = (UseSlippingVelBC==1) ? 0.0 : Den_charge[n] / epsilon_LB;
// q=0
f0 = dist[n];
// q=1
nr1 = neighborList[n]; // neighbor 2 ( > 10Np => odd part of dist)
f1 = dist[nr1]; // reading the f1 data into register fq
nr2 = neighborList[n + Np]; // neighbor 1 ( < 10Np => even part of dist)
f2 = dist[nr2]; // reading the f2 data into register fq
// q=3
nr3 = neighborList[n + 2 * Np]; // neighbor 4
f3 = dist[nr3];
// q = 4
nr4 = neighborList[n + 3 * Np]; // neighbor 3
f4 = dist[nr4];
// q=5
nr5 = neighborList[n + 4 * Np];
f5 = dist[nr5];
// q = 6
nr6 = neighborList[n + 5 * Np];
f6 = dist[nr6];
// q=7
nr7 = neighborList[n + 6 * Np];
f7 = dist[nr7];
// q = 8
nr8 = neighborList[n + 7 * Np];
f8 = dist[nr8];
// q=9
nr9 = neighborList[n + 8 * Np];
f9 = dist[nr9];
// q = 10
nr10 = neighborList[n + 9 * Np];
f10 = dist[nr10];
// q=11
nr11 = neighborList[n + 10 * Np];
f11 = dist[nr11];
// q=12
nr12 = neighborList[n + 11 * Np];
f12 = dist[nr12];
// q=13
nr13 = neighborList[n + 12 * Np];
f13 = dist[nr13];
// q=14
nr14 = neighborList[n + 13 * Np];
f14 = dist[nr14];
// q=15
nr15 = neighborList[n + 14 * Np];
f15 = dist[nr15];
// q=16
nr16 = neighborList[n + 15 * Np];
f16 = dist[nr16];
// q=17
//fq = dist[18*Np+n];
nr17 = neighborList[n + 16 * Np];
f17 = dist[nr17];
// q=18
nr18 = neighborList[n + 17 * Np];
f18 = dist[nr18];
sum_q = f1+f2+f3+f4+f5+f6+f7+f8+f9+f10+f11+f12+f13+f14+f15+f16+f17+f18;
//error = 8.0*(sum_q - f0) + rho_e;
psi = 2.0*(f0*(1.0 - rlx) + rlx*(sum_q + 0.125*rho_e));
idx = Map[n];
Psi[idx] = psi;
Ex = (f1 - f2 + 0.5*(f7 - f8 + f9 - f10 + f11 - f12 + f13 - f14))*4.0; //NOTE the unit of electric field here is V/lu
Ey = (f3 - f4 + 0.5*(f7 - f8 - f9 + f10 + f15 - f16 + f17 - f18))*4.0;
Ez = (f5 - f6 + 0.5*(f11 - f12 - f13 + f14 + f15 - f16 - f17 + f18))*4.0;
ElectricField[n + 0 * Np] = Ex;
ElectricField[n + 1 * Np] = Ey;
ElectricField[n + 2 * Np] = Ez;
// q = 0
dist[n] = W0*psi; //f0 * (1.0 - rlx) - (1.0-0.5*rlx)*W0*rho_e;
// q = 1
dist[nr2] = W1*psi; //f1 * (1.0 - rlx) +W1* (rlx * psi) - (1.0-0.5*rlx)*0.05555555555555555*rho_e;
// q = 2
dist[nr1] = W1*psi; //f2 * (1.0 - rlx) +W1* (rlx * psi) - (1.0-0.5*rlx)*0.05555555555555555*rho_e;
// q = 3
dist[nr4] = W1*psi; //f3 * (1.0 - rlx) +W1* (rlx * psi) - (1.0-0.5*rlx)*0.05555555555555555*rho_e;
// q = 4
dist[nr3] = W1*psi; //f4 * (1.0 - rlx) +W1* (rlx * psi) - (1.0-0.5*rlx)*0.05555555555555555*rho_e;
// q = 5
dist[nr6] = W1*psi; //f5 * (1.0 - rlx) +W1* (rlx * psi) - (1.0-0.5*rlx)*0.05555555555555555*rho_e;
// q = 6
dist[nr5] = W1*psi; //f6 * (1.0 - rlx) +W1* (rlx * psi) - (1.0-0.5*rlx)*0.05555555555555555*rho_e;
//........................................................................
// q = 7
dist[nr8] = W2*psi; //f7 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
// q = 8
dist[nr7] = W2*psi; //f8 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
// q = 9
dist[nr10] = W2*psi; //f9 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
// q = 10
dist[nr9] = W2*psi; //f10 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
// q = 11
dist[nr12] = W2*psi; //f11 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
// q = 12
dist[nr11] = W2*psi; //f12 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
// q = 13
dist[nr14] = W2*psi; //f13 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
// q= 14
dist[nr13] = W2*psi; //f14 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
// q = 15
dist[nr16] = W2*psi; //f15 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
// q = 16
dist[nr15] = W2*psi; //f16 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
// q = 17
dist[nr18] = W2*psi; //f17 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
// q = 18
dist[nr17] = W2*psi; //f18 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
}
}
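The field evaluation in the kernel above applies the same stencil at every site, so the arithmetic can be checked in isolation on the host. The following is a minimal sketch under stated assumptions: the `Field` struct and the `electric_field` helper are hypothetical names introduced for illustration, not part of ScaLBL, and `f[q]` stands for the register the kernel calls `fq`.

```cpp
#include <array>
#include <cassert>

// Hypothetical container for the three field components (not a ScaLBL type).
struct Field {
    double x, y, z;
};

// Same arithmetic as the Ex/Ey/Ez lines in the kernel; f[q] plays the role of fq.
Field electric_field(const std::array<double, 19> &f) {
    Field E;
    E.x = (f[1] - f[2] + 0.5 * (f[7] - f[8] + f[9] - f[10] + f[11] - f[12] + f[13] - f[14])) * 4.0;
    E.y = (f[3] - f[4] + 0.5 * (f[7] - f[8] - f[9] + f[10] + f[15] - f[16] + f[17] - f[18])) * 4.0;
    E.z = (f[5] - f[6] + 0.5 * (f[11] - f[12] - f[13] + f[14] + f[15] - f[16] - f[17] + f[18])) * 4.0;
    return E;
}

// Sanity check: identical values in every direction cancel pairwise, so the field is zero.
void check_uniform_site_has_zero_field() {
    std::array<double, 19> f;
    f.fill(1.0);
    const Field E = electric_field(f);
    assert(E.x == 0.0 && E.y == 0.0 && E.z == 0.0);
}
```

Perturbing a single axis-aligned pair, for example raising `f[1]` above `f[2]`, changes only the x component, which is a quick way to confirm the direction labelling.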
extern "C" void ScaLBL_D3Q19_AAeven_Poisson(int *Map, double *dist,
double *Den_charge, double *Psi,
double *ElectricField, double *Error, double tau,
double epsilon_LB, bool UseSlippingVelBC,
int start, int finish, int Np) {
int n;
double psi; //electric potential
double Ex, Ey, Ez; //electric field
double rho_e; //local charge density
double f0, f1, f2, f3, f4, f5, f6, f7, f8, f9, f10, f11, f12, f13, f14, f15,
f16, f17, f18;
double error,sum_q;
double rlx = 1.0 / tau;
int idx;
double W0 = 0.5;
double W1 = 1.0/24.0;
double W2 = 1.0/48.0;
for (n = start; n < finish; n++) {
//Load data
//When Helmholtz-Smoluchowski slipping velocity BC is used, the bulk fluid is considered as electroneutral
//and thus the net space charge density is zero.
rho_e = (UseSlippingVelBC==1) ? 0.0 : Den_charge[n] / epsilon_LB;
f0 = dist[n];
f1 = dist[2 * Np + n];
f2 = dist[1 * Np + n];
f3 = dist[4 * Np + n];
f4 = dist[3 * Np + n];
f5 = dist[6 * Np + n];
f6 = dist[5 * Np + n];
f7 = dist[8 * Np + n];
f8 = dist[7 * Np + n];
f9 = dist[10 * Np + n];
f10 = dist[9 * Np + n];
f11 = dist[12 * Np + n];
f12 = dist[11 * Np + n];
f13 = dist[14 * Np + n];
f14 = dist[13 * Np + n];
f15 = dist[16 * Np + n];
f16 = dist[15 * Np + n];
f17 = dist[18 * Np + n];
f18 = dist[17 * Np + n];
/* Ex = (f1 - f2) * rlx *
4.0; //NOTE the unit of electric field here is V/lu
Ey = (f3 - f4) * rlx *
4.0; //factor 4.0 is D3Q7 lattice squared speed of sound
Ez = (f5 - f6) * rlx * 4.0;
*/
Ex = (f1 - f2 + 0.5*(f7 - f8 + f9 - f10 + f11 - f12 + f13 - f14))*4.0; //NOTE the unit of electric field here is V/lu
Ey = (f3 - f4 + 0.5*(f7 - f8 - f9 + f10 + f15 - f16 + f17 - f18))*4.0;
Ez = (f5 - f6 + 0.5*(f11 - f12 - f13 + f14 + f15 - f16 - f17 + f18))*4.0;
ElectricField[n + 0 * Np] = Ex;
ElectricField[n + 1 * Np] = Ey;
ElectricField[n + 2 * Np] = Ez;
sum_q = f1+f2+f3+f4+f5+f6+f7+f8+f9+f10+f11+f12+f13+f14+f15+f16+f17+f18;
error = 8.0*(sum_q - f0) + rho_e;
Error[n] = error;
psi = 2.0*(f0*(1.0 - rlx) + rlx*(sum_q + 0.125*rho_e));
idx = Map[n];
Psi[idx] = psi;
// q = 0
dist[n] = W0*psi;//
// q = 1
dist[1 * Np + n] = W1*psi;//f1 * (1.0 - rlx) +W1* (rlx * psi) - (1.0-0.5*rlx)*0.05555555555555555*rho_e;
// q = 2
dist[2 * Np + n] = W1*psi;//f2 * (1.0 - rlx) +W1* (rlx * psi) - (1.0-0.5*rlx)*0.05555555555555555*rho_e;
// q = 3
dist[3 * Np + n] = W1*psi;//f3 * (1.0 - rlx) +W1* (rlx * psi) - (1.0-0.5*rlx)*0.05555555555555555*rho_e;
// q = 4
dist[4 * Np + n] = W1*psi;//f4 * (1.0 - rlx) +W1* (rlx * psi) - (1.0-0.5*rlx)*0.05555555555555555*rho_e;
// q = 5
dist[5 * Np + n] = W1*psi;//f5 * (1.0 - rlx) +W1* (rlx * psi) - (1.0-0.5*rlx)*0.05555555555555555*rho_e;
// q = 6
dist[6 * Np + n] = W1*psi;//f6 * (1.0 - rlx) +W1* (rlx * psi) - (1.0-0.5*rlx)*0.05555555555555555*rho_e;
dist[7 * Np + n] = W2*psi;//f7 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
dist[8 * Np + n] = W2*psi;//f8* (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
dist[9 * Np + n] = W2*psi;//f9 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
dist[10 * Np + n] = W2*psi;//f10 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
dist[11 * Np + n] = W2*psi;//f11 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
dist[12 * Np + n] = W2*psi;//f12 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
dist[13 * Np + n] = W2*psi;//f13 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
dist[14 * Np + n] = W2*psi;//f14 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
dist[15 * Np + n] = W2*psi;//f15 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
dist[16 * Np + n] = W2*psi;//f16 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
dist[17 * Np + n] = W2*psi;//f17 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
dist[18 * Np + n] = W2*psi;//f18 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
//........................................................................
}
}
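A useful property of the update above is that a site already at equilibrium (every distribution equal to its weight times the potential) with zero charge density is left unchanged, and its residual is zero. The sketch below isolates that arithmetic for one site. The helper names `equilibrium`, `update_psi` and `residual` are hypothetical; only the weights and formulas are taken from the kernel.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Weights as they appear in the Poisson kernels: W0 for q=0, W1 for q=1..6, W2 for q=7..18.
constexpr double W0 = 0.5;
constexpr double W1 = 1.0 / 24.0;
constexpr double W2 = 1.0 / 48.0;

// Hypothetical helper: fill one site with W_q * psi, as ScaLBL_D3Q19_Poisson_Init does per site.
std::array<double, 19> equilibrium(double psi) {
    std::array<double, 19> f{};
    f[0] = W0 * psi;
    for (int q = 1; q <= 6; ++q) f[q] = W1 * psi;
    for (int q = 7; q <= 18; ++q) f[q] = W2 * psi;
    return f;
}

// Hypothetical helper: sum over the 18 moving directions (sum_q in the kernel).
double moving_sum(const std::array<double, 19> &f) {
    double sum_q = 0.0;
    for (int q = 1; q <= 18; ++q) sum_q += f[q];
    return sum_q;
}

// Potential update with the same arithmetic as the kernel.
double update_psi(const std::array<double, 19> &f, double rho_e, double tau) {
    assert(tau > 0.0);
    const double rlx = 1.0 / tau;
    return 2.0 * (f[0] * (1.0 - rlx) + rlx * (moving_sum(f) + 0.125 * rho_e));
}

// Residual written to Error[n] in the AAeven kernel.
double residual(const std::array<double, 19> &f, double rho_e) {
    return 8.0 * (moving_sum(f) - f[0]) + rho_e;
}
```

The fixed point follows from the weights: `W0 = 0.5` and `6*W1 + 12*W2 = 0.5`, so `f0` and `sum_q` are both half of the potential and the relaxation factor cancels.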
extern "C" void ScaLBL_D3Q19_AAeven_Poisson_Potential_BC_z(int *list, double *dist, double Vin, int count, int Np) {
double W1 = 1.0/24.0;
double W2 = 1.0/48.0;
for (int idx = 0; idx < count; idx++) {
int n = list[idx];
dist[6 * Np + n] = W1*Vin;
dist[12 * Np + n] = W2*Vin;
dist[13 * Np + n] = W2*Vin;
dist[16 * Np + n] = W2*Vin;
dist[17 * Np + n] = W2*Vin;
}
}
extern "C" void ScaLBL_D3Q19_AAeven_Poisson_Potential_BC_Z(int *list,
double *dist,
double Vout,
int count, int Np) {
double W1 = 1.0/24.0;
double W2 = 1.0/48.0;
for (int idx = 0; idx < count; idx++) {
int n = list[idx];
dist[5 * Np + n] = W1*Vout;
dist[11 * Np + n] = W2*Vout;
dist[14 * Np + n] = W2*Vout;
dist[15 * Np + n] = W2*Vout;
dist[18 * Np + n] = W2*Vout;
}
}
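Each of the two even-step routines above overwrites five slots per boundary site: one axis-aligned direction with weight `W1` and four diagonal directions with weight `W2`. A minimal per-site sketch, assuming a hypothetical `Site` array in which `slot[q]` stands for `dist[q * Np + n]`, shows that the written slots together carry one eighth of the imposed potential:

```cpp
#include <array>
#include <cassert>
#include <cmath>

constexpr double W1 = 1.0 / 24.0;
constexpr double W2 = 1.0 / 48.0;

// Per-site stand-in for the strided array: slot[q] corresponds to dist[q * Np + n].
using Site = std::array<double, 19>;

// Mirrors ScaLBL_D3Q19_AAeven_Poisson_Potential_BC_z for one site.
void potential_bc_z(Site &slot, double Vin) {
    slot[6] = W1 * Vin;
    slot[12] = W2 * Vin;
    slot[13] = W2 * Vin;
    slot[16] = W2 * Vin;
    slot[17] = W2 * Vin;
}

// Mirrors ScaLBL_D3Q19_AAeven_Poisson_Potential_BC_Z for one site.
void potential_bc_Z(Site &slot, double Vout) {
    slot[5] = W1 * Vout;
    slot[11] = W2 * Vout;
    slot[14] = W2 * Vout;
    slot[15] = W2 * Vout;
    slot[18] = W2 * Vout;
}

// Sum of all slots of a site.
double total(const Site &slot) {
    double s = 0.0;
    for (double v : slot) s += v;
    return s;
}

// The five written weights add up to W1 + 4*W2 = 1/8.
void check_inlet_weight() {
    Site s{};
    potential_bc_z(s, 1.0);
    assert(std::fabs(total(s) - 0.125) < 1e-15);
}
```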
extern "C" void ScaLBL_D3Q19_AAodd_Poisson_Potential_BC_z(int *d_neighborList,
int *list,
double *dist,
double Vin, int count,
int Np) {
double W1 = 1.0/24.0;
double W2 = 1.0/48.0;
int nr5, nr11, nr14, nr15, nr18;
for (int idx = 0; idx < count; idx++) {
int n = list[idx];
// Unknown distributions
nr5 = d_neighborList[n + 4 * Np];
nr11 = d_neighborList[n + 10 * Np];
nr15 = d_neighborList[n + 14 * Np];
nr14 = d_neighborList[n + 13 * Np];
nr18 = d_neighborList[n + 17 * Np];
dist[nr5] = W1*Vin;
dist[nr11] = W2*Vin;
dist[nr15] = W2*Vin;
dist[nr14] = W2*Vin;
dist[nr18] = W2*Vin;
}
}
extern "C" void ScaLBL_D3Q19_AAodd_Poisson_Potential_BC_Z(int *d_neighborList, int *list, double *dist, double Vout, int count, int Np) {
double W1 = 1.0/24.0;
double W2 = 1.0/48.0;
int nr6, nr12, nr13, nr16, nr17;
for (int idx = 0; idx < count; idx++) {
int n = list[idx];
// unknown distributions
nr6 = d_neighborList[n + 5 * Np];
nr12 = d_neighborList[n + 11 * Np];
nr16 = d_neighborList[n + 15 * Np];
nr17 = d_neighborList[n + 16 * Np];
nr13 = d_neighborList[n + 12 * Np];
dist[nr6] = W1*Vout;
dist[nr12] = W2*Vout;
dist[nr16] = W2*Vout;
dist[nr17] = W2*Vout;
dist[nr13] = W2*Vout;
}
}
extern "C" void ScaLBL_D3Q19_Poisson_Init(int *Map, double *dist, double *Psi,
int start, int finish, int Np) {
int n;
int ijk;
double W0 = 0.5;
double W1 = 1.0/24.0;
double W2 = 1.0/48.0;
for (n = start; n < finish; n++) {
ijk = Map[n];
dist[0 * Np + n] = W0 * Psi[ijk];//3333333333333333* Psi[ijk];
dist[1 * Np + n] = W1 * Psi[ijk];
dist[2 * Np + n] = W1 * Psi[ijk];
dist[3 * Np + n] = W1 * Psi[ijk];
dist[4 * Np + n] = W1 * Psi[ijk];
dist[5 * Np + n] = W1 * Psi[ijk];
dist[6 * Np + n] = W1 * Psi[ijk];
dist[7 * Np + n] = W2* Psi[ijk];
dist[8 * Np + n] = W2* Psi[ijk];
dist[9 * Np + n] = W2* Psi[ijk];
dist[10 * Np + n] = W2* Psi[ijk];
dist[11 * Np + n] = W2* Psi[ijk];
dist[12 * Np + n] = W2* Psi[ijk];
dist[13 * Np + n] = W2* Psi[ijk];
dist[14 * Np + n] = W2* Psi[ijk];
dist[15 * Np + n] = W2* Psi[ijk];
dist[16 * Np + n] = W2* Psi[ijk];
dist[17 * Np + n] = W2* Psi[ijk];
dist[18 * Np + n] = W2* Psi[ijk];
}
}

@@ -4,7 +4,7 @@ extern "C" void ScaLBL_D3Q19_AAeven_StokesMRT(
double *dist, double *Velocity, double *ChargeDensity,
double *ElectricField, double rlx_setA, double rlx_setB, double Gx,
double Gy, double Gz, double rho0, double den_scale, double h,
double time_conv, int start, int finish, int Np) {
double time_conv, bool UseSlippingVelBC, int start, int finish, int Np) {
double fq;
// conserved moments
double rho, jx, jy, jz;
@@ -38,13 +38,11 @@ extern "C" void ScaLBL_D3Q19_AAeven_StokesMRT(
Ey = ElectricField[n + 1 * Np];
Ez = ElectricField[n + 2 * Np];
//compute total body force, including input body force (Gx,Gy,Gz)
Fx =
Gx +
rhoE * Ex * (time_conv * time_conv) / (h * h * 1.0e-12) /
den_scale; //the extra factors at the end necessarily convert unit from phys to LB
Fy = Gy + rhoE * Ey * (time_conv * time_conv) / (h * h * 1.0e-12) /
Fx = (UseSlippingVelBC==1) ? Gx : Gx + rhoE * Ex * (time_conv * time_conv) / (h * h * 1.0e-12) /
den_scale; //the trailing factors convert units from physical to LB
Fy = (UseSlippingVelBC==1) ? Gy : Gy + rhoE * Ey * (time_conv * time_conv) / (h * h * 1.0e-12) /
den_scale;
Fz = Gz + rhoE * Ez * (time_conv * time_conv) / (h * h * 1.0e-12) /
Fz = (UseSlippingVelBC==1) ? Gz : Gz + rhoE * Ez * (time_conv * time_conv) / (h * h * 1.0e-12) /
den_scale;
// q=0
@@ -479,7 +477,7 @@ extern "C" void ScaLBL_D3Q19_AAodd_StokesMRT(
int *neighborList, double *dist, double *Velocity, double *ChargeDensity,
double *ElectricField, double rlx_setA, double rlx_setB, double Gx,
double Gy, double Gz, double rho0, double den_scale, double h,
double time_conv, int start, int finish, int Np) {
double time_conv, bool UseSlippingVelBC, int start, int finish, int Np) {
double fq;
// conserved moments
double rho, jx, jy, jz;
@@ -513,12 +511,21 @@ extern "C" void ScaLBL_D3Q19_AAodd_StokesMRT(
Ex = ElectricField[n + 0 * Np];
Ey = ElectricField[n + 1 * Np];
Ez = ElectricField[n + 2 * Np];
//compute total body force, including input body force (Gx,Gy,Gz)
Fx = Gx + rhoE * Ex * (time_conv * time_conv) / (h * h * 1.0e-12) /
//Fx = Gx + rhoE * Ex * (time_conv * time_conv) / (h * h * 1.0e-12) /
// den_scale; //the extra factors at the end necessarily convert unit from phys to LB
//Fy = Gy + rhoE * Ey * (time_conv * time_conv) / (h * h * 1.0e-12) /
// den_scale;
//Fz = Gz + rhoE * Ez * (time_conv * time_conv) / (h * h * 1.0e-12) /
// den_scale;
//When the Helmholtz-Smoluchowski slipping velocity BC is used, the bulk fluid is treated as electroneutral,
//so the body force induced by the external electric field is replaced by the slipping velocity BC.
Fx = (UseSlippingVelBC==1) ? Gx : Gx + rhoE * Ex * (time_conv * time_conv) / (h * h * 1.0e-12) /
den_scale; //the trailing factors convert units from physical to LB
Fy = (UseSlippingVelBC==1) ? Gy : Gy + rhoE * Ey * (time_conv * time_conv) / (h * h * 1.0e-12) /
den_scale;
Fy = Gy + rhoE * Ey * (time_conv * time_conv) / (h * h * 1.0e-12) /
den_scale;
Fz = Gz + rhoE * Ez * (time_conv * time_conv) / (h * h * 1.0e-12) /
Fz = (UseSlippingVelBC==1) ? Gz : Gz + rhoE * Ez * (time_conv * time_conv) / (h * h * 1.0e-12) /
den_scale;
// q=0
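The effect of the new `UseSlippingVelBC` flag on the forcing term can be summarised per component. The sketch below is a scalar restatement of the expression in the hunks above; the function name `body_force` is hypothetical, and the parameter meanings given in the comments are inferred from the variable names rather than confirmed by the source.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper reproducing one component of the body-force expression.
// G: imposed body force; rhoE: charge density; E: electric field component.
// time_conv, h, den_scale: the conversion factors passed to the StokesMRT kernels.
double body_force(double G, double rhoE, double E, double time_conv, double h,
                  double den_scale, bool UseSlippingVelBC) {
    assert(h > 0.0 && den_scale != 0.0);
    if (UseSlippingVelBC) {
        return G; // electric forcing is represented by the slip BC instead
    }
    return G + rhoE * E * (time_conv * time_conv) / (h * h * 1.0e-12) / den_scale;
}
```

With the flag set, only the imposed force `(Gx, Gy, Gz)` reaches the collision step; with it cleared, the expression matches the pre-change code.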

@@ -1,7 +1,7 @@
#include <stdio.h>
#define NBLOCKS 1024
#define NTHREADS 256
#define NTHREADS 512
__global__ void dvc_ScaLBL_D3Q19_AAeven_BGK(double *dist, int start, int finish, int Np, double rlx, double Fx, double Fy, double Fz){
int n;

@@ -290,7 +290,7 @@ __global__ void dvc_ScaLBL_D3Q19_Swap_Compact(int *neighborList, double *distev
//__launch_bounds__(512,4)
__global__ void
dvc_ScaLBL_AAodd_Compact(char * ID, int *d_neighborList, double *dist, int Np) {
dvc_ScaLBL_AAodd_Compact(int *d_neighborList, double *dist, int Np) {
int n;
double f0,f1,f2,f3,f4,f5,f6,f7,f8,f9;
@@ -1321,7 +1321,7 @@ dvc_ScaLBL_AAeven_MRT(double *dist, int start, int finish, int Np, double rlx_se
//__launch_bounds__(512,4)
__global__ void dvc_ScaLBL_AAeven_Compact(char * ID, double *dist, int Np) {
__global__ void dvc_ScaLBL_AAeven_Compact( double *dist, int Np) {
int n;
double f0,f1,f2,f3,f4,f5,f6,f7,f8,f9;
@@ -2390,18 +2390,18 @@ extern "C" void ScaLBL_D3Q19_Swap_Compact(int *neighborList, double *disteven, d
}
}
extern "C" void ScaLBL_D3Q19_AAeven_Compact(char * ID, double *d_dist, int Np) {
extern "C" void ScaLBL_D3Q19_AAeven_Compact( double *d_dist, int Np) {
cudaFuncSetCacheConfig(dvc_ScaLBL_AAeven_Compact, cudaFuncCachePreferL1);
dvc_ScaLBL_AAeven_Compact<<<NBLOCKS,NTHREADS>>>(ID, d_dist, Np);
dvc_ScaLBL_AAeven_Compact<<<NBLOCKS,NTHREADS>>>(d_dist, Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in ScaLBL_D3Q19_AAeven_Compact: %s \n",cudaGetErrorString(err));
}
}
extern "C" void ScaLBL_D3Q19_AAodd_Compact(char * ID, int *d_neighborList, double *d_dist, int Np) {
extern "C" void ScaLBL_D3Q19_AAodd_Compact( int *d_neighborList, double *d_dist, int Np) {
cudaFuncSetCacheConfig(dvc_ScaLBL_AAodd_Compact, cudaFuncCachePreferL1);
dvc_ScaLBL_AAodd_Compact<<<NBLOCKS,NTHREADS>>>(ID,d_neighborList, d_dist,Np);
dvc_ScaLBL_AAodd_Compact<<<NBLOCKS,NTHREADS>>>(d_neighborList, d_dist,Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in ScaLBL_D3Q19_AAodd_Compact: %s \n",cudaGetErrorString(err));

@@ -6,6 +6,16 @@
#define NTHREADS 256
#define CHECK_ERROR(KERNEL) \
do { \
auto err = cudaGetLastError(); \
if ( cudaSuccess != err ){ \
auto errString = cudaGetErrorString(err); \
printf("error in %s (kernel): %s \n",KERNEL,errString); \
} \
} while(0)
__global__ void dvc_ScaLBL_Solid_Dirichlet_D3Q7(double *dist, double *BoundaryValue, int *BounceBackDist_list, int *BounceBackSolid_list, int count)
{
@@ -38,6 +48,28 @@ __global__ void dvc_ScaLBL_Solid_Neumann_D3Q7(double *dist, double *BoundaryValu
}
}
__global__ void dvc_ScaLBL_Solid_DirichletAndNeumann_D3Q7(double *dist, double *BoundaryValue,int *BoundaryLabel, int *BounceBackDist_list, int *BounceBackSolid_list, int count)
{
int idx;
int iq,ib;
double value_b,value_b_label,value_q;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
iq = BounceBackDist_list[idx];
ib = BounceBackSolid_list[idx];
value_b = BoundaryValue[ib];//get boundary value from a solid site
value_b_label = BoundaryLabel[ib];//get boundary label (i.e. type of BC) from a solid site
value_q = dist[iq];
if (value_b_label==1){//Dirichlet BC
dist[iq] = -1.0*value_q + value_b*0.25;//NOTE 0.25 is the squared speed of sound (c_s^2) of the D3Q7 lattice
}
if (value_b_label==2){//Neumann BC
dist[iq] = value_q + value_b;
}
}
}
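The per-link rule in the combined kernel above depends only on the boundary label read from the solid site. A scalar sketch of that rule, with the hypothetical name `apply_solid_bc`, makes the three cases explicit (label 1, label 2, and any other label, for which the kernel leaves the distribution untouched):

```cpp
#include <cassert>

// Hypothetical scalar version of the per-link update in the kernel.
// label 1: Dirichlet, label 2: Neumann, anything else: value returned unchanged.
double apply_solid_bc(int label, double value_q, double value_b) {
    if (label == 1) {
        return -1.0 * value_q + value_b * 0.25;
    }
    if (label == 2) {
        return value_q + value_b;
    }
    return value_q;
}

// A link whose label is neither 1 nor 2 must pass through unchanged.
void check_unlabelled_link_is_untouched() {
    assert(apply_solid_bc(0, 0.5, 2.0) == 0.5);
}
```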
__global__ void dvc_ScaLBL_Solid_SlippingVelocityBC_D3Q19(double *dist, double *zeta_potential, double *ElectricField, double *SolidGrad,
double epsilon_LB, double tau, double rho0,double den_scale, double h, double time_conv,
int *BounceBackDist_list, int *BounceBackSolid_list, int *FluidBoundary_list,
@@ -718,19 +750,19 @@ __global__ void dvc_ScaLBL_D3Q7_AAodd_Ion_Flux_DiffAdvcElec_BC_Z(int *d_neighbor
extern "C" void ScaLBL_Solid_Dirichlet_D3Q7(double *dist, double *BoundaryValue, int *BounceBackDist_list, int *BounceBackSolid_list, int count){
int GRID = count / 512 + 1;
dvc_ScaLBL_Solid_Dirichlet_D3Q7<<<GRID,512>>>(dist, BoundaryValue, BounceBackDist_list, BounceBackSolid_list, count);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in ScaLBL_Solid_Dirichlet_D3Q7 (kernel): %s \n",cudaGetErrorString(err));
}
CHECK_ERROR("ScaLBL_Solid_Dirichlet_D3Q7");
}
extern "C" void ScaLBL_Solid_Neumann_D3Q7(double *dist, double *BoundaryValue, int *BounceBackDist_list, int *BounceBackSolid_list, int count){
int GRID = count / 512 + 1;
dvc_ScaLBL_Solid_Neumann_D3Q7<<<GRID,512>>>(dist, BoundaryValue, BounceBackDist_list, BounceBackSolid_list, count);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in ScaLBL_Solid_Neumann_D3Q7 (kernel): %s \n",cudaGetErrorString(err));
}
CHECK_ERROR("ScaLBL_Solid_Neumann_D3Q7");
}
extern "C" void ScaLBL_Solid_DirichletAndNeumann_D3Q7(double *dist, double *BoundaryValue,int *BoundaryLabel, int *BounceBackDist_list, int *BounceBackSolid_list, int count){
int GRID = count / 512 + 1;
dvc_ScaLBL_Solid_DirichletAndNeumann_D3Q7<<<GRID,512>>>(dist, BoundaryValue, BoundaryLabel, BounceBackDist_list, BounceBackSolid_list, count);
CHECK_ERROR("ScaLBL_Solid_DirichletAndNeumann_D3Q7");
}
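Every wrapper in this file sizes its launch as `count / 512 + 1`. The host-only sketch below compares that with conventional ceiling division; `kThreads`, `grid_as_written` and `grid_ceil` are hypothetical names used only for this comparison. The two agree except when `count` is a multiple of 512, where the wrappers launch one extra block whose threads all fail the `idx < count` guard in the kernel.

```cpp
#include <cassert>

constexpr int kThreads = 512; // block size hard-coded in the wrappers

// Block count exactly as the wrappers compute it.
int grid_as_written(int count) {
    return count / kThreads + 1;
}

// Conventional ceiling division, shown for comparison only.
int grid_ceil(int count) {
    return (count + kThreads - 1) / kThreads;
}

// Both forms cover every index in [0, count) and differ by at most one block.
void check_coverage(int count) {
    assert(grid_as_written(count) * kThreads >= count);
    assert(grid_ceil(count) * kThreads >= count);
    assert(grid_as_written(count) - grid_ceil(count) <= 1);
}
```

The `+ 1` form never yields zero blocks, which matters because CUDA rejects a launch with a zero-sized grid as an invalid configuration; ceiling division would need a separate `count == 0` check.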
extern "C" void ScaLBL_Solid_SlippingVelocityBC_D3Q19(double *dist, double *zeta_potential, double *ElectricField, double *SolidGrad,
@@ -744,211 +776,142 @@ extern "C" void ScaLBL_Solid_SlippingVelocityBC_D3Q19(double *dist, double *zeta
BounceBackDist_list, BounceBackSolid_list, FluidBoundary_list,
lattice_weight, lattice_cx, lattice_cy, lattice_cz,
count, Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in ScaLBL_Solid_SlippingVelocityBC_D3Q19 (kernel): %s \n",cudaGetErrorString(err));
}
CHECK_ERROR("ScaLBL_Solid_SlippingVelocityBC_D3Q19");
}
extern "C" void ScaLBL_D3Q7_AAeven_Poisson_Potential_BC_z(int *list, double *dist, double Vin, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAeven_Poisson_Potential_BC_z<<<GRID,512>>>(list, dist, Vin, count, Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in ScaLBL_D3Q7_AAeven_Poisson_Potential_BC_z (kernel): %s \n",cudaGetErrorString(err));
}
CHECK_ERROR("ScaLBL_D3Q7_AAeven_Poisson_Potential_BC_z");
}
extern "C" void ScaLBL_D3Q7_AAeven_Poisson_Potential_BC_Z(int *list, double *dist, double Vout, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAeven_Poisson_Potential_BC_Z<<<GRID,512>>>(list, dist, Vout, count, Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in ScaLBL_D3Q7_AAeven_Poisson_Potential_BC_Z (kernel): %s \n",cudaGetErrorString(err));
}
CHECK_ERROR("ScaLBL_D3Q7_AAeven_Poisson_Potential_BC_Z");
}
extern "C" void ScaLBL_D3Q7_AAodd_Poisson_Potential_BC_z(int *d_neighborList, int *list, double *dist, double Vin, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAodd_Poisson_Potential_BC_z<<<GRID,512>>>(d_neighborList, list, dist, Vin, count, Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in ScaLBL_D3Q7_AAodd_Poisson_Potential_BC_z (kernel): %s \n",cudaGetErrorString(err));
}
CHECK_ERROR("ScaLBL_D3Q7_AAodd_Poisson_Potential_BC_z");
}
extern "C" void ScaLBL_D3Q7_AAodd_Poisson_Potential_BC_Z(int *d_neighborList, int *list, double *dist, double Vout, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAodd_Poisson_Potential_BC_Z<<<GRID,512>>>(d_neighborList, list, dist, Vout, count, Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in ScaLBL_D3Q7_AAodd_Poisson_Potential_BC_Z (kernel): %s \n",cudaGetErrorString(err));
}
CHECK_ERROR("ScaLBL_D3Q7_AAodd_Poisson_Potential_BC_Z");
}
extern "C" void ScaLBL_Poisson_D3Q7_BC_z(int *list, int *Map, double *Psi, double Vin, int count){
int GRID = count / 512 + 1;
dvc_ScaLBL_Poisson_D3Q7_BC_z<<<GRID,512>>>(list, Map, Psi, Vin, count);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in ScaLBL_Poisson_D3Q7_BC_z (kernel): %s \n",cudaGetErrorString(err));
}
CHECK_ERROR("ScaLBL_Poisson_D3Q7_BC_z");
}
extern "C" void ScaLBL_Poisson_D3Q7_BC_Z(int *list, int *Map, double *Psi, double Vout, int count){
int GRID = count / 512 + 1;
dvc_ScaLBL_Poisson_D3Q7_BC_Z<<<GRID,512>>>(list, Map, Psi, Vout, count);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in ScaLBL_Poisson_D3Q7_BC_Z (kernel): %s \n",cudaGetErrorString(err));
}
CHECK_ERROR("ScaLBL_Poisson_D3Q7_BC_Z");
}
extern "C" void ScaLBL_D3Q7_AAeven_Ion_Concentration_BC_z(int *list, double *dist, double Cin, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAeven_Ion_Concentration_BC_z<<<GRID,512>>>(list, dist, Cin, count, Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in ScaLBL_D3Q7_AAeven_Ion_Concentration_BC_z (kernel): %s \n",cudaGetErrorString(err));
}
CHECK_ERROR("ScaLBL_D3Q7_AAeven_Ion_Concentration_BC_z");
}
extern "C" void ScaLBL_D3Q7_AAeven_Ion_Concentration_BC_Z(int *list, double *dist, double Cout, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAeven_Ion_Concentration_BC_Z<<<GRID,512>>>(list, dist, Cout, count, Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in ScaLBL_D3Q7_AAeven_Ion_Concentration_BC_Z (kernel): %s \n",cudaGetErrorString(err));
}
CHECK_ERROR("ScaLBL_D3Q7_AAeven_Ion_Concentration_BC_Z");
}
extern "C" void ScaLBL_D3Q7_AAodd_Ion_Concentration_BC_z(int *d_neighborList, int *list, double *dist, double Cin, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAodd_Ion_Concentration_BC_z<<<GRID,512>>>(d_neighborList, list, dist, Cin, count, Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in ScaLBL_D3Q7_AAodd_Ion_Concentration_BC_z (kernel): %s \n",cudaGetErrorString(err));
}
CHECK_ERROR("ScaLBL_D3Q7_AAodd_Ion_Concentration_BC_z");
}
extern "C" void ScaLBL_D3Q7_AAodd_Ion_Concentration_BC_Z(int *d_neighborList, int *list, double *dist, double Cout, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAodd_Ion_Concentration_BC_Z<<<GRID,512>>>(d_neighborList, list, dist, Cout, count, Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in ScaLBL_D3Q7_AAodd_Ion_Concentration_BC_Z (kernel): %s \n",cudaGetErrorString(err));
}
CHECK_ERROR("ScaLBL_D3Q7_AAodd_Ion_Concentration_BC_Z");
}
//------------Diff-----------------
extern "C" void ScaLBL_D3Q7_AAeven_Ion_Flux_Diff_BC_z(int *list, double *dist, double FluxIn, double tau, double *VelocityZ, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAeven_Ion_Flux_Diff_BC_z<<<GRID,512>>>(list, dist, FluxIn, tau, VelocityZ, count, Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in ScaLBL_D3Q7_AAeven_Ion_Flux_Diff_BC_z (kernel): %s \n",cudaGetErrorString(err));
}
CHECK_ERROR("ScaLBL_D3Q7_AAeven_Ion_Flux_Diff_BC_z");
}
extern "C" void ScaLBL_D3Q7_AAeven_Ion_Flux_Diff_BC_Z(int *list, double *dist, double FluxIn, double tau, double *VelocityZ, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAeven_Ion_Flux_Diff_BC_Z<<<GRID,512>>>(list, dist, FluxIn, tau, VelocityZ, count, Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in ScaLBL_D3Q7_AAeven_Ion_Flux_Diff_BC_Z (kernel): %s \n",cudaGetErrorString(err));
}
CHECK_ERROR("ScaLBL_D3Q7_AAeven_Ion_Flux_Diff_BC_Z");
}
extern "C" void ScaLBL_D3Q7_AAodd_Ion_Flux_Diff_BC_z(int *d_neighborList, int *list, double *dist, double FluxIn, double tau, double *VelocityZ, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAodd_Ion_Flux_Diff_BC_z<<<GRID,512>>>(d_neighborList, list, dist, FluxIn, tau, VelocityZ, count, Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in ScaLBL_D3Q7_AAodd_Ion_Flux_Diff_BC_z (kernel): %s \n",cudaGetErrorString(err));
}
CHECK_ERROR("ScaLBL_D3Q7_AAodd_Ion_Flux_Diff_BC_z");
}
extern "C" void ScaLBL_D3Q7_AAodd_Ion_Flux_Diff_BC_Z(int *d_neighborList, int *list, double *dist, double FluxIn, double tau, double *VelocityZ, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAodd_Ion_Flux_Diff_BC_Z<<<GRID,512>>>(d_neighborList, list, dist, FluxIn, tau, VelocityZ, count, Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in ScaLBL_D3Q7_AAodd_Ion_Flux_Diff_BC_Z (kernel): %s \n",cudaGetErrorString(err));
}
CHECK_ERROR("ScaLBL_D3Q7_AAodd_Ion_Flux_Diff_BC_Z");
}
//----------DiffAdvc-------------
extern "C" void ScaLBL_D3Q7_AAeven_Ion_Flux_DiffAdvc_BC_z(int *list, double *dist, double FluxIn, double tau, double *VelocityZ, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAeven_Ion_Flux_DiffAdvc_BC_z<<<GRID,512>>>(list, dist, FluxIn, tau, VelocityZ, count, Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in ScaLBL_D3Q7_AAeven_Ion_Flux_DiffAdvc_BC_z (kernel): %s \n",cudaGetErrorString(err));
}
CHECK_ERROR("ScaLBL_D3Q7_AAeven_Ion_Flux_DiffAdvc_BC_z");
}
extern "C" void ScaLBL_D3Q7_AAeven_Ion_Flux_DiffAdvc_BC_Z(int *list, double *dist, double FluxIn, double tau, double *VelocityZ, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAeven_Ion_Flux_DiffAdvc_BC_Z<<<GRID,512>>>(list, dist, FluxIn, tau, VelocityZ, count, Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in ScaLBL_D3Q7_AAeven_Ion_Flux_DiffAdvc_BC_Z (kernel): %s \n",cudaGetErrorString(err));
}
CHECK_ERROR("ScaLBL_D3Q7_AAeven_Ion_Flux_DiffAdvc_BC_Z");
}
extern "C" void ScaLBL_D3Q7_AAodd_Ion_Flux_DiffAdvc_BC_z(int *d_neighborList, int *list, double *dist, double FluxIn, double tau, double *VelocityZ, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAodd_Ion_Flux_DiffAdvc_BC_z<<<GRID,512>>>(d_neighborList, list, dist, FluxIn, tau, VelocityZ, count, Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in ScaLBL_D3Q7_AAodd_Ion_Flux_DiffAdvc_BC_z (kernel): %s \n",cudaGetErrorString(err));
}
CHECK_ERROR("ScaLBL_D3Q7_AAodd_Ion_Flux_DiffAdvc_BC_z");
}
extern "C" void ScaLBL_D3Q7_AAodd_Ion_Flux_DiffAdvc_BC_Z(int *d_neighborList, int *list, double *dist, double FluxIn, double tau, double *VelocityZ, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAodd_Ion_Flux_DiffAdvc_BC_Z<<<GRID,512>>>(d_neighborList, list, dist, FluxIn, tau, VelocityZ, count, Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in ScaLBL_D3Q7_AAodd_Ion_Flux_DiffAdvc_BC_Z (kernel): %s \n",cudaGetErrorString(err));
}
CHECK_ERROR("ScaLBL_D3Q7_AAodd_Ion_Flux_DiffAdvc_BC_Z");
}
//----------DiffAdvcElec-------------
extern "C" void ScaLBL_D3Q7_AAeven_Ion_Flux_DiffAdvcElec_BC_z(int *list, double *dist, double FluxIn, double tau, double *VelocityZ, double *ElectricField_Z,
double Di, double zi, double Vt, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAeven_Ion_Flux_DiffAdvcElec_BC_z<<<GRID,512>>>(list, dist, FluxIn, tau, VelocityZ, ElectricField_Z, Di, zi, Vt, count, Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in ScaLBL_D3Q7_AAeven_Ion_Flux_DiffAdvcElec_BC_z (kernel): %s \n",cudaGetErrorString(err));
}
CHECK_ERROR("ScaLBL_D3Q7_AAeven_Ion_Flux_DiffAdvcElec_BC_z");
}
extern "C" void ScaLBL_D3Q7_AAeven_Ion_Flux_DiffAdvcElec_BC_Z(int *list, double *dist, double FluxIn, double tau, double *VelocityZ, double *ElectricField_Z,
double Di, double zi, double Vt, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAeven_Ion_Flux_DiffAdvcElec_BC_Z<<<GRID,512>>>(list, dist, FluxIn, tau, VelocityZ, ElectricField_Z, Di, zi, Vt, count, Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in ScaLBL_D3Q7_AAeven_Ion_Flux_DiffAdvcElec_BC_Z (kernel): %s \n",cudaGetErrorString(err));
}
CHECK_ERROR("ScaLBL_D3Q7_AAeven_Ion_Flux_DiffAdvcElec_BC_Z");
}
extern "C" void ScaLBL_D3Q7_AAodd_Ion_Flux_DiffAdvcElec_BC_z(int *d_neighborList, int *list, double *dist, double FluxIn, double tau, double *VelocityZ, double *ElectricField_Z,
double Di, double zi, double Vt, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAodd_Ion_Flux_DiffAdvcElec_BC_z<<<GRID,512>>>(d_neighborList, list, dist, FluxIn, tau, VelocityZ, ElectricField_Z, Di, zi, Vt, count, Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in ScaLBL_D3Q7_AAodd_Ion_Flux_DiffAdvcElec_BC_z (kernel): %s \n",cudaGetErrorString(err));
}
CHECK_ERROR("ScaLBL_D3Q7_AAodd_Ion_Flux_DiffAdvcElec_BC_z");
}
extern "C" void ScaLBL_D3Q7_AAodd_Ion_Flux_DiffAdvcElec_BC_Z(int *d_neighborList, int *list, double *dist, double FluxIn, double tau, double *VelocityZ, double *ElectricField_Z,
double Di, double zi, double Vt, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAodd_Ion_Flux_DiffAdvcElec_BC_Z<<<GRID,512>>>(d_neighborList, list, dist, FluxIn, tau, VelocityZ, ElectricField_Z, Di, zi, Vt, count, Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in ScaLBL_D3Q7_AAodd_Ion_Flux_DiffAdvcElec_BC_Z (kernel): %s \n",cudaGetErrorString(err));
}
CHECK_ERROR("ScaLBL_D3Q7_AAodd_Ion_Flux_DiffAdvcElec_BC_Z");
}
//-------------------------------

@@ -3,7 +3,207 @@
//#include <cuda_profiler_api.h>
#define NBLOCKS 1024
#define NTHREADS 256
#define NTHREADS 512
extern "C" void Membrane_D3Q19_Unpack(int q, int *list, int *links, int start, int linkCount,
double *recvbuf, double *dist, int N) {
//....................................................................................
// Unpack distribution from the recv buffer
// Distribution q matches Cqx, Cqy, Cqz
// swap rule means that the distributions in recvbuf are OPPOSITE of q
// dist may be even or odd distributions stored by stream layout
//....................................................................................
int n, idx, link;
for (link=0; link<linkCount; link++){
idx = links[start+link];
// Get the value from the list -- note that n is the index from the send (non-local) process
n = list[start + idx];
// unpack the distribution to the proper location
if (!(n < 0))
dist[q * N + n] = recvbuf[start + idx];
}
}
extern "C" void Membrane_D3Q19_Transport(int q, int *list, int *links, double *coef, int start, int offset,
int linkCount, double *recvbuf, double *dist, int N){
//....................................................................................
// Unpack distribution from the recv buffer
// Distribution q matches Cqx, Cqy, Cqz
// swap rule means that the distributions in recvbuf are OPPOSITE of q
// dist may be even or odd distributions stored by stream layout
//....................................................................................
int n, idx, link;
double alpha;
for (link=offset; link<linkCount; link++){
idx = list[start+link];
// Get the value from the list -- note that n is the index from the send (non-local) process
n = list[start + idx];
alpha = coef[start + idx];
// unpack the distribution to the proper location
if (!(n < 0))
dist[q * N + n] = alpha*recvbuf[start + idx];
}
}
__global__ void dvc_ScaLBL_D3Q7_Membrane_AssignLinkCoef(int *membrane, int *Map, double *Distance, double *Psi, double *coef,
double Threshold, double MassFractionIn, double MassFractionOut, double ThresholdMassFractionIn, double ThresholdMassFractionOut,
int memLinks, int Nx, int Ny, int Nz, int Np){
int link,iq,ip,nq,np,nqm,npm;
double aq, ap, membranePotential;
/* Interior Links */
int S = memLinks/NBLOCKS/NTHREADS + 1;
for (int s=0; s<S; s++){
//........Get 1-D index for this thread....................
link = S*blockIdx.x*blockDim.x + s*blockDim.x + threadIdx.x;
if (link < memLinks) {
// inside //outside
aq = MassFractionIn; ap = MassFractionOut;
iq = membrane[2*link]; ip = membrane[2*link+1];
nq = iq%Np; np = ip%Np;
nqm = Map[nq]; npm = Map[np]; // strided layout
/* membrane potential for this link */
membranePotential = Psi[nqm] - Psi[npm];
if (membranePotential > Threshold){
aq = ThresholdMassFractionIn; ap = ThresholdMassFractionOut;
}
/* Save the mass transfer coefficients */
coef[2*link] = aq; coef[2*link+1] = ap;
}
}
}
__global__ void dvc_ScaLBL_D3Q7_Membrane_AssignLinkCoef_halo(
const int Cqx, const int Cqy, int const Cqz,
int *Map, double *Distance, double *Psi, double Threshold,
double MassFractionIn, double MassFractionOut, double ThresholdMassFractionIn, double ThresholdMassFractionOut,
int *d3q7_recvlist, int *d3q7_linkList, double *coef, int start, int nlinks, int count,
const int N, const int Nx, const int Ny, const int Nz) {
//....................................................................................
// Unpack distribution from the recv buffer
// Distribution q matches Cqx, Cqy, Cqz
// swap rule means that the distributions in recvbuf are OPPOSITE of q
// dist may be even or odd distributions stored by stream layout
//....................................................................................
int n, idx, nqm, npm, label, i, j, k;
double distanceLocal, distanceNonlocal;
double psiLocal, psiNonlocal, membranePotential;
double ap,aq; // coefficient
/* second enforce custom rule for membrane links */
int S = (count-nlinks)/NBLOCKS/NTHREADS + 1;
for (int s=0; s<S; s++){
//........Get 1-D index for this thread....................
idx = S*blockIdx.x*blockDim.x + s*blockDim.x + threadIdx.x;
if (idx < count) {
n = d3q7_recvlist[idx];
label = d3q7_linkList[idx];
ap = 1.0; // regular streaming rule
aq = 1.0;
if (label > 0 && !(n < 0)){
nqm = Map[n];
distanceLocal = Distance[nqm];
psiLocal = Psi[nqm];
// Get the 3-D indices from the send process
k = nqm/(Nx*Ny); j = (nqm-Nx*Ny*k)/Nx; i = nqm-Nx*Ny*k-Nx*j;
// Streaming link the non-local distribution
i -= Cqx; j -= Cqy; k -= Cqz;
npm = k*Nx*Ny + j*Nx + i;
distanceNonlocal = Distance[npm];
psiNonlocal = Psi[npm];
membranePotential = psiLocal - psiNonlocal;
aq = MassFractionIn;
ap = MassFractionOut;
/* link is inside membrane */
if (distanceLocal > 0.0){
if (membranePotential < Threshold*(-1.0)){
ap = MassFractionIn;
aq = MassFractionOut;
}
else {
ap = ThresholdMassFractionIn;
aq = ThresholdMassFractionOut;
}
}
else if (membranePotential > Threshold){
aq = ThresholdMassFractionIn;
ap = ThresholdMassFractionOut;
}
}
coef[2*idx]=aq;
coef[2*idx+1]=ap;
}
}
}
__global__ void dvc_ScaLBL_D3Q7_Membrane_Unpack(int q,
int *d3q7_recvlist, double *recvbuf, int count,
double *dist, int N, double *coef) {
//....................................................................................
// Unpack distribution from the recv buffer
// Distribution q matches Cqx, Cqy, Cqz
// swap rule means that the distributions in recvbuf are OPPOSITE of q
// dist may be even or odd distributions stored by stream layout
//....................................................................................
int n, idx, link;
double fq,fp,fqq,ap,aq; // coefficient
/* second enforce custom rule for membrane links */
int S = count/NBLOCKS/NTHREADS + 1;
for (int s=0; s<S; s++){
//........Get 1-D index for this thread....................
idx = S*blockIdx.x*blockDim.x + s*blockDim.x + threadIdx.x;
if (idx < count){
n = d3q7_recvlist[idx];
// update link based on mass transfer coefficients
if (!(n < 0)){
aq = coef[2*idx];
ap = coef[2*idx+1];
fq = dist[q * N + n];
fp = recvbuf[idx];
fqq = (1-aq)*fq+ap*fp;
dist[q * N + n] = fqq;
}
}
}
}
__global__ void dvc_ScaLBL_D3Q7_Membrane_IonTransport(int *membrane, double *coef,
double *dist, double *Den, int memLinks, int Np){
int link,iq,ip,nq,np;
double aq, ap, fq, fp, fqq, fpp, Cq, Cp;
int S = memLinks/NBLOCKS/NTHREADS + 1;
for (int s=0; s<S; s++){
//........Get 1-D index for this thread....................
link = S*blockIdx.x*blockDim.x + s*blockDim.x + threadIdx.x;
if (link < memLinks){
// inside //outside
aq = coef[2*link]; ap = coef[2*link+1];
iq = membrane[2*link]; ip = membrane[2*link+1];
nq = iq%Np; np = ip%Np;
fq = dist[iq]; fp = dist[ip];
fqq = (1-aq)*fq+ap*fp; fpp = (1-ap)*fp+aq*fq;
Cq = Den[nq]; Cp = Den[np];
Cq += fqq - fq; Cp += fpp - fp;
Den[nq] = Cq; Den[np] = Cp;
dist[iq] = fqq; dist[ip] = fpp;
}
}
}
__global__ void dvc_ScaLBL_D3Q7_AAodd_IonConcentration(int *neighborList, double *dist, double *Den, int start, int finish, int Np){
int n,nread;
@@ -106,6 +306,7 @@ __global__ void dvc_ScaLBL_D3Q7_AAodd_Ion(int *neighborList, double *dist, doub
double Ex,Ey,Ez;//electrical field
double flux_diffusive_x,flux_diffusive_y,flux_diffusive_z;
double f0,f1,f2,f3,f4,f5,f6;
double X,Y,Z,factor_x,factor_y,factor_z;
int nr1,nr2,nr3,nr4,nr5,nr6;
int S = Np/NBLOCKS/NTHREADS + 1;
@@ -114,80 +315,96 @@ __global__ void dvc_ScaLBL_D3Q7_AAodd_Ion(int *neighborList, double *dist, doub
n = S*blockIdx.x*blockDim.x + s*blockDim.x + threadIdx.x + start;
if (n<finish) {
//Load data
Ci=Den[n];
Ex=ElectricField[n+0*Np];
Ey=ElectricField[n+1*Np];
Ez=ElectricField[n+2*Np];
ux=Velocity[n+0*Np];
uy=Velocity[n+1*Np];
uz=Velocity[n+2*Np];
uEPx=zi*Di/Vt*Ex;
uEPy=zi*Di/Vt*Ey;
uEPz=zi*Di/Vt*Ez;
//Load data
Ex = ElectricField[n + 0 * Np];
Ey = ElectricField[n + 1 * Np];
Ez = ElectricField[n + 2 * Np];
ux = Velocity[n + 0 * Np];
uy = Velocity[n + 1 * Np];
uz = Velocity[n + 2 * Np];
uEPx = zi * Di / Vt * Ex;
uEPy = zi * Di / Vt * Ey;
uEPz = zi * Di / Vt * Ez;
// q=0
f0 = dist[n];
// q=1
nr1 = neighborList[n]; // neighbor 2 ( > 10Np => odd part of dist)
f1 = dist[nr1]; // reading the f1 data into register fq
// q=2
nr2 = neighborList[n+Np]; // neighbor 1 ( < 10Np => even part of dist)
f2 = dist[nr2]; // reading the f2 data into register fq
// q=3
nr3 = neighborList[n+2*Np]; // neighbor 4
f3 = dist[nr3];
// q=4
nr4 = neighborList[n+3*Np]; // neighbor 3
f4 = dist[nr4];
// q=5
nr5 = neighborList[n+4*Np];
f5 = dist[nr5];
// q=6
nr6 = neighborList[n+5*Np];
f6 = dist[nr6];
// compute diffusive flux
flux_diffusive_x = (1.0-0.5*rlx)*((f1-f2)-ux*Ci);
flux_diffusive_y = (1.0-0.5*rlx)*((f3-f4)-uy*Ci);
flux_diffusive_z = (1.0-0.5*rlx)*((f5-f6)-uz*Ci);
FluxDiffusive[n+0*Np] = flux_diffusive_x;
FluxDiffusive[n+1*Np] = flux_diffusive_y;
FluxDiffusive[n+2*Np] = flux_diffusive_z;
FluxAdvective[n+0*Np] = ux*Ci;
FluxAdvective[n+1*Np] = uy*Ci;
FluxAdvective[n+2*Np] = uz*Ci;
FluxElectrical[n+0*Np] = uEPx*Ci;
FluxElectrical[n+1*Np] = uEPy*Ci;
FluxElectrical[n+2*Np] = uEPz*Ci;
// q=0
f0 = dist[n];
// q=1
nr1 = neighborList[n]; // neighbor 2 ( > 10Np => odd part of dist)
f1 = dist[nr1]; // reading the f1 data into register fq
// q=2
nr2 = neighborList[n + Np]; // neighbor 1 ( < 10Np => even part of dist)
f2 = dist[nr2]; // reading the f2 data into register fq
// q=3
nr3 = neighborList[n + 2 * Np]; // neighbor 4
f3 = dist[nr3];
// q=4
nr4 = neighborList[n + 3 * Np]; // neighbor 3
f4 = dist[nr4];
// q=5
nr5 = neighborList[n + 4 * Np];
f5 = dist[nr5];
// q=6
nr6 = neighborList[n + 5 * Np];
f6 = dist[nr6];
// q=0
dist[n] = f0*(1.0-rlx)+rlx*0.25*Ci;
//dist[n] = f0*(1.0-rlx)+rlx*0.25*Ci*(1.0 - 2.0*((ux+uEPx)*(ux+uEPx) + (uy+uEPy)*(uy+uEPy) + (uz+uEPz)*(uz+uEPz)));
// compute diffusive flux
Ci = f0 + f1 + f2 + f3 + f4 + f5 + f6;
flux_diffusive_x = (1.0 - 0.5 * rlx) * ((f1 - f2) - ux * Ci);
flux_diffusive_y = (1.0 - 0.5 * rlx) * ((f3 - f4) - uy * Ci);
flux_diffusive_z = (1.0 - 0.5 * rlx) * ((f5 - f6) - uz * Ci);
FluxDiffusive[n + 0 * Np] = flux_diffusive_x;
FluxDiffusive[n + 1 * Np] = flux_diffusive_y;
FluxDiffusive[n + 2 * Np] = flux_diffusive_z;
FluxAdvective[n + 0 * Np] = ux * Ci;
FluxAdvective[n + 1 * Np] = uy * Ci;
FluxAdvective[n + 2 * Np] = uz * Ci;
FluxElectrical[n + 0 * Np] = uEPx * Ci;
FluxElectrical[n + 1 * Np] = uEPy * Ci;
FluxElectrical[n + 2 * Np] = uEPz * Ci;
Den[n] = Ci;
// q = 1
dist[nr2] = f1*(1.0-rlx) + rlx*0.125*Ci*(1.0+4.0*(ux+uEPx));
//dist[nr2] = f1*(1.0-rlx) + rlx*0.125*Ci*(1.0+4.0*(ux+uEPx)+8.0*(ux+uEPx)*(ux+uEPx)- 2.0*((ux+uEPx)*(ux+uEPx) + (uy+uEPy)*(uy+uEPy) + (uz+uEPz)*(uz+uEPz)));
/* use a bounded sigmoid, X/sqrt(1+X*X), to prevent negative distributions */
X = 4.0 * (ux + uEPx);
Y = 4.0 * (uy + uEPy);
Z = 4.0 * (uz + uEPz);
factor_x = X / sqrt(1 + X*X);
factor_y = Y / sqrt(1 + Y*Y);
factor_z = Z / sqrt(1 + Z*Z);
// q=2
dist[nr1] = f2*(1.0-rlx) + rlx*0.125*Ci*(1.0-4.0*(ux+uEPx));
//dist[nr1] = f2*(1.0-rlx) + rlx*0.125*Ci*(1.0-4.0*(ux+uEPx)+8.0*(ux+uEPx)*(ux+uEPx)- 2.0*((ux+uEPx)*(ux+uEPx) + (uy+uEPy)*(uy+uEPy) + (uz+uEPz)*(uz+uEPz)));
// q=0
dist[n] = f0 * (1.0 - rlx) + rlx * 0.25 * Ci;
// q = 3
dist[nr4] = f3*(1.0-rlx) + rlx*0.125*Ci*(1.0+4.0*(uy+uEPy));
//dist[nr4] = f3*(1.0-rlx) + rlx*0.125*Ci*(1.0+4.0*(uy+uEPy)+8.0*(uy+uEPy)*(uy+uEPy)- 2.0*((ux+uEPx)*(ux+uEPx) + (uy+uEPy)*(uy+uEPy) + (uz+uEPz)*(uz+uEPz)));
// q = 1
dist[nr2] =
f1 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + factor_x);
//f1 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (ux + uEPx));
// q = 4
dist[nr3] = f4*(1.0-rlx) + rlx*0.125*Ci*(1.0-4.0*(uy+uEPy));
//dist[nr3] = f4*(1.0-rlx) + rlx*0.125*Ci*(1.0-4.0*(uy+uEPy)+8.0*(uy+uEPy)*(uy+uEPy)- 2.0*((ux+uEPx)*(ux+uEPx) + (uy+uEPy)*(uy+uEPy) + (uz+uEPz)*(uz+uEPz)));
// q = 5
dist[nr6] = f5*(1.0-rlx) + rlx*0.125*Ci*(1.0+4.0*(uz+uEPz));
//dist[nr6] = f5*(1.0-rlx) + rlx*0.125*Ci*(1.0+4.0*(uz+uEPz)+8.0*(uz+uEPz)*(uz+uEPz)- 2.0*((ux+uEPx)*(ux+uEPx) + (uy+uEPy)*(uy+uEPy) + (uz+uEPz)*(uz+uEPz)));
// q=2
dist[nr1] =
f2 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - factor_x);
//f2 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (ux + uEPx));
// q = 3
dist[nr4] =
f3 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + factor_y );
//f3 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (uy + uEPy));
// q = 4
dist[nr3] =
f4 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - factor_y);
//f4 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (uy + uEPy));
// q = 5
dist[nr6] =
f5 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + factor_z);
//f5 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (uz + uEPz));
// q = 6
dist[nr5] =
f6 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - factor_z);
// q = 6
dist[nr5] = f6*(1.0-rlx) + rlx*0.125*Ci*(1.0-4.0*(uz+uEPz));
//dist[nr5] = f6*(1.0-rlx) + rlx*0.125*Ci*(1.0-4.0*(uz+uEPz)+8.0*(uz+uEPz)*(uz+uEPz)- 2.0*((ux+uEPx)*(ux+uEPx) + (uy+uEPy)*(uy+uEPy) + (uz+uEPz)*(uz+uEPz)));
}
}
}
@@ -201,6 +418,7 @@ __global__ void dvc_ScaLBL_D3Q7_AAeven_Ion(double *dist, double *Den, double *F
double Ex,Ey,Ez;//electrical field
double flux_diffusive_x,flux_diffusive_y,flux_diffusive_z;
double f0,f1,f2,f3,f4,f5,f6;
double X,Y,Z,factor_x,factor_y,factor_z;
int S = Np/NBLOCKS/NTHREADS + 1;
for (int s=0; s<S; s++){
@@ -208,67 +426,83 @@ __global__ void dvc_ScaLBL_D3Q7_AAeven_Ion(double *dist, double *Den, double *F
n = S*blockIdx.x*blockDim.x + s*blockDim.x + threadIdx.x + start;
if (n<finish) {
//Load data
Ci=Den[n];
Ex=ElectricField[n+0*Np];
Ey=ElectricField[n+1*Np];
Ez=ElectricField[n+2*Np];
ux=Velocity[n+0*Np];
uy=Velocity[n+1*Np];
uz=Velocity[n+2*Np];
uEPx=zi*Di/Vt*Ex;
uEPy=zi*Di/Vt*Ey;
uEPz=zi*Di/Vt*Ez;
//Load data
//Ci = Den[n];
Ex = ElectricField[n + 0 * Np];
Ey = ElectricField[n + 1 * Np];
Ez = ElectricField[n + 2 * Np];
ux = Velocity[n + 0 * Np];
uy = Velocity[n + 1 * Np];
uz = Velocity[n + 2 * Np];
uEPx = zi * Di / Vt * Ex;
uEPy = zi * Di / Vt * Ey;
uEPz = zi * Di / Vt * Ez;
f0 = dist[n];
f1 = dist[2*Np+n];
f2 = dist[1*Np+n];
f3 = dist[4*Np+n];
f4 = dist[3*Np+n];
f5 = dist[6*Np+n];
f6 = dist[5*Np+n];
// compute diffusive flux
flux_diffusive_x = (1.0-0.5*rlx)*((f1-f2)-ux*Ci);
flux_diffusive_y = (1.0-0.5*rlx)*((f3-f4)-uy*Ci);
flux_diffusive_z = (1.0-0.5*rlx)*((f5-f6)-uz*Ci);
FluxDiffusive[n+0*Np] = flux_diffusive_x;
FluxDiffusive[n+1*Np] = flux_diffusive_y;
FluxDiffusive[n+2*Np] = flux_diffusive_z;
FluxAdvective[n+0*Np] = ux*Ci;
FluxAdvective[n+1*Np] = uy*Ci;
FluxAdvective[n+2*Np] = uz*Ci;
FluxElectrical[n+0*Np] = uEPx*Ci;
FluxElectrical[n+1*Np] = uEPy*Ci;
FluxElectrical[n+2*Np] = uEPz*Ci;
f0 = dist[n];
f1 = dist[2 * Np + n];
f2 = dist[1 * Np + n];
f3 = dist[4 * Np + n];
f4 = dist[3 * Np + n];
f5 = dist[6 * Np + n];
f6 = dist[5 * Np + n];
// q=0
dist[n] = f0*(1.0-rlx)+rlx*0.25*Ci;
//dist[n] = f0*(1.0-rlx)+rlx*0.25*Ci*(1.0 - 2.0*((ux+uEPx)*(ux+uEPx) + (uy+uEPy)*(uy+uEPy) + (uz+uEPz)*(uz+uEPz)));
// compute diffusive flux
Ci = f0 + f1 + f2 + f3 + f4 + f5 + f6;
flux_diffusive_x = (1.0 - 0.5 * rlx) * ((f1 - f2) - ux * Ci);
flux_diffusive_y = (1.0 - 0.5 * rlx) * ((f3 - f4) - uy * Ci);
flux_diffusive_z = (1.0 - 0.5 * rlx) * ((f5 - f6) - uz * Ci);
FluxDiffusive[n + 0 * Np] = flux_diffusive_x;
FluxDiffusive[n + 1 * Np] = flux_diffusive_y;
FluxDiffusive[n + 2 * Np] = flux_diffusive_z;
FluxAdvective[n + 0 * Np] = ux * Ci;
FluxAdvective[n + 1 * Np] = uy * Ci;
FluxAdvective[n + 2 * Np] = uz * Ci;
FluxElectrical[n + 0 * Np] = uEPx * Ci;
FluxElectrical[n + 1 * Np] = uEPy * Ci;
FluxElectrical[n + 2 * Np] = uEPz * Ci;
Den[n] = Ci;
/* use a bounded sigmoid, X/sqrt(1+X*X), to prevent negative distributions */
X = 4.0 * (ux + uEPx);
Y = 4.0 * (uy + uEPy);
Z = 4.0 * (uz + uEPz);
factor_x = X / sqrt(1 + X*X);
factor_y = Y / sqrt(1 + Y*Y);
factor_z = Z / sqrt(1 + Z*Z);
// q = 1
dist[1*Np+n] = f1*(1.0-rlx) + rlx*0.125*Ci*(1.0+4.0*(ux+uEPx));
//dist[1*Np+n] = f1*(1.0-rlx) + rlx*0.125*Ci*(1.0+4.0*(ux+uEPx)+8.0*(ux+uEPx)*(ux+uEPx)- 2.0*((ux+uEPx)*(ux+uEPx) + (uy+uEPy)*(uy+uEPy) + (uz+uEPz)*(uz+uEPz)));
// q=0
dist[n] = f0 * (1.0 - rlx) + rlx * 0.25 * Ci;
// q=2
dist[2*Np+n] = f2*(1.0-rlx) + rlx*0.125*Ci*(1.0-4.0*(ux+uEPx));
//dist[2*Np+n] = f2*(1.0-rlx) + rlx*0.125*Ci*(1.0-4.0*(ux+uEPx)+8.0*(ux+uEPx)*(ux+uEPx)- 2.0*((ux+uEPx)*(ux+uEPx) + (uy+uEPy)*(uy+uEPy) + (uz+uEPz)*(uz+uEPz)));
// q = 1
dist[1 * Np + n] =
f1 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + factor_x);
//f1 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (ux + uEPx));
// q = 3
dist[3*Np+n] = f3*(1.0-rlx) + rlx*0.125*Ci*(1.0+4.0*(uy+uEPy));
//dist[3*Np+n] = f3*(1.0-rlx) + rlx*0.125*Ci*(1.0+4.0*(uy+uEPy)+8.0*(uy+uEPy)*(uy+uEPy)- 2.0*((ux+uEPx)*(ux+uEPx) + (uy+uEPy)*(uy+uEPy) + (uz+uEPz)*(uz+uEPz)));
// q=2
dist[2 * Np + n] =
f2 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - factor_x);
//f2 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (ux + uEPx));
// q = 4
dist[4*Np+n] = f4*(1.0-rlx) + rlx*0.125*Ci*(1.0-4.0*(uy+uEPy));
//dist[4*Np+n] = f4*(1.0-rlx) + rlx*0.125*Ci*(1.0-4.0*(uy+uEPy)+8.0*(uy+uEPy)*(uy+uEPy)- 2.0*((ux+uEPx)*(ux+uEPx) + (uy+uEPy)*(uy+uEPy) + (uz+uEPz)*(uz+uEPz)));
// q = 3
dist[3 * Np + n] =
f3 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + factor_y);
//f3 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (uy + uEPy));
// q = 5
dist[5*Np+n] = f5*(1.0-rlx) + rlx*0.125*Ci*(1.0+4.0*(uz+uEPz));
//dist[5*Np+n] = f5*(1.0-rlx) + rlx*0.125*Ci*(1.0+4.0*(uz+uEPz)+8.0*(uz+uEPz)*(uz+uEPz)- 2.0*((ux+uEPx)*(ux+uEPx) + (uy+uEPy)*(uy+uEPy) + (uz+uEPz)*(uz+uEPz)));
// q = 4
dist[4 * Np + n] =
f4 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - factor_y);
//f4 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (uy + uEPy));
// q = 6
dist[6*Np+n] = f6*(1.0-rlx) + rlx*0.125*Ci*(1.0-4.0*(uz+uEPz));
//dist[6*Np+n] = f6*(1.0-rlx) + rlx*0.125*Ci*(1.0-4.0*(uz+uEPz)+8.0*(uz+uEPz)*(uz+uEPz)- 2.0*((ux+uEPx)*(ux+uEPx) + (uy+uEPy)*(uy+uEPy) + (uz+uEPz)*(uz+uEPz)));
// q = 5
dist[5 * Np + n] =
f5 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + factor_z);
//f5 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (uz + uEPz));
// q = 6
dist[6 * Np + n] =
f6 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - factor_z);
//f6 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (uz + uEPz));
}
}
}
@@ -314,7 +548,7 @@ __global__ void dvc_ScaLBL_D3Q7_Ion_Init_FromFile(double *dist, double *Den, in
}
}
__global__ void dvc_ScaLBL_D3Q7_Ion_ChargeDensity(double *Den, double *ChargeDensity, int IonValence, int ion_component, int start, int finish, int Np){
__global__ void dvc_ScaLBL_D3Q7_Ion_ChargeDensity(double *Den, double *ChargeDensity, double IonValence, int ion_component, int start, int finish, int Np){
int n;
double Ci;//ion concentration of species i
@@ -327,13 +561,278 @@ __global__ void dvc_ScaLBL_D3Q7_Ion_ChargeDensity(double *Den, double *ChargeDe
//........Get 1-D index for this thread....................
n = S*blockIdx.x*blockDim.x + s*blockDim.x + threadIdx.x + start;
if (n<finish) {
Ci = Den[n+ion_component*Np];
CD = ChargeDensity[n];
if (ion_component == 0) CD=0.0;
CD_tmp = F*IonValence*Ci;
ChargeDensity[n] = CD*(ion_component>0) + CD_tmp;
ChargeDensity[n] = CD + CD_tmp;
// Ci = Den[n+ion_component*Np];
// CD = ChargeDensity[n];
// CD_tmp = F*IonValence*Ci;
// ChargeDensity[n] = CD*(ion_component>0) + CD_tmp;
}
}
}
__global__ void dvc_ScaLBL_D3Q7_AAodd_Ion_v0(int *neighborList, double *dist,
double *Den, double *FluxDiffusive,
double *FluxAdvective,
double *FluxElectrical, double *Velocity,
double *ElectricField, double Di, int zi,
double rlx, double Vt, int start,
int finish, int Np) {
int n;
double Ci;
double ux, uy, uz;
double uEPx, uEPy, uEPz; //electrochemically induced velocity
double Ex, Ey, Ez; //electrical field
double flux_diffusive_x, flux_diffusive_y, flux_diffusive_z;
double f0, f1, f2, f3, f4, f5, f6;
//double X,Y,Z,factor_x, factor_y, factor_z;
int nr1, nr2, nr3, nr4, nr5, nr6;
int S = Np/NBLOCKS/NTHREADS + 1;
for (int s=0; s<S; s++){
//........Get 1-D index for this thread....................
n = S*blockIdx.x*blockDim.x + s*blockDim.x + threadIdx.x + start;
if (n<finish) {
//Load data
Ci = Den[n];
Ex = ElectricField[n + 0 * Np];
Ey = ElectricField[n + 1 * Np];
Ez = ElectricField[n + 2 * Np];
ux = Velocity[n + 0 * Np];
uy = Velocity[n + 1 * Np];
uz = Velocity[n + 2 * Np];
uEPx = zi * Di / Vt * Ex;
uEPy = zi * Di / Vt * Ey;
uEPz = zi * Di / Vt * Ez;
// q=0
f0 = dist[n];
// q=1
nr1 = neighborList[n]; // neighbor 2 ( > 10Np => odd part of dist)
f1 = dist[nr1]; // reading the f1 data into register fq
// q=2
nr2 = neighborList[n + Np]; // neighbor 1 ( < 10Np => even part of dist)
f2 = dist[nr2]; // reading the f2 data into register fq
// q=3
nr3 = neighborList[n + 2 * Np]; // neighbor 4
f3 = dist[nr3];
// q=4
nr4 = neighborList[n + 3 * Np]; // neighbor 3
f4 = dist[nr4];
// q=5
nr5 = neighborList[n + 4 * Np];
f5 = dist[nr5];
// q=6
nr6 = neighborList[n + 5 * Np];
f6 = dist[nr6];
// compute diffusive flux
//Ci = f0 + f1 + f2 + f3 + f4 + f5 + f6;
flux_diffusive_x = (1.0 - 0.5 * rlx) * ((f1 - f2) - ux * Ci);
flux_diffusive_y = (1.0 - 0.5 * rlx) * ((f3 - f4) - uy * Ci);
flux_diffusive_z = (1.0 - 0.5 * rlx) * ((f5 - f6) - uz * Ci);
FluxDiffusive[n + 0 * Np] = flux_diffusive_x;
FluxDiffusive[n + 1 * Np] = flux_diffusive_y;
FluxDiffusive[n + 2 * Np] = flux_diffusive_z;
FluxAdvective[n + 0 * Np] = ux * Ci;
FluxAdvective[n + 1 * Np] = uy * Ci;
FluxAdvective[n + 2 * Np] = uz * Ci;
FluxElectrical[n + 0 * Np] = uEPx * Ci;
FluxElectrical[n + 1 * Np] = uEPy * Ci;
FluxElectrical[n + 2 * Np] = uEPz * Ci;
//Den[n] = Ci;
/* use a bounded sigmoid, X/sqrt(1+X*X), to prevent negative distributions */
//X = 4.0 * (ux + uEPx);
//Y = 4.0 * (uy + uEPy);
//Z = 4.0 * (uz + uEPz);
//factor_x = X / sqrt(1 + X*X);
//factor_y = Y / sqrt(1 + Y*Y);
//factor_z = Z / sqrt(1 + Z*Z);
// q=0
dist[n] = f0 * (1.0 - rlx) + rlx * 0.25 * Ci;
// q = 1
dist[nr2] =
f1 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (ux + uEPx));
// f1 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + factor_x);
// q=2
dist[nr1] =
f2 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (ux + uEPx));
// f2 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - factor_x);
// q = 3
dist[nr4] =
f3 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (uy + uEPy));
// f3 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + factor_y );
// q = 4
dist[nr3] =
f4 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (uy + uEPy));
// f4 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - factor_y);
// q = 5
dist[nr6] =
f5 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (uz + uEPz));
// f5 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + factor_z);
// q = 6
dist[nr5] =
f6 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (uz + uEPz));
// f6 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - factor_z);
}
}
}
__global__ void dvc_ScaLBL_D3Q7_AAeven_Ion_v0(
double *dist, double *Den, double *FluxDiffusive, double *FluxAdvective,
double *FluxElectrical, double *Velocity, double *ElectricField, double Di,
int zi, double rlx, double Vt, int start, int finish, int Np) {
int n;
double Ci;
double ux, uy, uz;
double uEPx, uEPy, uEPz; //electrochemically induced velocity
double Ex, Ey, Ez; //electrical field
double flux_diffusive_x, flux_diffusive_y, flux_diffusive_z;
double f0, f1, f2, f3, f4, f5, f6;
//double X,Y,Z, factor_x, factor_y, factor_z;
int S = Np/NBLOCKS/NTHREADS + 1;
for (int s=0; s<S; s++){
//........Get 1-D index for this thread....................
n = S*blockIdx.x*blockDim.x + s*blockDim.x + threadIdx.x + start;
if (n<finish) {
//Load data
Ci = Den[n];
Ex = ElectricField[n + 0 * Np];
Ey = ElectricField[n + 1 * Np];
Ez = ElectricField[n + 2 * Np];
ux = Velocity[n + 0 * Np];
uy = Velocity[n + 1 * Np];
uz = Velocity[n + 2 * Np];
uEPx = zi * Di / Vt * Ex;
uEPy = zi * Di / Vt * Ey;
uEPz = zi * Di / Vt * Ez;
f0 = dist[n];
f1 = dist[2 * Np + n];
f2 = dist[1 * Np + n];
f3 = dist[4 * Np + n];
f4 = dist[3 * Np + n];
f5 = dist[6 * Np + n];
f6 = dist[5 * Np + n];
// compute diffusive flux
//Ci = f0 + f1 + f2 + f3 + f4 + f5 + f6;
flux_diffusive_x = (1.0 - 0.5 * rlx) * ((f1 - f2) - ux * Ci);
flux_diffusive_y = (1.0 - 0.5 * rlx) * ((f3 - f4) - uy * Ci);
flux_diffusive_z = (1.0 - 0.5 * rlx) * ((f5 - f6) - uz * Ci);
FluxDiffusive[n + 0 * Np] = flux_diffusive_x;
FluxDiffusive[n + 1 * Np] = flux_diffusive_y;
FluxDiffusive[n + 2 * Np] = flux_diffusive_z;
FluxAdvective[n + 0 * Np] = ux * Ci;
FluxAdvective[n + 1 * Np] = uy * Ci;
FluxAdvective[n + 2 * Np] = uz * Ci;
FluxElectrical[n + 0 * Np] = uEPx * Ci;
FluxElectrical[n + 1 * Np] = uEPy * Ci;
FluxElectrical[n + 2 * Np] = uEPz * Ci;
//Den[n] = Ci;
/* use a bounded sigmoid, X/sqrt(1+X*X), to prevent negative distributions */
//X = 4.0 * (ux + uEPx);
//Y = 4.0 * (uy + uEPy);
//Z = 4.0 * (uz + uEPz);
//factor_x = X / sqrt(1 + X*X);
//factor_y = Y / sqrt(1 + Y*Y);
//factor_z = Z / sqrt(1 + Z*Z);
// q=0
dist[n] = f0 * (1.0 - rlx) + rlx * 0.25 * Ci;
// q = 1
dist[1 * Np + n] =
f1 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (ux + uEPx));
// f1 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + factor_x);
// q=2
dist[2 * Np + n] =
f2 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (ux + uEPx));
// f2 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - factor_x);
// q = 3
dist[3 * Np + n] =
f3 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (uy + uEPy));
// f3 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + factor_y);
// q = 4
dist[4 * Np + n] =
f4 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (uy + uEPy));
// f4 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - factor_y);
// q = 5
dist[5 * Np + n] =
f5 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (uz + uEPz));
// f5 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + factor_z);
// q = 6
dist[6 * Np + n] =
f6 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (uz + uEPz));
// f6 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - factor_z);
}
}
}
extern "C" void ScaLBL_D3Q7_AAeven_Ion_v0(
double *dist, double *Den, double *FluxDiffusive, double *FluxAdvective,
double *FluxElectrical, double *Velocity, double *ElectricField, double Di,
int zi, double rlx, double Vt, int start, int finish, int Np) {
dvc_ScaLBL_D3Q7_AAeven_Ion_v0<<<NBLOCKS,NTHREADS >>>(dist,
Den, FluxDiffusive, FluxAdvective,
FluxElectrical, Velocity,
ElectricField, Di, zi,
rlx, Vt, start, finish, Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("cuda error in dvc_ScaLBL_D3Q7_AAeven_Ion_v0: %s \n",cudaGetErrorString(err));
}
}
extern "C" void ScaLBL_D3Q7_AAodd_Ion_v0(int *neighborList, double *dist,
double *Den, double *FluxDiffusive,
double *FluxAdvective,
double *FluxElectrical, double *Velocity,
double *ElectricField, double Di, int zi,
double rlx, double Vt, int start,
int finish, int Np) {
dvc_ScaLBL_D3Q7_AAodd_Ion_v0<<<NBLOCKS,NTHREADS >>>(neighborList, dist,
Den, FluxDiffusive, FluxAdvective,
FluxElectrical, Velocity,
ElectricField, Di, zi,
rlx, Vt, start,
finish, Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("cuda error in dvc_ScaLBL_D3Q7_AAodd_Ion_v0: %s \n",cudaGetErrorString(err));
}
}
extern "C" void ScaLBL_D3Q7_AAodd_IonConcentration(int *neighborList, double *dist, double *Den, int start, int finish, int Np){
@@ -408,7 +907,7 @@ extern "C" void ScaLBL_D3Q7_Ion_Init_FromFile(double *dist, double *Den, int Np)
//cudaProfilerStop();
}
extern "C" void ScaLBL_D3Q7_Ion_ChargeDensity(double *Den, double *ChargeDensity, int IonValence, int ion_component, int start, int finish, int Np){
extern "C" void ScaLBL_D3Q7_Ion_ChargeDensity(double *Den, double *ChargeDensity, double IonValence, int ion_component, int start, int finish, int Np){
//cudaProfilerStart();
dvc_ScaLBL_D3Q7_Ion_ChargeDensity<<<NBLOCKS,NTHREADS >>>(Den,ChargeDensity,IonValence,ion_component,start,finish,Np);
@@ -419,3 +918,65 @@ extern "C" void ScaLBL_D3Q7_Ion_ChargeDensity(double *Den, double *ChargeDensity
}
//cudaProfilerStop();
}
extern "C" void ScaLBL_D3Q7_Membrane_AssignLinkCoef(int *membrane, int *Map, double *Distance, double *Psi, double *coef,
double Threshold, double MassFractionIn, double MassFractionOut, double ThresholdMassFractionIn, double ThresholdMassFractionOut,
int memLinks, int Nx, int Ny, int Nz, int Np){
dvc_ScaLBL_D3Q7_Membrane_AssignLinkCoef<<<NBLOCKS,NTHREADS >>>(membrane, Map, Distance, Psi, coef,
Threshold, MassFractionIn, MassFractionOut, ThresholdMassFractionIn, ThresholdMassFractionOut,
memLinks, Nx, Ny, Nz, Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in dvc_ScaLBL_D3Q7_Membrane_AssignLinkCoef: %s \n",cudaGetErrorString(err));
}
}
extern "C" void ScaLBL_D3Q7_Membrane_AssignLinkCoef_halo(
const int Cqx, const int Cqy, int const Cqz,
int *Map, double *Distance, double *Psi, double Threshold,
double MassFractionIn, double MassFractionOut, double ThresholdMassFractionIn, double ThresholdMassFractionOut,
int *d3q7_recvlist, int *d3q7_linkList, double *coef, int start, int nlinks, int count,
const int N, const int Nx, const int Ny, const int Nz) {
int GRID = count / NTHREADS + 1;
dvc_ScaLBL_D3Q7_Membrane_AssignLinkCoef_halo<<<GRID,NTHREADS >>>(
Cqx, Cqy, Cqz, Map, Distance, Psi, Threshold,
MassFractionIn, MassFractionOut, ThresholdMassFractionIn, ThresholdMassFractionOut,
d3q7_recvlist, d3q7_linkList, coef, start, nlinks, count, N, Nx, Ny, Nz);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in dvc_ScaLBL_D3Q7_Membrane_AssignLinkCoef_halo: %s \n",cudaGetErrorString(err));
}
}
extern "C" void ScaLBL_D3Q7_Membrane_Unpack(int q,
int *d3q7_recvlist, double *recvbuf, int count,
double *dist, int N, double *coef){
int GRID = count / NTHREADS + 1;
dvc_ScaLBL_D3Q7_Membrane_Unpack<<<GRID,NTHREADS >>>(q, d3q7_recvlist, recvbuf,count,
dist, N, coef);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in dvc_ScaLBL_D3Q7_Membrane_Unpack: %s \n",cudaGetErrorString(err));
}
}
extern "C" void ScaLBL_D3Q7_Membrane_IonTransport(int *membrane, double *coef,
double *dist, double *Den, int memLinks, int Np){
dvc_ScaLBL_D3Q7_Membrane_IonTransport<<<NBLOCKS,NTHREADS >>>(membrane, coef, dist, Den, memLinks, Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in dvc_ScaLBL_D3Q7_Membrane_IonTransport: %s \n",cudaGetErrorString(err));
}
}


@@ -4,8 +4,8 @@
//*************************************************************************
#include <cuda.h>
#define NBLOCKS 560
#define NTHREADS 128
#define NBLOCKS 1024
#define NTHREADS 512
__global__ void INITIALIZE(char *ID, double *f_even, double *f_odd, int Nx, int Ny, int Nz)
{


@@ -3,7 +3,7 @@
//#include <cuda_profiler_api.h>
#define NBLOCKS 1024
#define NTHREADS 256
#define NTHREADS 512
__global__ void dvc_ScaLBL_D3Q7_AAodd_Poisson_ElectricPotential(int *neighborList,int *Map, double *dist, double *Psi, int start, int finish, int Np){
int n;
@@ -104,7 +104,7 @@ __global__ void dvc_ScaLBL_D3Q7_AAeven_Poisson_ElectricPotential(int *Map, doub
}
}
__global__ void dvc_ScaLBL_D3Q7_AAodd_Poisson(int *neighborList, int *Map, double *dist, double *Den_charge, double *Psi, double *ElectricField, double tau, double epsilon_LB,int start, int finish, int Np){
__global__ void dvc_ScaLBL_D3Q7_AAodd_Poisson(int *neighborList, int *Map, double *dist, double *Den_charge, double *Psi, double *ElectricField, double tau, double epsilon_LB,bool UseSlippingVelBC,int start, int finish, int Np){
int n;
double psi;//electric potential
@@ -122,8 +122,9 @@ __global__ void dvc_ScaLBL_D3Q7_AAodd_Poisson(int *neighborList, int *Map, doub
if (n<finish) {
//Load data
rho_e = Den_charge[n];
rho_e = rho_e/epsilon_LB;
//When Helmholtz-Smoluchowski slipping velocity BC is used, the bulk fluid is considered as electroneutral
//and thus the net space charge density is zero.
rho_e = (UseSlippingVelBC==1) ? 0.0 : Den_charge[n] / epsilon_LB;
idx=Map[n];
psi = Psi[idx];
@@ -184,7 +185,7 @@ __global__ void dvc_ScaLBL_D3Q7_AAodd_Poisson(int *neighborList, int *Map, doub
}
}
__global__ void dvc_ScaLBL_D3Q7_AAeven_Poisson(int *Map, double *dist, double *Den_charge, double *Psi, double *ElectricField, double tau, double epsilon_LB,int start, int finish, int Np){
__global__ void dvc_ScaLBL_D3Q7_AAeven_Poisson(int *Map, double *dist, double *Den_charge, double *Psi, double *ElectricField, double tau, double epsilon_LB,bool UseSlippingVelBC,int start, int finish, int Np){
int n;
double psi;//electric potential
@@ -201,8 +202,9 @@ __global__ void dvc_ScaLBL_D3Q7_AAeven_Poisson(int *Map, double *dist, double *
if (n<finish) {
//Load data
rho_e = Den_charge[n];
rho_e = rho_e/epsilon_LB;
//When Helmholtz-Smoluchowski slipping velocity BC is used, the bulk fluid is considered as electroneutral
//and thus the net space charge density is zero.
rho_e = (UseSlippingVelBC==1) ? 0.0 : Den_charge[n] / epsilon_LB;
idx=Map[n];
psi = Psi[idx];
@@ -269,6 +271,545 @@ __global__ void dvc_ScaLBL_D3Q7_Poisson_Init(int *Map, double *dist, double *Ps
}
}
__global__ void dvc_ScaLBL_D3Q19_AAeven_Poisson_ElectricPotential(
int *Map, double *dist, double *Den_charge, double *Psi, double epsilon_LB, bool UseSlippingVelBC, int start, int finish, int Np) {
int n;
double psi,sum; //electric potential
double rho_e; //local charge density
double f0, f1, f2, f3, f4, f5, f6, f7, f8, f9, f10, f11, f12, f13, f14, f15,
f16, f17, f18;
double Gs;
int idx;
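// grid-stride loop: each thread handles a distinct subset of the sites in [start, finish)
for (n = blockIdx.x*blockDim.x + threadIdx.x + start; n < finish; n += blockDim.x*gridDim.x) {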
rho_e = (UseSlippingVelBC==1) ? 0.0 : Den_charge[n] / epsilon_LB;
//........................................................................
// q=0
f0 = dist[n];
f1 = dist[2 * Np + n];
f2 = dist[1 * Np + n];
f3 = dist[4 * Np + n];
f4 = dist[3 * Np + n];
f5 = dist[6 * Np + n];
f6 = dist[5 * Np + n];
f7 = dist[8 * Np + n];
f8 = dist[7 * Np + n];
f9 = dist[10 * Np + n];
f10 = dist[9 * Np + n];
f11 = dist[12 * Np + n];
f12 = dist[11 * Np + n];
f13 = dist[14 * Np + n];
f14 = dist[13 * Np + n];
f15 = dist[16 * Np + n];
f16 = dist[15 * Np + n];
f17 = dist[18 * Np + n];
f18 = dist[17 * Np + n];
psi = f0 + f2 + f1 + f4 + f3 + f6 + f5 + f8 + f7 + f10 + f9 + f12 +
f11 + f14 + f13 + f16 + f15 + f18 + f17;
idx = Map[n];
Psi[idx] = psi - 0.5*rho_e;
}
}
__global__ void dvc_ScaLBL_D3Q19_AAodd_Poisson(int *neighborList, int *Map,
double *dist, double *Den_charge,
double *Psi, double *ElectricField,
double tau, double epsilon_LB, bool UseSlippingVelBC,
int start, int finish, int Np) {
int n;
double psi; //electric potential
double Ex, Ey, Ez; //electric field
double rho_e; //local charge density
double f0, f1, f2, f3, f4, f5, f6, f7, f8, f9, f10, f11, f12, f13, f14, f15,
f16, f17, f18;
int nr1, nr2, nr3, nr4, nr5, nr6, nr7, nr8, nr9, nr10, nr11, nr12, nr13,
nr14, nr15, nr16, nr17, nr18;
double sum_q;
double rlx = 1.0 / tau;
int idx;
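// D3Q19 weights for the Poisson solver (rest, 6 face neighbors, 12 edge neighbors); W0 + 6*W1 + 12*W2 = 1
double W0 = 0.5;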
double W1 = 1.0/24.0;
double W2 = 1.0/48.0;
int S = Np/NBLOCKS/NTHREADS + 1;
for (int s=0; s<S; s++){
//........Get 1-D index for this thread....................
n = S*blockIdx.x*blockDim.x + s*blockDim.x + threadIdx.x + start;
if (n<finish) {
//Load data
//When Helmholtz-Smoluchowski slipping velocity BC is used, the bulk fluid is considered as electroneutral
//and thus the net space charge density is zero.
rho_e = (UseSlippingVelBC==1) ? 0.0 : Den_charge[n] / epsilon_LB;
// q=0
f0 = dist[n];
// q=1
nr1 = neighborList[n]; // neighbor 2 ( > 10Np => odd part of dist)
f1 = dist[nr1]; // reading the f1 data into register fq
nr2 = neighborList[n + Np]; // neighbor 1 ( < 10Np => even part of dist)
f2 = dist[nr2]; // reading the f2 data into register fq
// q=3
nr3 = neighborList[n + 2 * Np]; // neighbor 4
f3 = dist[nr3];
// q = 4
nr4 = neighborList[n + 3 * Np]; // neighbor 3
f4 = dist[nr4];
// q=5
nr5 = neighborList[n + 4 * Np];
f5 = dist[nr5];
// q = 6
nr6 = neighborList[n + 5 * Np];
f6 = dist[nr6];
// q=7
nr7 = neighborList[n + 6 * Np];
f7 = dist[nr7];
// q = 8
nr8 = neighborList[n + 7 * Np];
f8 = dist[nr8];
// q=9
nr9 = neighborList[n + 8 * Np];
f9 = dist[nr9];
// q = 10
nr10 = neighborList[n + 9 * Np];
f10 = dist[nr10];
// q=11
nr11 = neighborList[n + 10 * Np];
f11 = dist[nr11];
// q=12
nr12 = neighborList[n + 11 * Np];
f12 = dist[nr12];
// q=13
nr13 = neighborList[n + 12 * Np];
f13 = dist[nr13];
// q=14
nr14 = neighborList[n + 13 * Np];
f14 = dist[nr14];
// q=15
nr15 = neighborList[n + 14 * Np];
f15 = dist[nr15];
// q=16
nr16 = neighborList[n + 15 * Np];
f16 = dist[nr16];
// q=17
//fq = dist[18*Np+n];
nr17 = neighborList[n + 16 * Np];
f17 = dist[nr17];
// q=18
nr18 = neighborList[n + 17 * Np];
f18 = dist[nr18];
sum_q = f1+f2+f3+f4+f5+f6+f7+f8+f9+f10+f11+f12+f13+f14+f15+f16+f17+f18;
//error = 8.0*(sum_q - f0) + rho_e;
psi = 2.0*(f0*(1.0 - rlx) + rlx*(sum_q + 0.125*rho_e));
idx = Map[n];
Psi[idx] = psi;
Ex = (f1 - f2 + 0.5*(f7 - f8 + f9 - f10 + f11 - f12 + f13 - f14))*4.0; //NOTE the unit of electric field here is V/lu
Ey = (f3 - f4 + 0.5*(f7 - f8 - f9 + f10 + f15 - f16 + f17 - f18))*4.0;
Ez = (f5 - f6 + 0.5*(f11 - f12 - f13 + f14 + f15 - f16 - f17 + f18))*4.0;
ElectricField[n + 0 * Np] = Ex;
ElectricField[n + 1 * Np] = Ey;
ElectricField[n + 2 * Np] = Ez;
// q = 0
dist[n] = W0*psi; //f0 * (1.0 - rlx) - (1.0-0.5*rlx)*W0*rho_e;
// q = 1
dist[nr2] = W1*psi; //f1 * (1.0 - rlx) +W1* (rlx * psi) - (1.0-0.5*rlx)*0.05555555555555555*rho_e;
// q = 2
dist[nr1] = W1*psi; //f2 * (1.0 - rlx) +W1* (rlx * psi) - (1.0-0.5*rlx)*0.05555555555555555*rho_e;
// q = 3
dist[nr4] = W1*psi; //f3 * (1.0 - rlx) +W1* (rlx * psi) - (1.0-0.5*rlx)*0.05555555555555555*rho_e;
// q = 4
dist[nr3] = W1*psi; //f4 * (1.0 - rlx) +W1* (rlx * psi) - (1.0-0.5*rlx)*0.05555555555555555*rho_e;
// q = 5
dist[nr6] = W1*psi; //f5 * (1.0 - rlx) +W1* (rlx * psi) - (1.0-0.5*rlx)*0.05555555555555555*rho_e;
// q = 6
dist[nr5] = W1*psi; //f6 * (1.0 - rlx) +W1* (rlx * psi) - (1.0-0.5*rlx)*0.05555555555555555*rho_e;
//........................................................................
// q = 7
dist[nr8] = W2*psi; //f7 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
// q = 8
dist[nr7] = W2*psi; //f8 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
// q = 9
dist[nr10] = W2*psi; //f9 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
// q = 10
dist[nr9] = W2*psi; //f10 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
// q = 11
dist[nr12] = W2*psi; //f11 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
// q = 12
dist[nr11] = W2*psi; //f12 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
// q = 13
dist[nr14] = W2*psi; //f13 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
// q= 14
dist[nr13] = W2*psi; //f14 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
// q = 15
dist[nr16] = W2*psi; //f15 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
// q = 16
dist[nr15] = W2*psi; //f16 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
// q = 17
dist[nr18] = W2*psi; //f17 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
// q = 18
dist[nr17] = W2*psi; //f18 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
}
}
}
__global__ void dvc_ScaLBL_D3Q19_AAeven_Poisson(int *Map, double *dist,
double *Den_charge, double *Psi,
double *ElectricField, double *Error, double tau,
double epsilon_LB, bool UseSlippingVelBC,
int start, int finish, int Np) {
int n;
double psi; //electric potential
double Ex, Ey, Ez; //electric field
double rho_e; //local charge density
double f0, f1, f2, f3, f4, f5, f6, f7, f8, f9, f10, f11, f12, f13, f14, f15,
f16, f17, f18;
double error,sum_q;
double rlx = 1.0 / tau;
int idx;
double W0 = 0.5;
double W1 = 1.0/24.0;
double W2 = 1.0/48.0;
int S = Np/NBLOCKS/NTHREADS + 1;
for (int s=0; s<S; s++){
//........Get 1-D index for this thread....................
n = S*blockIdx.x*blockDim.x + s*blockDim.x + threadIdx.x + start;
if (n<finish) {
//Load data
//When Helmholtz-Smoluchowski slipping velocity BC is used, the bulk fluid is considered as electroneutral
//and thus the net space charge density is zero.
rho_e = (UseSlippingVelBC==1) ? 0.0 : Den_charge[n] / epsilon_LB;
f0 = dist[n];
f1 = dist[2 * Np + n];
f2 = dist[1 * Np + n];
f3 = dist[4 * Np + n];
f4 = dist[3 * Np + n];
f5 = dist[6 * Np + n];
f6 = dist[5 * Np + n];
f7 = dist[8 * Np + n];
f8 = dist[7 * Np + n];
f9 = dist[10 * Np + n];
f10 = dist[9 * Np + n];
f11 = dist[12 * Np + n];
f12 = dist[11 * Np + n];
f13 = dist[14 * Np + n];
f14 = dist[13 * Np + n];
f15 = dist[16 * Np + n];
f16 = dist[15 * Np + n];
f17 = dist[18 * Np + n];
f18 = dist[17 * Np + n];
/* Ex = (f1 - f2) * rlx *
4.0; //NOTE the unit of electric field here is V/lu
Ey = (f3 - f4) * rlx *
4.0; //factor 4.0 is D3Q7 lattice squared speed of sound
Ez = (f5 - f6) * rlx * 4.0;
*/
Ex = (f1 - f2 + 0.5*(f7 - f8 + f9 - f10 + f11 - f12 + f13 - f14))*4.0; //NOTE the unit of electric field here is V/lu
Ey = (f3 - f4 + 0.5*(f7 - f8 - f9 + f10 + f15 - f16 + f17 - f18))*4.0;
Ez = (f5 - f6 + 0.5*(f11 - f12 - f13 + f14 + f15 - f16 - f17 + f18))*4.0;
ElectricField[n + 0 * Np] = Ex;
ElectricField[n + 1 * Np] = Ey;
ElectricField[n + 2 * Np] = Ez;
sum_q = f1+f2+f3+f4+f5+f6+f7+f8+f9+f10+f11+f12+f13+f14+f15+f16+f17+f18;
error = 8.0*(sum_q - f0) + rho_e;
Error[n] = error;
psi = 2.0*(f0*(1.0 - rlx) + rlx*(sum_q + 0.125*rho_e));
idx = Map[n];
Psi[idx] = psi;
// q = 0
dist[n] = W0*psi;//
// q = 1
dist[1 * Np + n] = W1*psi;//f1 * (1.0 - rlx) +W1* (rlx * psi) - (1.0-0.5*rlx)*0.05555555555555555*rho_e;
// q = 2
dist[2 * Np + n] = W1*psi;//f2 * (1.0 - rlx) +W1* (rlx * psi) - (1.0-0.5*rlx)*0.05555555555555555*rho_e;
// q = 3
dist[3 * Np + n] = W1*psi;//f3 * (1.0 - rlx) +W1* (rlx * psi) - (1.0-0.5*rlx)*0.05555555555555555*rho_e;
// q = 4
dist[4 * Np + n] = W1*psi;//f4 * (1.0 - rlx) +W1* (rlx * psi) - (1.0-0.5*rlx)*0.05555555555555555*rho_e;
// q = 5
dist[5 * Np + n] = W1*psi;//f5 * (1.0 - rlx) +W1* (rlx * psi) - (1.0-0.5*rlx)*0.05555555555555555*rho_e;
// q = 6
dist[6 * Np + n] = W1*psi;//f6 * (1.0 - rlx) +W1* (rlx * psi) - (1.0-0.5*rlx)*0.05555555555555555*rho_e;
dist[7 * Np + n] = W2*psi;//f7 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
dist[8 * Np + n] = W2*psi;//f8* (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
dist[9 * Np + n] = W2*psi;//f9 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
dist[10 * Np + n] = W2*psi;//f10 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
dist[11 * Np + n] = W2*psi;//f11 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
dist[12 * Np + n] = W2*psi;//f12 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
dist[13 * Np + n] = W2*psi;//f13 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
dist[14 * Np + n] = W2*psi;//f14 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
dist[15 * Np + n] = W2*psi;//f15 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
dist[16 * Np + n] = W2*psi;//f16 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
dist[17 * Np + n] = W2*psi;//f17 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
dist[18 * Np + n] = W2*psi;//f18 * (1.0 - rlx) +W2* (rlx * psi) - (1.0-0.5*rlx)*0.02777777777777778*rho_e;
//........................................................................
}
}
}
__global__ void dvc_ScaLBL_D3Q19_Poisson_Init(int *Map, double *dist, double *Psi,
int start, int finish, int Np) {
int n;
int ijk;
double W0 = 0.5;
double W1 = 1.0/24.0;
double W2 = 1.0/48.0;
int S = Np/NBLOCKS/NTHREADS + 1;
for (int s=0; s<S; s++){
//........Get 1-D index for this thread....................
n = S*blockIdx.x*blockDim.x + s*blockDim.x + threadIdx.x + start;
if (n<finish) {
ijk = Map[n];
dist[0 * Np + n] = W0 * Psi[ijk];//3333333333333333* Psi[ijk];
dist[1 * Np + n] = W1 * Psi[ijk];
dist[2 * Np + n] = W1 * Psi[ijk];
dist[3 * Np + n] = W1 * Psi[ijk];
dist[4 * Np + n] = W1 * Psi[ijk];
dist[5 * Np + n] = W1 * Psi[ijk];
dist[6 * Np + n] = W1 * Psi[ijk];
dist[7 * Np + n] = W2* Psi[ijk];
dist[8 * Np + n] = W2* Psi[ijk];
dist[9 * Np + n] = W2* Psi[ijk];
dist[10 * Np + n] = W2* Psi[ijk];
dist[11 * Np + n] = W2* Psi[ijk];
dist[12 * Np + n] = W2* Psi[ijk];
dist[13 * Np + n] = W2* Psi[ijk];
dist[14 * Np + n] = W2* Psi[ijk];
dist[15 * Np + n] = W2* Psi[ijk];
dist[16 * Np + n] = W2* Psi[ijk];
dist[17 * Np + n] = W2* Psi[ijk];
dist[18 * Np + n] = W2* Psi[ijk];
}
}
}
__global__ void dvc_ScaLBL_D3Q19_AAeven_Poisson_Potential_BC_z(int *list, double *dist, double Vin, int count, int Np) {
double W1 = 1.0/24.0;
double W2 = 1.0/48.0;
int idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
int n = list[idx];
dist[6 * Np + n] = W1*Vin;
dist[12 * Np + n] = W2*Vin;
dist[13 * Np + n] = W2*Vin;
dist[16 * Np + n] = W2*Vin;
dist[17 * Np + n] = W2*Vin;
}
}
__global__ void dvc_ScaLBL_D3Q19_AAeven_Poisson_Potential_BC_Z(int *list, double *dist, double Vout, int count, int Np) {
double W1 = 1.0/24.0;
double W2 = 1.0/48.0;
int idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
int n = list[idx];
dist[5 * Np + n] = W1*Vout;
dist[11 * Np + n] = W2*Vout;
dist[14 * Np + n] = W2*Vout;
dist[15 * Np + n] = W2*Vout;
dist[18 * Np + n] = W2*Vout;
}
}
__global__ void dvc_ScaLBL_D3Q19_AAodd_Poisson_Potential_BC_z(int *d_neighborList, int *list, double *dist, double Vin, int count, int Np) {
double W1 = 1.0/24.0;
double W2 = 1.0/48.0;
int nr5, nr11, nr14, nr15, nr18;
int idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
int n = list[idx];
// Unknown distributions
nr5 = d_neighborList[n + 4 * Np];
nr11 = d_neighborList[n + 10 * Np];
nr15 = d_neighborList[n + 14 * Np];
nr14 = d_neighborList[n + 13 * Np];
nr18 = d_neighborList[n + 17 * Np];
dist[nr5] = W1*Vin;
dist[nr11] = W2*Vin;
dist[nr15] = W2*Vin;
dist[nr14] = W2*Vin;
dist[nr18] = W2*Vin;
}
}
__global__ void dvc_ScaLBL_D3Q19_AAodd_Poisson_Potential_BC_Z(int *d_neighborList, int *list, double *dist, double Vout, int count, int Np) {
double W1 = 1.0/24.0;
double W2 = 1.0/48.0;
int nr6, nr12, nr13, nr16, nr17;
int idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
int n = list[idx];
// unknown distributions
nr6 = d_neighborList[n + 5 * Np];
nr12 = d_neighborList[n + 11 * Np];
nr16 = d_neighborList[n + 15 * Np];
nr17 = d_neighborList[n + 16 * Np];
nr13 = d_neighborList[n + 12 * Np];
dist[nr6] = W1*Vout;
dist[nr12] = W2*Vout;
dist[nr16] = W2*Vout;
dist[nr17] = W2*Vout;
dist[nr13] = W2*Vout;
}
}
/* wrapper functions to launch kernels */
extern "C" void ScaLBL_D3Q19_AAeven_Poisson_Potential_BC_z(int *list, double *dist, double Vin, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q19_AAeven_Poisson_Potential_BC_z<<<GRID,512>>>(list, dist, Vin, count, Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in ScaLBL_D3Q19_AAeven_Poisson_Potential_BC_z (kernel): %s \n",cudaGetErrorString(err));
}
}
//
extern "C" void ScaLBL_D3Q19_AAeven_Poisson_Potential_BC_Z(int *list, double *dist, double Vout, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q19_AAeven_Poisson_Potential_BC_Z<<<GRID,512>>>(list, dist, Vout, count, Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in ScaLBL_D3Q19_AAeven_Poisson_Potential_BC_Z (kernel): %s \n",cudaGetErrorString(err));
}
}
extern "C" void ScaLBL_D3Q19_AAodd_Poisson_Potential_BC_z(int *d_neighborList, int *list, double *dist, double Vin, int count,int Np) {
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q19_AAodd_Poisson_Potential_BC_z<<<GRID,512>>>(d_neighborList, list, dist, Vin, count, Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in ScaLBL_D3Q19_AAodd_Poisson_Potential_BC_z (kernel): %s \n",cudaGetErrorString(err));
}
}
//
extern "C" void ScaLBL_D3Q19_AAodd_Poisson_Potential_BC_Z(int *d_neighborList, int *list, double *dist, double Vout, int count, int Np) {
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q19_AAodd_Poisson_Potential_BC_Z<<<GRID,512>>>(d_neighborList, list, dist, Vout, count, Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in ScaLBL_D3Q19_AAodd_Poisson_Potential_BC_Z (kernel): %s \n",cudaGetErrorString(err));
}
}
extern "C" void ScaLBL_D3Q19_AAodd_Poisson(int *neighborList, int *Map,
double *dist, double *Den_charge,
double *Psi, double *ElectricField,
double tau, double epsilon_LB, bool UseSlippingVelBC,
int start, int finish, int Np) {
//cudaProfilerStart();
dvc_ScaLBL_D3Q19_AAodd_Poisson<<<NBLOCKS,NTHREADS >>>(neighborList, Map,
dist, Den_charge, Psi, ElectricField, tau, epsilon_LB, UseSlippingVelBC, start, finish, Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in dvc_ScaLBL_D3Q19_AAodd_Poisson: %s \n",cudaGetErrorString(err));
}
}
extern "C" void ScaLBL_D3Q19_AAeven_Poisson(int *Map, double *dist,
double *Den_charge, double *Psi,
double *ElectricField, double *Error, double tau,
double epsilon_LB, bool UseSlippingVelBC,
int start, int finish, int Np) {
dvc_ScaLBL_D3Q19_AAeven_Poisson<<<NBLOCKS,NTHREADS >>>( Map, dist, Den_charge, Psi,
ElectricField, Error, tau, epsilon_LB, UseSlippingVelBC, start, finish, Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in dvc_ScaLBL_D3Q19_AAeven_Poisson: %s \n",cudaGetErrorString(err));
}
}
extern "C" void ScaLBL_D3Q19_Poisson_Init(int *Map, double *dist, double *Psi,
int start, int finish, int Np){
//cudaProfilerStart();
dvc_ScaLBL_D3Q19_Poisson_Init<<<NBLOCKS,NTHREADS >>>(Map, dist, Psi, start, finish, Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
printf("CUDA error in ScaLBL_D3Q19_Poisson_Init: %s \n",cudaGetErrorString(err));
}
}
extern "C" void ScaLBL_D3Q7_AAodd_Poisson_ElectricPotential(int *neighborList,int *Map, double *dist, double *Psi, int start, int finish, int Np){
//cudaProfilerStart();
@@ -293,10 +834,10 @@ extern "C" void ScaLBL_D3Q7_AAeven_Poisson_ElectricPotential(int *Map, double *d
//cudaProfilerStop();
}
extern "C" void ScaLBL_D3Q7_AAodd_Poisson(int *neighborList, int *Map, double *dist, double *Den_charge, double *Psi, double *ElectricField, double tau, double epsilon_LB,int start, int finish, int Np){
extern "C" void ScaLBL_D3Q7_AAodd_Poisson(int *neighborList, int *Map, double *dist, double *Den_charge, double *Psi, double *ElectricField, double tau, double epsilon_LB,bool UseSlippingVelBC,int start, int finish, int Np){
//cudaProfilerStart();
dvc_ScaLBL_D3Q7_AAodd_Poisson<<<NBLOCKS,NTHREADS >>>(neighborList,Map,dist,Den_charge,Psi,ElectricField,tau,epsilon_LB,start,finish,Np);
dvc_ScaLBL_D3Q7_AAodd_Poisson<<<NBLOCKS,NTHREADS >>>(neighborList,Map,dist,Den_charge,Psi,ElectricField,tau,epsilon_LB,UseSlippingVelBC,start,finish,Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
@@ -305,10 +846,10 @@ extern "C" void ScaLBL_D3Q7_AAodd_Poisson(int *neighborList, int *Map, double *d
//cudaProfilerStop();
}
extern "C" void ScaLBL_D3Q7_AAeven_Poisson(int *Map, double *dist, double *Den_charge, double *Psi, double *ElectricField, double tau, double epsilon_LB,int start, int finish, int Np){
extern "C" void ScaLBL_D3Q7_AAeven_Poisson(int *Map, double *dist, double *Den_charge, double *Psi, double *ElectricField, double tau, double epsilon_LB,bool UseSlippingVelBC,int start, int finish, int Np){
//cudaProfilerStart();
dvc_ScaLBL_D3Q7_AAeven_Poisson<<<NBLOCKS,NTHREADS >>>(Map,dist,Den_charge,Psi,ElectricField,tau,epsilon_LB,start,finish,Np);
dvc_ScaLBL_D3Q7_AAeven_Poisson<<<NBLOCKS,NTHREADS >>>(Map,dist,Den_charge,Psi,ElectricField,tau,epsilon_LB,UseSlippingVelBC,start,finish,Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){


@@ -5,7 +5,7 @@
#define NBLOCKS 1024
#define NTHREADS 256
__global__ void dvc_ScaLBL_D3Q19_AAodd_StokesMRT(int *neighborList, double *dist, double *Velocity, double *ChargeDensity, double *ElectricField, double rlx_setA, double rlx_setB, double Gx, double Gy, double Gz, double rho0, double den_scale, double h, double time_conv,int start, int finish, int Np){
__global__ void dvc_ScaLBL_D3Q19_AAodd_StokesMRT(int *neighborList, double *dist, double *Velocity, double *ChargeDensity, double *ElectricField, double rlx_setA, double rlx_setB, double Gx, double Gy, double Gz, double rho0, double den_scale, double h, double time_conv,bool UseSlippingVelBC,int start, int finish, int Np){
int n;
double fq;
@@ -46,9 +46,12 @@ __global__ void dvc_ScaLBL_D3Q19_AAodd_StokesMRT(int *neighborList, double *dis
Ey = ElectricField[n+1*Np];
Ez = ElectricField[n+2*Np];
//compute total body force, including input body force (Gx,Gy,Gz)
Fx = Gx + rhoE*Ex*(time_conv*time_conv)/(h*h*1.0e-12)/den_scale;
Fy = Gy + rhoE*Ey*(time_conv*time_conv)/(h*h*1.0e-12)/den_scale;
Fz = Gz + rhoE*Ez*(time_conv*time_conv)/(h*h*1.0e-12)/den_scale;
Fx = (UseSlippingVelBC==1) ? Gx : Gx + rhoE * Ex * (time_conv * time_conv) / (h * h * 1.0e-12) /
den_scale; //the extra factors at the end necessarily convert unit from phys to LB
Fy = (UseSlippingVelBC==1) ? Gy : Gy + rhoE * Ey * (time_conv * time_conv) / (h * h * 1.0e-12) /
den_scale;
Fz = (UseSlippingVelBC==1) ? Gz : Gz + rhoE * Ez * (time_conv * time_conv) / (h * h * 1.0e-12) /
den_scale;
// q=0
fq = dist[n];
@@ -510,7 +513,7 @@ __global__ void dvc_ScaLBL_D3Q19_AAodd_StokesMRT(int *neighborList, double *dis
}
}
__global__ void dvc_ScaLBL_D3Q19_AAeven_StokesMRT(double *dist, double *Velocity, double *ChargeDensity, double *ElectricField, double rlx_setA, double rlx_setB, double Gx, double Gy, double Gz,double rho0, double den_scale, double h, double time_conv, int start, int finish, int Np){
__global__ void dvc_ScaLBL_D3Q19_AAeven_StokesMRT(double *dist, double *Velocity, double *ChargeDensity, double *ElectricField, double rlx_setA, double rlx_setB, double Gx, double Gy, double Gz,double rho0, double den_scale, double h, double time_conv, bool UseSlippingVelBC, int start, int finish, int Np){
int n;
double fq;
@@ -550,9 +553,12 @@ __global__ void dvc_ScaLBL_D3Q19_AAeven_StokesMRT(double *dist, double *Velocit
Ey = ElectricField[n+1*Np];
Ez = ElectricField[n+2*Np];
//compute total body force, including input body force (Gx,Gy,Gz)
Fx = Gx + rhoE*Ex*(time_conv*time_conv)/(h*h*1.0e-12)/den_scale;//the extra factors at the end necessarily convert unit from phys to LB
Fy = Gy + rhoE*Ey*(time_conv*time_conv)/(h*h*1.0e-12)/den_scale;
Fz = Gz + rhoE*Ez*(time_conv*time_conv)/(h*h*1.0e-12)/den_scale;
Fx = (UseSlippingVelBC==1) ? Gx : Gx + rhoE * Ex * (time_conv * time_conv) / (h * h * 1.0e-12) /
den_scale; //the extra factors at the end necessarily convert unit from phys to LB
Fy = (UseSlippingVelBC==1) ? Gy : Gy + rhoE * Ey * (time_conv * time_conv) / (h * h * 1.0e-12) /
den_scale;
Fz = (UseSlippingVelBC==1) ? Gz : Gz + rhoE * Ez * (time_conv * time_conv) / (h * h * 1.0e-12) /
den_scale;
// q=0
fq = dist[n];
@@ -969,10 +975,10 @@ __global__ void dvc_ScaLBL_D3Q19_AAeven_StokesMRT(double *dist, double *Velocit
}
}
extern "C" void ScaLBL_D3Q19_AAodd_StokesMRT(int *neighborList, double *dist, double *Velocity, double *ChargeDensity, double *ElectricField, double rlx_setA, double rlx_setB, double Gx, double Gy, double Gz,double rho0, double den_scale, double h, double time_conv, int start, int finish, int Np){
extern "C" void ScaLBL_D3Q19_AAodd_StokesMRT(int *neighborList, double *dist, double *Velocity, double *ChargeDensity, double *ElectricField, double rlx_setA, double rlx_setB, double Gx, double Gy, double Gz,double rho0, double den_scale, double h, double time_conv, bool UseSlippingVelBC, int start, int finish, int Np){
//cudaProfilerStart();
dvc_ScaLBL_D3Q19_AAodd_StokesMRT<<<NBLOCKS,NTHREADS >>>(neighborList,dist,Velocity,ChargeDensity,ElectricField,rlx_setA,rlx_setB,Gx,Gy,Gz,rho0,den_scale,h,time_conv,start,finish,Np);
dvc_ScaLBL_D3Q19_AAodd_StokesMRT<<<NBLOCKS,NTHREADS >>>(neighborList,dist,Velocity,ChargeDensity,ElectricField,rlx_setA,rlx_setB,Gx,Gy,Gz,rho0,den_scale,h,time_conv,UseSlippingVelBC,start,finish,Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){
@@ -981,10 +987,10 @@ extern "C" void ScaLBL_D3Q19_AAodd_StokesMRT(int *neighborList, double *dist, do
//cudaProfilerStop();
}
extern "C" void ScaLBL_D3Q19_AAeven_StokesMRT(double *dist, double *Velocity, double *ChargeDensity, double *ElectricField, double rlx_setA, double rlx_setB, double Gx, double Gy, double Gz,double rho0, double den_scale, double h, double time_conv, int start, int finish, int Np){
extern "C" void ScaLBL_D3Q19_AAeven_StokesMRT(double *dist, double *Velocity, double *ChargeDensity, double *ElectricField, double rlx_setA, double rlx_setB, double Gx, double Gy, double Gz,double rho0, double den_scale, double h, double time_conv, bool UseSlippingVelBC, int start, int finish, int Np){
//cudaProfilerStart();
dvc_ScaLBL_D3Q19_AAeven_StokesMRT<<<NBLOCKS,NTHREADS >>>(dist,Velocity,ChargeDensity,ElectricField,rlx_setA,rlx_setB,Gx,Gy,Gz,rho0,den_scale,h,time_conv,start,finish,Np);
dvc_ScaLBL_D3Q19_AAeven_StokesMRT<<<NBLOCKS,NTHREADS >>>(dist,Velocity,ChargeDensity,ElectricField,rlx_setA,rlx_setB,Gx,Gy,Gz,rho0,den_scale,h,time_conv,UseSlippingVelBC,start,finish,Np);
cudaError_t err = cudaGetLastError();
if (cudaSuccess != err){


@@ -0,0 +1,137 @@
*************************
Measuring Contact Angles
*************************
LBPM includes specialized data analysis capabilities for two-fluid systems. While these
components are generally designed for in situ analysis of simulation data, they can also
be applied independently to analyze 3D image data. In this example we consider applying
the analysis tools implemented in ``lbpm_TwoPhase_analysis``, which are designed to
analyze two-fluid configurations in porous media. The numerical implementation used to
construct the common line is described in ( https://doi.org/10.1016/j.advwatres.2006.06.010 ).
Methods used to measure the contact angle are described in ( https://doi.org/10.1017/jfm.2016.212 ).
Source files for the example are included in the LBPM repository
in the directory ``examples/Droplet``. A simple Python script is included
to set up a fluid droplet on a flat surface:
.. code:: python

   import numpy as np
   import matplotlib.pylab as plt

   # label 1 = surrounding fluid, 2 = droplet, 0 = solid
   D=np.ones((80,80,40),dtype="uint8")

   # droplet center, placed at the base of the domain
   cx = 40
   cy = 40
   cz = 0

   for i in range(0,80):
       for j in range (0, 80):
           for k in range (0,40):
               dist = np.sqrt((i-cx)*(i-cx) + (j-cy)*(j-cy) + (k-cz)*(k-cz))
               if (dist < 25) :
                   D[i,j,k] = 2   # inside the droplet
               if k < 4 :
                   D[i,j,k] = 0   # flat solid surface, four voxels thick
   D.tofile("droplet_40x80x80.raw")

The input file below specifies how the analysis is performed. The name and dimensions of the input
image are provided in the ``Domain`` section, as with other LBPM simulators. For large images, additional processors can be
used to speed up the analysis or to take advantage of distributed memory. The ``ReadValues`` list specifies the
labels to use for the analysis. The first label is taken to be the solid. The second label is taken to be the fluid
to analyze, which in this case is the droplet labeled with ``2`` above.
.. code:: bash
Domain {
Filename = "droplet_40x80x80.raw"
nproc = 1, 1, 1 // Number of processors (Npx,Npy,Npz)
n = 40, 80, 80 // Size of local domain (Nx,Ny,Nz)
N = 40, 80, 80 // size of the input image
voxel_length = 1.0
BC = 0 // Boundary condition type
ReadType = "8bit"
ReadValues = 0, 2, 1
WriteValues = 0, 2, 1
}
Visualization {
}
The analysis can be launched as ``mpirun -np 1 $LBPM_DIR/lbpm_TwoPhase_analysis input.db``. Output should appear as
follows:
.. code:: bash
Input data file: input.db
voxel length = 1.000000 micron
Input media: droplet_40x80x80.raw
Relabeling 3 values
oldvalue=0, newvalue =0
oldvalue=2, newvalue =2
oldvalue=1, newvalue =1
Dimensions of segmented image: 40 x 80 x 80
Reading 8-bit input data
Read segmented data from droplet_40x80x80.raw
Label=0, Count=25600
Label=2, Count=25773
Label=1, Count=204627
Distributing subdomains across 1 processors
Process grid: 1 x 1 x 1
Subdomain size: 40 x 80 x 80
Size of transition region: 0
Media porosity = 0.900000
Initialized solid phase -- Converting to Signed Distance function
Initialized fluid phase -- Converting to Signed Distance function
Computing Minkowski functionals
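
The solid label count and the porosity reported in the log can be cross-checked against the droplet geometry. The sketch below is an illustration only and is not part of LBPM: the helper names ``droplet_labels`` and ``porosity`` are hypothetical, the labels are rebuilt in pure Python rather than read from disk, and label ``0`` is assumed to be the solid, as in ``ReadValues``.

.. code:: python

   from collections import Counter


   def droplet_labels(nx=80, ny=80, nz=40, cx=40, cy=40, cz=0, radius=25, wall=4):
       """Rebuild the droplet labels from the setup script as a flat bytes object."""
       labels = bytearray()
       for i in range(nx):
           for j in range(ny):
               for k in range(nz):
                   if k < wall:
                       value = 0  # solid slab (overrides the droplet, as in the script)
                   elif (i - cx) ** 2 + (j - cy) ** 2 + (k - cz) ** 2 < radius ** 2:
                       value = 2  # droplet
                   else:
                       value = 1  # surrounding fluid
                   labels.append(value)
       return bytes(labels)


   def porosity(counts, solid_label=0):
       """Fraction of voxels that do not carry the solid label."""
       total = sum(counts.values())
       return 1.0 - counts[solid_label] / total


   # Counter over a bytes object tallies each 8-bit label value
   counts = Counter(droplet_labels())
   print("solid voxels:", counts[0])
   print("porosity:", porosity(counts))

The same ``Counter`` call can be applied to the bytes read from ``droplet_40x80x80.raw`` (opened in binary mode), which makes it easy to compare all three label counts with the values printed by ``lbpm_TwoPhase_analysis``.
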
The ``TwoPhase`` analysis class will generate signed distance functions for the solid and fluid surfaces.
Using the distance functions, the interfaces and common line will be constructed. Contact angles are logged
for each processor, e.g. ``ContactAngle.00000.csv``, which specifies the x, y, z coordinates for each measurement
along with the cosine of the contact angle. Averaged measures (determined over the entire input image)
are logged to ``geometry.csv``:
* ``sw`` -- water saturation
* ``awn`` -- surface area of meniscus between wn fluids
* ``ans`` -- surface area between fluid n and solid
* ``aws`` -- surface area between fluid w and solid
* ``Jwn`` -- integral of mean curvature of meniscus
* ``Kwn`` -- integral of Gaussian curvature of meniscus
* ``lwns`` -- length of common line
* ``cwns`` -- average contact angle
* ``KGws`` -- geodesic curvature of common line relative to ws surface
* ``KGwn`` -- geodesic curvature of common line relative to wn surface
* ``Gwnxx`` -- orientation tensor component for wn surface
* ``Gwnyy`` -- orientation tensor component for wn surface
* ``Gwnzz`` -- orientation tensor component for wn surface
* ``Gwnxy`` -- orientation tensor component for wn surface
* ``Gwnxz`` -- orientation tensor component for wn surface
* ``Gwnyz`` -- orientation tensor component for wn surface
* ``Gwsxx`` -- orientation tensor component for ws surface
* ``Gwsyy`` -- orientation tensor component for ws surface
* ``Gwszz`` -- orientation tensor component for ws surface
* ``Gwsxy`` -- orientation tensor component for ws surface
* ``Gwsxz`` -- orientation tensor component for ws surface
* ``Gwsyz`` -- orientation tensor component for ws surface
* ``Gnsxx`` -- orientation tensor component for ns surface
* ``Gnsyy`` -- orientation tensor component for ns surface
* ``Gnszz`` -- orientation tensor component for ns surface
* ``Gnsxy`` -- orientation tensor component for ns surface
* ``Gnsxz`` -- orientation tensor component for ns surface
* ``Gnsyz`` -- orientation tensor component for ns surface
* ``trawn`` -- trimmed surface area for meniscus (one voxel from solid)
* ``trJwn`` -- mean curvature for trimmed meniscus
* ``trRwn`` -- radius of curvature for trimmed meniscus
* ``Vw`` -- volume of fluid w
* ``Aw`` -- boundary surface area for fluid w
* ``Jw`` -- integral of mean curvature for fluid w
* ``Xw`` -- Euler characteristic for fluid w
* ``Vn`` -- volume of fluid n
* ``An`` -- boundary surface area for fluid n
* ``Jn`` -- integral of mean curvature for fluid n
* ``Xn`` -- Euler characteristic for fluid n
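Several of the logged measures are integrals, so a common post-processing step is to normalize them by the matching area. The sketch below computes an average mean curvature of the meniscus as ``Jwn / awn``. It assumes that ``geometry.csv`` has a single header row using the names listed above and that fields are separated by a single space; both are assumptions to check against the file actually written. The sample row contains made-up numbers for illustration only.

```python
import csv
import io

# Hypothetical file contents: header names follow the list above,
# the numeric values are invented purely for illustration.
SAMPLE = "sw awn Jwn lwns\n0.35 1200.0 -96.0 310.0\n"


def read_geometry(stream, delimiter=" "):
    """Parse rows of averaged measures into dictionaries of floats.

    The delimiter is an assumption; adjust it to match the real file.
    """
    reader = csv.DictReader(stream, delimiter=delimiter)
    return [{key: float(value) for key, value in row.items()} for row in reader]


rows = read_geometry(io.StringIO(SAMPLE))
# Average mean curvature of the meniscus: integral divided by area.
mean_curvature_wn = rows[0]["Jwn"] / rows[0]["awn"]
print(mean_curvature_wn)
```

For a real run, replace ``io.StringIO(SAMPLE)`` with an open file handle for ``geometry.csv``.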


@@ -0,0 +1,125 @@
********************************
Steady-state fractional flow
********************************
In this example we simulate a steady-state flow with a constant driving force. This will enforce a periodic boundary condition
in all directions. While the driving force may be set in any direction, we will set it in the z-direction to be consistent
with the convention for pressure and velocity boundary conditions.
For the case considered in ``example/DiscPack`` we specify the following information in the input file
.. code:: c
Domain {
Filename = "discs_3x128x128.raw.morphdrain.raw"
ReadType = "8bit" // data type
N = 3, 128, 128 // size of original image
nproc = 1, 2, 2 // process grid
n = 3, 64, 64 // sub-domain size
voxel_length = 1.0 // voxel length (in microns)
ReadValues = 0, 1, 2 // labels within the original image
WriteValues = 0, 1, 2 // associated labels to be used by LBPM
BC = 0 // fully periodic BC
Sw = 0.35 // target saturation for morphological tools
}
Color {
protocol = "fractional flow"
capillary_number = -1e-5 // capillary number for the displacement, positive="oil injection"
timestepMax = 200000 // maximum timestep
alpha = 0.01 // controls interfacial tension
rhoA = 1.0 // controls the density of fluid A
rhoB = 1.0 // controls the density of fluid B
tauA = 0.7 // controls the viscosity of fluid A
tauB = 0.7 // controls the viscosity of fluid B
F = 0, 0, 1e-5 // body force
WettingConvention = "SCAL"
ComponentLabels = 0 // image labels for solid voxels
ComponentAffinity = 0.9 // controls the wetting affinity for each label
Restart = false
}
Analysis {
analysis_interval = 1000 // logging interval for timelog.csv
subphase_analysis_interval = 500000 // logging interval for subphase.csv
N_threads = 4 // number of analysis threads (GPU version only)
visualization_interval = 10000 // interval to write visualization files
restart_interval = 10000000 // interval to write restart file
restart_file = "Restart" // base name of restart file
}
Visualization {
format = "hdf5"
write_silo = true // write SILO databases with assigned variables
save_8bit_raw = true // write labeled 8-bit binary files with phase assignments
save_phase_field = true // save phase field within SILO database
save_pressure = true // save pressure field within SILO database
save_velocity = false // save velocity field within SILO database
}
FlowAdaptor {
max_steady_timesteps = 25000 // maximum number of timesteps per steady point
min_steady_timesteps = 25000 // minimum number of timesteps per steady point
fractional_flow_increment = 0.0003 // parameter that controls rate of mass seeding
skip_timesteps = 10000 // number of timesteps to spend in flow adaptor
}
Once this has been set, we launch ``lbpm_color_simulator`` in the same way as other parallel tools
.. code:: bash
mpirun -np 4 $LBPM_BIN/lbpm_color_simulator input.db
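The rank count passed to ``mpirun`` has to agree with the process grid in the ``Domain`` section. The sketch below checks the values used in this example. It assumes the sub-domains are meant to tile the whole image, as they do here (``nproc * n == N`` along each axis); the helper name ``check_decomposition`` is hypothetical and is not part of LBPM.

```python
from math import prod


def check_decomposition(N, nproc, n):
    """Return the number of MPI ranks implied by the process grid.

    Raises ValueError if the sub-domains do not tile the image exactly
    (an assumption made for this sketch).
    """
    for axis, (N_i, p_i, n_i) in enumerate(zip(N, nproc, n)):
        if p_i * n_i != N_i:
            raise ValueError(f"axis {axis}: nproc*n = {p_i * n_i}, but N = {N_i}")
    return prod(nproc)


# Values from the Domain section of this example.
ranks = check_decomposition(N=(3, 128, 128), nproc=(1, 2, 2), n=(3, 64, 64))
print(ranks)  # number to pass to "mpirun -np"
```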
Successful output looks like the following
.. code:: bash
********************************************************
Running Color LBM
********************************************************
voxel length = 1.000000 micron
voxel length = 1.000000 micron
Input media: discs_3x128x128.raw.morphdrain.raw
Relabeling 3 values
oldvalue=0, newvalue =0
oldvalue=1, newvalue =1
oldvalue=2, newvalue =2
Dimensions of segmented image: 3 x 128 x 128
Reading 8-bit input data
Read segmented data from discs_3x128x128.raw.morphdrain.raw
Label=0, Count=11862
Label=1, Count=26430
Label=2, Count=10860
Distributing subdomains across 4 processors
Process grid: 1 x 2 x 2
Subdomain size: 3 x 64 x 64
Size of transition region: 0
Media porosity = 0.758667
Initialized solid phase -- Converting to Signed Distance function
Domain set.
Create ScaLBL_Communicator
Set up memory efficient layout, 9090 | 9120 | 21780
Allocating distributions
Setting up device map and neighbor list
Component labels: 1
label=0, affinity=-0.900000, volume fraction==0.417582
Initializing distributions
Initializing phase field
Affinities - rank 0:
Main: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
Thread 1: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
Thread 2: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
Thread 3: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
Thread 4: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
Affinities - rank 0:
Main: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
Thread 1: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
Thread 2: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
Thread 3: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
Thread 4: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
********************************************************
CPU time = 0.001501
Lattice update rate (per core)= 6.074861 MLUPS
Lattice update rate (per MPI process)= 6.074861 MLUPS
(flatten density field)
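As a sanity check on the log above, the reported porosity can be recovered from the label counts. The sketch below assumes that label ``0`` is the only solid label (consistent with ``ComponentLabels = 0``) and that porosity is the fraction of non-solid voxels.

```python
# Label counts copied from the log output above.
counts = {0: 11862, 1: 26430, 2: 10860}
solid_labels = {0}  # assumption: label 0 is the only solid label

total_voxels = sum(counts.values())
pore_voxels = sum(c for label, c in counts.items() if label not in solid_labels)
porosity = pore_voxels / total_voxels

print(total_voxels)        # 3 * 128 * 128 voxels in the image
print(f"{porosity:.6f}")   # compare with "Media porosity" in the log
```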


@@ -1,5 +1,126 @@
*******************
Steady-state flow
*******************
********************************
Steady-state flow (color model)
********************************
In this example we simulate a steady-state flow with a constant driving force. This will enforce a periodic boundary condition
in all directions. While the driving force may be set in any direction, we will set it in the z-direction to be consistent
with the convention for pressure and velocity boundary conditions.
For the case considered in ``example/DiscPack`` we specify the following information in the input file
.. code:: c
Domain {
Filename = "discs_3x128x128.raw.morphdrain.raw"
ReadType = "8bit" // data type
N = 3, 128, 128 // size of original image
nproc = 1, 2, 2 // process grid
n = 3, 64, 64 // sub-domain size
voxel_length = 1.0 // voxel length (in microns)
ReadValues = 0, 1, 2 // labels within the original image
WriteValues = 0, 1, 2 // associated labels to be used by LBPM
BC = 0 // fully periodic BC
Sw = 0.35 // target saturation for morphological tools
}
Color {
protocol = "fractional flow"
capillary_number = 1e-5 // capillary number for the displacement, positive="oil injection"
timestepMax = 500000 // maximum timestep
alpha = 0.005 // controls interfacial tension
rhoA = 1.0 // controls the density of fluid A
rhoB = 1.0 // controls the density of fluid B
tauA = 0.7 // controls the viscosity of fluid A
tauB = 0.7 // controls the viscosity of fluid B
F = 0, 0, 1e-5 // body force
WettingConvention = "SCAL"
ComponentLabels = 0 // image labels for solid voxels
ComponentAffinity = 0.9 // controls the wetting affinity for each label
Restart = false
}
Analysis {
analysis_interval = 1000 // logging interval for timelog.csv
subphase_analysis_interval = 500000 // logging interval for subphase.csv
N_threads = 4 // number of analysis threads (GPU version only)
visualization_interval = 1000000 // interval to write visualization files
restart_interval = 10000000 // interval to write restart file
restart_file = "Restart" // base name of restart file
}
Visualization {
format = "hdf5"
write_silo = true // write SILO databases with assigned variables
save_8bit_raw = true // write labeled 8-bit binary files with phase assignments
save_phase_field = true // save phase field within SILO database
save_pressure = true // save pressure field within SILO database
save_velocity = false // save velocity field within SILO database
}
FlowAdaptor {
min_steady_timesteps = 250000 // minimum number of timesteps per steady point
max_steady_timesteps = 300000 // maximum number of timesteps per steady point
fractional_flow_increment = 0.1 // parameter that controls rate of mass seeding
skip_timesteps = 10000 // number of timesteps to spend in flow adaptor
endpoint_threshold = 0.1 // endpoint exit criterion
}
Once this has been set, we launch ``lbpm_color_simulator`` in the same way as other parallel tools
.. code:: bash
mpirun -np 4 $LBPM_BIN/lbpm_color_simulator input.db
Successful output looks like the following
.. code:: bash
********************************************************
Running Color LBM
********************************************************
voxel length = 1.000000 micron
voxel length = 1.000000 micron
Input media: discs_3x128x128.raw.morphdrain.raw
Relabeling 3 values
oldvalue=0, newvalue =0
oldvalue=1, newvalue =1
oldvalue=2, newvalue =2
Dimensions of segmented image: 3 x 128 x 128
Reading 8-bit input data
Read segmented data from discs_3x128x128.raw.morphdrain.raw
Label=0, Count=11862
Label=1, Count=26430
Label=2, Count=10860
Distributing subdomains across 4 processors
Process grid: 1 x 2 x 2
Subdomain size: 3 x 64 x 64
Size of transition region: 0
Media porosity = 0.758667
Initialized solid phase -- Converting to Signed Distance function
Domain set.
Create ScaLBL_Communicator
Set up memory efficient layout, 9090 | 9120 | 21780
Allocating distributions
Setting up device map and neighbor list
Component labels: 1
label=0, affinity=-0.900000, volume fraction==0.417582
Initializing distributions
Initializing phase field
Affinities - rank 0:
Main: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
Thread 1: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
Thread 2: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
Thread 3: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
Thread 4: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
Affinities - rank 0:
Main: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
Thread 1: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
Thread 2: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
Thread 3: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
Thread 4: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
********************************************************
CPU time = 0.001501
Lattice update rate (per core)= 6.074861 MLUPS
Lattice update rate (per MPI process)= 6.074861 MLUPS
(flatten density field)
In this example we simulate a steady-state flow with a constant driving force.


@@ -0,0 +1,213 @@
********************************
Membrane Charging Dynamics
********************************
In this example, we consider membrane charging dynamics for a simple cell.
For the case considered in ``example/SingleCell`` an input membrane geometry is provided in the
file ``Bacterium.swc``, which specifies an oblong cell shape, relying on the ``.swc`` file format that
is commonly used to approximate neuron structures. The case considered is the four-ion membrane transport
problem shown in Figure 4 of McClure & Li.
The cell simulation is performed by the executable ``lbpm_nernst_planck_cell_simulator``, which is launched
in the same way as other parallel tools
.. code:: bash
mpirun -np 2 $LBPM_BIN/lbpm_nernst_planck_cell_simulator Bacterium.db
The input file ``Bacterium.db`` specifies the following
.. code:: c
MultiphysController {
timestepMax = 25000
num_iter_Ion_List = 4
analysis_interval = 100
tolerance = 1.0e-9
visualization_interval = 1000 // Frequency to write visualization data
}
Ions {
use_membrane = true
Restart = false
MembraneIonConcentrationList = 150.0e-3, 10.0e-3, 15.0e-3, 155.0e-3 //user-input unit: [mol/m^3]
temperature = 293.15 //unit [K]
number_ion_species = 4 //number of ions
tauList = 1.0, 1.0, 1.0, 1.0
IonDiffusivityList = 1.0e-9, 1.0e-9, 1.0e-9, 1.0e-9 //user-input unit: [m^2/sec]
IonValenceList = 1, -1, 1, -1 //valence charge of ions; dimensionless; positive/negative integer
IonConcentrationList = 4.0e-3, 20.0e-3, 16.0e-3, 0.0e-3 //user-input unit: [mol/m^3]
BC_Solid = 0 //solid boundary condition; 0=non-flux BC; 1=surface ion concentration
FluidVelDummy = 0.0, 0.0, 0.0 // dummy fluid velocity for debugging
BC_InletList = 0, 0, 0, 0
BC_OutletList = 0, 0, 0, 0
}
Poisson {
lattice_scheme = "D3Q19"
epsilonR = 78.5 //fluid dielectric constant [dimensionless]
BC_Inlet = 0 // ->1: fixed electric potential; ->2: sine/cosine periodic electric potential
BC_Outlet = 0 // ->1: fixed electric potential; ->2: sine/cosine periodic electric potential
//--------------------------------------------------------------------------
//--------------------------------------------------------------------------
BC_Solid = 2 //solid boundary condition; 1=surface potential; 2=surface charge density
SolidLabels = 0 //solid labels for assigning solid boundary condition
SolidValues = 0 //if surface potential, unit=[V]; if surface charge density, unit=[C/m^2]
WriteLog = true //write convergence log for LB-Poisson solver
//------------------------------ advanced setting ------------------------------------
timestepMax = 4000 //max timestep for obtaining steady-state electrical potential
analysis_interval = 25 //timestep checking steady-state convergence
tolerance = 1.0e-10 //stopping criterion for steady-state solution
InitialValueLabels = 1, 2
InitialValues = 0.0, 0.0
}
Domain {
Filename = "Bacterium.swc"
nproc = 2, 1, 1 // Number of processors (Npx,Npy,Npz)
n = 64, 64, 64 // Size of local domain (Nx,Ny,Nz)
N = 128, 64, 64 // size of the input image
voxel_length = 0.01 //resolution; user-input unit: [um]
BC = 0 // Boundary condition type
ReadType = "swc"
ReadValues = 0, 1, 2
WriteValues = 0, 1, 2
}
Analysis {
analysis_interval = 100
subphase_analysis_interval = 50 // Frequency to perform analysis
restart_interval = 5000 // Frequency to write restart data
restart_file = "Restart" // Filename to use for restart file (will append rank)
N_threads = 4 // Number of threads to use
load_balance = "independent" // Load balance method to use: "none", "default", "independent"
}
Visualization {
save_electric_potential = true
save_concentration = true
save_velocity = false
}
Membrane {
MembraneLabels = 2
VoltageThreshold = 0.0, 0.0, 0.0, 0.0
MassFractionIn = 1e-1, 1.0, 5e-3, 0.0
MassFractionOut = 1e-1, 1.0, 5e-3, 0.0
ThresholdMassFractionIn = 1e-1, 1.0, 5e-3, 0.0
ThresholdMassFractionOut = 1e-1, 1.0, 5e-3, 0.0
}
*******************
Example Output
*******************
Successful output looks like the following
.. code:: bash
********************************************************
Running LBPM Nernst-Planck Membrane solver
********************************************************
.... Read membrane permeability (MassFractionIn)
.... Read membrane permeability (MassFractionOut)
.... Read membrane permeability (ThresholdMassFractionIn)
.... Read membrane permeability (ThresholdMassFractionOut)
.... Read MembraneIonConcentrationList
voxel length = 0.010000 micron
voxel length = 0.010000 micron
Reading SWC file...
Number of lines in SWC file: 7
Number of lines extracted is: 7
shift swc data by 0.150000, 0.140000, 0.140000
Media porosity = 1.000000
LB Ion Solver: Initialized solid phase & converting to Signed Distance function
Domain set.
LB Ion Solver: Create ScaLBL_Communicator
LB Ion Solver: Set up memory efficient layout
LB Ion Solver: Allocating distributions
LB Ion Solver: Setting up device map and neighbor list
**** Creating membrane data structure ******
Number of active lattice sites (rank = 0): 262160
Membrane labels: 1
label=2, volume fraction = 0.133917
Creating membrane data structure...
Copy initial neighborlist...
Cut membrane links...
(cut 7105 links crossing membrane)
Construct membrane data structures...
Create device data structures...
Construct communication data structures...
Ion model setup complete
Analyze system with sub-domain size = 66 x 66 x 66
Set up analysis routines for 4 ions
LB Ion Solver: initializing D3Q7 distributions
...initializing based on membrane list
.... Set concentration(0): inside=0.15 [mol/m^3], outside=0.004 [mol/m^3]
.... Set concentration(1): inside=0.01 [mol/m^3], outside=0.02 [mol/m^3]
.... Set concentration(2): inside=0.015 [mol/m^3], outside=0.016 [mol/m^3]
.... Set concentration(3): inside=0.155 [mol/m^3], outside=0 [mol/m^3]
LB Ion Solver: initializing charge density
LB Ion Solver: solid boundary: non-flux boundary is assigned
LB Ion Solver: inlet boundary for Ion 1 is periodic
LB Ion Solver: outlet boundary for Ion 1 is periodic
LB Ion Solver: inlet boundary for Ion 2 is periodic
LB Ion Solver: outlet boundary for Ion 2 is periodic
LB Ion Solver: inlet boundary for Ion 3 is periodic
LB Ion Solver: outlet boundary for Ion 3 is periodic
LB Ion Solver: inlet boundary for Ion 4 is periodic
LB Ion Solver: outlet boundary for Ion 4 is periodic
*****************************************************
LB Ion Transport Solver:
Ion 1: LB relaxation tau = 1
Time conversion factor: 1.25e-08 [sec/lt]
Internal iteration: 2 [lt]
Ion 2: LB relaxation tau = 1
Time conversion factor: 1.25e-08 [sec/lt]
Internal iteration: 2 [lt]
Ion 3: LB relaxation tau = 1
Time conversion factor: 1.25e-08 [sec/lt]
Internal iteration: 2 [lt]
Ion 4: LB relaxation tau = 1
Time conversion factor: 1.25e-08 [sec/lt]
Internal iteration: 2 [lt]
*****************************************************
Ion model initialized
Main loop time_conv computed from ion 1: 2.5e-08[s/lt]
Main loop time_conv computed from ion 2: 2.5e-08[s/lt]
Main loop time_conv computed from ion 3: 2.5e-08[s/lt]
Main loop time_conv computed from ion 4: 2.5e-08[s/lt]
***********************************************************************************
LB-Poisson Solver: steady-state MaxTimeStep = 4000; steady-state tolerance = 1e-10
LB relaxation tau = 3.5
***********************************************************************************
LB-Poisson Solver: Use averaged MSE to check solution convergence.
LB-Poisson Solver: Use D3Q19 lattice structure.
voxel length = 0.010000 micron
voxel length = 0.010000 micron
Reading SWC file...
Number of lines in SWC file: 7
Number of lines extracted is: 7
shift swc data by 0.150000, 0.140000, 0.140000
Media porosity = 1.000000
LB-Poisson Solver: Initialized solid phase & converting to Signed Distance function
Domain set.
LB-Poisson Solver: Create ScaLBL_Communicator
LB-Poisson Solver: Set up memory efficient layout
LB-Poisson Solver: Allocating distributions
LB-Poisson Solver: Setting up device map and neighbor list
.... LB-Poisson Solver: check neighbor list
.... LB-Poisson Solver: copy neighbor list to GPU
Poisson solver created
LB-Poisson Solver: initializing D3Q19 distributions
LB-Poisson Solver: number of Poisson solid labels: 1
label=0, surface potential=0 [V], volume fraction=0
LB-Poisson Solver: number of Poisson initial-value labels: 2
label=1, initial potential=0 [V], volume fraction=0.96
label=2, initial potential=0 [V], volume fraction=0.13
POISSON MODEL: Reading restart file!
Poisson solver initialized
... getting Poisson solver error
-------------------------------------------------------------------
set coefficients
********************************************************
CPU time = 0.008526
Lattice update rate (per core)= 30.749833 MLUPS
Lattice update rate (total)= 61.499666 MLUPS
********************************************************


@@ -18,4 +18,7 @@ a basic introduction to working with LBPM.
morphology/*
color/*
analysis/*
membrane/*


@@ -2,12 +2,16 @@
Publications
============
* James E McClure, Zhe Li, Mark Berrill, Thomas Ramstad, "The LBPM software package for simulating multiphase flow on digital images of porous rocks" Computational Geosciences (25) 871-895 (2021) https://doi.org/10.1007/s10596-020-10028-9
* J.E. McClure, Z. Li, M. Berrill, T. Ramstad, "The LBPM software package for simulating multiphase flow on digital images of porous rocks" Computational Geosciences (25) 871-895 (2021) https://doi.org/10.1007/s10596-020-10028-9
* James E. McClure, Zhe Li, Adrian P. Sheppard, Cass T. Miller, "An adaptive volumetric flux boundary condition for lattice Boltzmann methods" Computers & Fluids (210) (2020) https://doi.org/10.1016/j.compfluid.2020.104670
* J. E. McClure, Z. Li, A.P. Sheppard, C.T. Miller, "An adaptive volumetric flux boundary condition for lattice Boltzmann methods" Computers & Fluids (210) (2020) https://doi.org/10.1016/j.compfluid.2020.104670
* Y.D. Wang, T. Chung, R.T. Armstrong, J. McClure, T. Ramstad, P. Mostaghimi, "Accelerated Computation of Relative Permeability by Coupled Morphological and Direct Multiphase Flow Simulation" Journal of Computational Physics (401) (2020) https://doi.org/10.1016/j.jcp.2019.108966
* J.E. McClure, M. Berrill, W. Gray, C.T. Miller, "Tracking interface and common curve dynamics for two-fluid flow in porous media" Journal of Fluid Mechanics, 796, 211-232 (2016) https://doi.org/10.1017/jfm.2016.212
* J.E. McClure, D. Adalsteinsson, C. Pan, W.G. Gray, C.T. Miller, "Approximation of interfacial properties in multiphase porous medium systems" Advances in Water Resources, 30 (3): 354-365 (2007) https://doi.org/10.1016/j.advwatres.2006.06.010
* J.E. McClure, Z. Li "Capturing membrane structure and function in lattice Boltzmann models" arXiv preprint arXiv:2208.14122 https://arxiv.org/pdf/2208.14122.pdf


@@ -1,6 +0,0 @@
=============================================
Poisson-Boltzmann model
=============================================
The LBPM Poisson-Boltzmann solver is designed to solve the Poisson-Boltzmann equation
to solve for the electric field in an ionic fluid.


@@ -0,0 +1,366 @@
=============================================
Cell model
=============================================
LBPM includes a whole-cell simulator based on a coupled solution of the Nernst-Planck equations with Gauss's law.
The resulting model is fully non-equilibrium, and can resolve the dynamics of how ions diffuse through the cellular
environment when subjected to complex membrane responses.
The lattice Boltzmann formulation is described below.
*********************
Nernst-Planck model
*********************
The Nernst-Planck model is designed to model ion transport based on the
Nernst-Planck equation.
.. math::
:nowrap:
$$
\frac{\partial C_k}{\partial t} + \nabla \cdot \mathbf{j}_k = 0
$$
where
.. math::
:nowrap:
$$
\mathbf{j}_k = C_k \mathbf{u} - D_k \Big( \nabla C_k + \frac{z_k C_k}{V_T} \nabla \psi\Big)
$$
An LBM solution is developed using a three-dimensional, seven-velocity (D3Q7) lattice structure for each species. Each distribution is associated with a particular discrete velocity, such that the concentration is given by their sum,
.. math::
:nowrap:
$$
C_k = \sum_{q=0}^{6} f^k_q \;.
$$
Lattice Boltzmann equations (LBEs) are defined to determine the evolution of the distributions
.. math::
:nowrap:
$$
f^{k}_q (\mathbf{x}_n + \bm{\xi}_q \Delta t, t+ \Delta t)-
f^{k}_q (\mathbf{x}_n, t) = -\frac{1}{\lambda_k}
\Big( f^{k}_q - f^{eq}_q \Big)\;,
$$
where the relaxation time :math:`\lambda_k` controls the bulk diffusion coefficient,
.. math::
:nowrap:
$$
D_k = c_s^2\Big( \lambda_k - \frac 12\Big)\;.
$$
The speed of sound for the D3Q7 lattice model is :math:`c_s^2 = \frac 14` and the weights are :math:`W_0 = 1/4` and :math:`W_1,\ldots, W_6 = 1/8`.
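The relation between the relaxation time and the diffusion coefficient also fixes the physical duration of one lattice timestep. The sketch below assumes the usual lattice-unit conversion :math:`\Delta t = D_k^{lattice} \Delta x^2 / D_k^{physical}`; the function names are hypothetical. With the values from ``example/SingleCell`` (``tauList = 1.0``, ``IonDiffusivityList = 1.0e-9``, ``voxel_length = 0.01`` micron) it gives the same value as the "Time conversion factor" line in that example's log output.

```python
def lattice_diffusivity(tau, cs2=0.25):
    """Diffusion coefficient in lattice units, D = c_s^2 (tau - 1/2)."""
    return cs2 * (tau - 0.5)


def time_conversion(tau, D_phys, dx):
    """Seconds per lattice timestep, assuming D_phys = D_lattice * dx^2 / dt.

    D_phys is in m^2/s and dx is the voxel length in metres.
    """
    return lattice_diffusivity(tau) * dx**2 / D_phys


# tau = 1.0, D = 1.0e-9 m^2/s, voxel length = 0.01 micron = 1.0e-8 m
dt = time_conversion(tau=1.0, D_phys=1.0e-9, dx=0.01e-6)
print(dt)  # seconds per lattice timestep
```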
Equilibrium distributions are established from the fact that the molecular velocity distribution follows a Gaussian distribution within the bulk fluids,
.. math::
:nowrap:
$$
f^{eq}_q = W_q C_k \Big[ 1 + \frac{\bm{\xi}_q\cdot \mathbf{u}^\prime}{c_s^2} \Big]\;.
$$
The velocity is given by
.. math::
:nowrap:
$$
\mathbf{u}^\prime = \mathbf{u} - \frac{z_k D_k}{V_T} \nabla \psi \;.
$$
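The equilibrium distribution and the effective velocity can be sketched for a single lattice site as follows. All quantities are in lattice units. The ordering of the seven discrete velocities is an assumption made for illustration and may differ from the ordering used inside LBPM; the function names are hypothetical.

```python
# D3Q7 velocity set (ordering assumed for this sketch) and weights.
XI = [(0, 0, 0), (1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
W = [1 / 4] + [1 / 8] * 6
CS2 = 1 / 4


def effective_velocity(u, z, D, V_T, grad_psi):
    """u' = u - (z D / V_T) grad(psi): advection plus electro-migration."""
    return tuple(u_i - z * D / V_T * g_i for u_i, g_i in zip(u, grad_psi))


def equilibrium(C, u_prime):
    """f_q^eq = W_q C [1 + xi_q . u' / c_s^2] for q = 0..6."""
    return [
        w * C * (1.0 + sum(x * v for x, v in zip(xi, u_prime)) / CS2)
        for w, xi in zip(W, XI)
    ]


u_prime = effective_velocity(
    u=(0.0, 0.0, 0.01), z=1, D=0.125, V_T=1.0, grad_psi=(0.0, 0.0, 0.02)
)
f_eq = equilibrium(C=1.0, u_prime=u_prime)

# Zeroth moment recovers the concentration, first moment recovers C * u'.
concentration = sum(f_eq)
flux_z = sum(f * xi[2] for f, xi in zip(f_eq, XI))
```

The two moments at the end show why this form is used: summing the distributions returns :math:`C_k`, and the first moment returns the advective and electro-migration flux :math:`C_k \mathbf{u}^\prime`.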
Keys for the Nernst-Planck solver are provided in the ``Ions`` section of the input file database. Supported keys are
- ``use_membrane`` -- set up a membrane structure (defaults to ``true`` if not specified)
- ``Restart`` -- read concentrations from restart file (defaults to ``false`` if not specified)
- ``number_ion_species`` -- number of ions to use in the model
- ``temperature`` -- temperature to use for the thermal voltage (:math:`V_T=k_B T / e`, where the electron charge is :math:`e=1.6\times10^{-19}` Coulomb)
- ``FluidVelDummy`` -- vector providing a dummy fluid velocity field (for advection component)
- ``ElectricFieldDummy`` -- vector providing a dummy electric field (for force component)
- ``tauList`` -- list of relaxation times to set the diffusion coefficient based on :math:`\lambda_k`.
- ``IonDiffusivityList`` -- list of physical ion diffusivities in units :math:`\mbox{m}^2/\mbox{second}`.
- ``IonValenceList`` -- list of ion valence charges for each ion in the model.
- ``IonConcentrationList`` -- list of concentrations to set for each ion.
- ``MembraneIonConcentrationList`` -- list of concentrations to set for each ion inside the membrane.
- ``BC_InletList`` -- boundary conditions for each ion at the z-inlet (``0`` for periodic, ``1`` to set concentration)
- ``BC_OutletList`` -- boundary conditions for each ion at the z-outlet
- ``InletValueList`` -- concentration value to set at the inlet (if not periodic)
- ``OutletValueList`` -- concentration value to set at the outlet (if not periodic)
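For orientation, the thermal voltage that follows from the ``temperature`` key can be evaluated directly. This is a minimal sketch using the exact SI values of the Boltzmann constant and elementary charge, which are slightly more precise than the rounded charge quoted in the list above; the function name is hypothetical.

```python
K_B = 1.380649e-23          # Boltzmann constant [J/K]
E_CHARGE = 1.602176634e-19  # elementary charge [C]


def thermal_voltage(temperature):
    """V_T = k_B T / e in volts, for a temperature in kelvin."""
    return K_B * temperature / E_CHARGE


# temperature = 293.15 K, as in the example input file
V_T = thermal_voltage(293.15)
print(f"{V_T * 1e3:.2f} mV")
```

At 293.15 K this is roughly 25 mV, which sets the scale at which potential differences noticeably bias ion transport.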
*********************
Gauss's Law Model
*********************
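The LBPM Gauss's law solver is designed to solve for the electric field in an ionic fluid. The Laplacian of the electric potential :math:`\psi` is approximated on the D3Q19 lattice using the finite-difference stencil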
.. math::
:nowrap:
$$
\nabla^2_{fe} \psi (\mathbf{x}_i) = \frac{1}{6 \Delta x^2}
\Bigg( 2 \sum_{q=1}^{6} \psi(\mathbf{x}_i + \bm{\xi}_q \Delta t)
+ \sum_{q=7}^{18} \psi(\mathbf{x}_i + \bm{\xi}_q \Delta t)
- 24 \psi (\mathbf{x}_i) \Bigg) \;,
$$
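The stencil can be checked numerically on a function whose Laplacian is known. The sketch below evaluates it for a callable potential, using a grid spacing ``dx`` for the neighbor offsets; the function name and the test potential are illustrative assumptions.

```python
from itertools import product

# Six axis neighbors (q = 1..6) and twelve edge-diagonal neighbors (q = 7..18).
OFFSETS = list(product((-1, 0, 1), repeat=3))
AXIS = [xi for xi in OFFSETS if sum(abs(c) for c in xi) == 1]
DIAG = [xi for xi in OFFSETS if sum(abs(c) for c in xi) == 2]


def laplacian_fe(psi, x, dx=1.0):
    """19-point finite-difference Laplacian of a callable psi(x, y, z)."""
    def at(xi):
        return psi(*(x_i + c * dx for x_i, c in zip(x, xi)))

    axis_sum = sum(at(xi) for xi in AXIS)
    diag_sum = sum(at(xi) for xi in DIAG)
    return (2.0 * axis_sum + diag_sum - 24.0 * psi(*x)) / (6.0 * dx**2)


def quadratic(x, y, z):
    """Test potential with exact Laplacian equal to 6."""
    return x * x + y * y + z * z


value = laplacian_fe(quadratic, (0.3, -1.2, 2.0))
print(value)
```

The stencil is exact for quadratic potentials, so the printed value should match the analytical Laplacian up to floating-point rounding.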
The equilibrium functions are defined as
.. math::
:nowrap:
$$
g_q^{eq} = w_q \psi\;,
$$
where :math:`w_0=1/2`, :math:`w_q=1/24` for :math:`q=1,\ldots,6` and :math:`w_q=1/48` for :math:`q=7,\ldots,18`
which implies that
.. math::
:nowrap:
$$
\psi = \sum_{q=0}^{Q} g_q^{eq}\;.
$$
Given a particular initial condition for :math:`\psi`, let us consider application of the standard D3Q19 streaming step based on the equilibrium distributions
.. math::
:nowrap:
$$
g_q^\prime(\mathbf{x}, t) = g_q^{eq}(\mathbf{x}-\bm{\xi}_q\Delta t, t+ \Delta t)\;.
$$
Relative to the solution of Gauss's law, the error is given by
.. math::
:nowrap:
$$
\varepsilon_{\psi} =
8 \Big[ -g_0 + \sum_{q=1}^Q g_q^\prime(\mathbf{x}, t) \Big]
+ \frac{\rho_e}{\epsilon_r \epsilon_0} \;.
$$
Using the fact that :math:`g_0 = w_0 \psi`, we can compute the value
:math:`\psi^\prime` that would eliminate the error. We set :math:`\varepsilon_{\psi}=0`
and rearrange terms to obtain
.. math::
:nowrap:
$$
\psi^\prime (\mathbf{x},t) = \frac{1}{w_0}\Big[ \sum_{q=1}^Q g_q^\prime(\mathbf{x}, t)
+ \frac{1}{8}\frac{\rho_e}{\epsilon_r \epsilon_0}\Big] \;.
$$
The local value of the potential is then updated based on a relaxation scheme, which is controlled by the relaxation time :math:`\tau_\psi`
.. math::
:nowrap:
$$
\psi(\mathbf{x},t+\Delta t) \leftarrow \Big(1 - \frac{1}{\tau_\psi} \Big )\psi (\mathbf{x},t)
+ \frac{1}{\tau_\psi} \psi^\prime (\mathbf{x},t)\;.
$$
The algorithm can then proceed to the next timestep.
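One iteration of this scheme can be sketched in pure Python on a small cubic grid. The sketch assumes periodic boundaries in all directions, lattice units with :math:`\Delta x = 1`, and a source term ``rho_over_eps`` that already holds :math:`\rho_e / (\epsilon_r \epsilon_0)` in lattice units. It is an illustration of the update rule only, not the LBPM implementation, and the function names are hypothetical.

```python
from itertools import product

# The 18 non-rest D3Q19 directions: 6 axis and 12 edge-diagonal neighbors.
Q19 = [xi for xi in product((-1, 0, 1), repeat=3) if 1 <= sum(abs(c) for c in xi) <= 2]
W0 = 1 / 2


def weight(xi):
    """w_q = 1/24 for axis directions, 1/48 for edge diagonals."""
    return 1 / 24 if sum(abs(c) for c in xi) == 1 else 1 / 48


def poisson_step(psi, rho_over_eps, tau):
    """Apply one streaming + relaxation update to psi on a periodic n^3 grid."""
    n = len(psi)
    new = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for i, j, k in product(range(n), repeat=3):
        # Streaming: g'_q(x) = w_q * psi(x - xi_q), summed over q = 1..18.
        streamed = sum(
            weight(xi) * psi[(i - xi[0]) % n][(j - xi[1]) % n][(k - xi[2]) % n]
            for xi in Q19
        )
        # Value of psi that sets the local error to zero.
        psi_prime = (streamed + rho_over_eps[i][j][k] / 8.0) / W0
        # Relaxation toward psi_prime, controlled by tau.
        new[i][j][k] = (1.0 - 1.0 / tau) * psi[i][j][k] + psi_prime / tau
    return new


n = 4
uniform = [[[0.3] * n for _ in range(n)] for _ in range(n)]
no_charge = [[[0.0] * n for _ in range(n)] for _ in range(n)]
after = poisson_step(uniform, no_charge, tau=3.5)
```

A uniform potential with zero charge density is a fixed point of the update, which is a quick consistency check on the weights.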
Keys to control the Gauss's law solver are specified in the ``Poisson`` section of the input database.
Supported keys are:
- ``Restart`` -- read electric potential from a restart file (default ``false``)
- ``timestepMax`` -- maximum number of timesteps to run before exiting
- ``tau`` -- relaxation time
- ``analysis_interval`` -- how often to check solution for steady state
- ``tolerance`` -- controls the required accuracy
- ``epsilonR`` -- controls the electric permittivity
- ``WriteLog`` -- write a convergence log
***************************
Membrane Model
***************************
The LBPM membrane model provides the basis to model cellular dynamics.
There are currently two supported ways to specify the membrane location:
1. provide a segmented image that is labeled to differentiate the cell
interior and exterior. See the script ``NaCl-cell.py`` and input file ``NaCl.db`` as a reference for how to use labeled images.
- ``IonConcentrationFile`` -- list of files that specify the initial concentration for each ion
- ``Filename`` -- 8-bit binary file provided in the ``Domain`` section of the input database
- ``ReadType`` -- this should be ``"8bit"`` (this is the default)
2. provide a ``.swc`` file that specifies the geometry (see example input file below).
- ``Filename`` -- swc file name should be provided in the ``Domain`` section of the input database
- ``ReadType`` -- this should be ``"swc"`` (required since ``"8bit"`` is the internal default)
Example input files for both cases are stored within the LBPM repository, located at ``example/SingleCell/``.
The membrane restricts the diffusion of ions. All lattice links crossing the membrane are stored in a dedicated data structure so that transport is decoupled from the bulk regions. Suppose that site :math:`\mathbf{x}_{q\ell}` is inside the membrane and :math:`\mathbf{x}_{p\ell}` is outside the membrane, with :math:`\mathbf{x}_{p \ell } = \mathbf{x}_{q\ell} + \bm{\xi}_q \Delta t`. For each species :math:`k`, transport across each link :math:`\ell` is controlled by a pair of coefficients, :math:`\alpha^k_{\ell p}` and :math:`\alpha^k_{\ell q}`. Ions moving from the outside to the inside are carried by the distribution associated with the direction :math:`\bm{\xi}_q`,
.. math::
:nowrap:
$$
{ f_{q}^{k \prime} (\mathbf{x}_{q \ell}) \gets (1-\alpha^k_{\ell q}) f_{q}^{k} (\mathbf{x}_{q\ell}) + \alpha^k_{\ell p } f_{ p}^{k} (\mathbf{x}_{p\ell})}
$$
Similarly, for ions transported from the inside to the outside
.. math::
:nowrap:
$$
{f_{p}^{k \prime} (\mathbf{x}_{p\ell}) \gets (1-\alpha^k_{\ell p}) f_{p}^{k} (\mathbf{x}_{p\ell}) + \alpha^k_{\ell q } f_{q}^{k} (\mathbf{x}_{q\ell})}
$$
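The two link updates above can be written as a single function acting on the pair of distributions that share a membrane link. This is a sketch of the update rule as stated in the equations, with a hypothetical function name; it is not the data structure LBPM uses internally.

```python
def membrane_link_update(f_q, f_p, alpha_q, alpha_p):
    """Exchange mass across one membrane link.

    f_q, f_p         -- distributions at the inside and outside sites
    alpha_q, alpha_p -- link coefficients for the two sides
    Returns the updated pair (f_q_new, f_p_new).
    """
    f_q_new = (1.0 - alpha_q) * f_q + alpha_p * f_p
    f_p_new = (1.0 - alpha_p) * f_p + alpha_q * f_q
    return f_q_new, f_p_new


f_q_new, f_p_new = membrane_link_update(f_q=0.15, f_p=0.004, alpha_q=0.1, alpha_p=0.1)
```

Each side keeps the fraction it does not give away and receives the fraction the other side gives, so the sum over the link is unchanged. Setting both coefficients to zero leaves the distributions untouched, which corresponds to an impermeable link.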
The basic closure relationship that is implemented is for voltage-gated ion channels.
Let :math:`\Delta \psi_\ell = \psi(\mathbf{x}_{p\ell} ,t) - \psi(\mathbf{x}_{q\ell},t)` be the membrane potential across link :math:`\ell`. Since :math:`\psi` is determined based on the charge density, :math:`\Delta \psi_\ell` can vary with both space and time. The behavior of the gate is implemented as follows,
.. math::
:nowrap:
$$
\Delta \psi_\ell > \tilde{V}_m\; \Rightarrow \; \mbox{gate is open} \; \Rightarrow \; \alpha^{k}_{\ell q} = \alpha_{1} + \alpha_2\;,
$$
and
.. math::
:nowrap:
$$
\Delta \psi_\ell \le \tilde{V}_m\; \Rightarrow \; \mbox{gate is closed}\; \Rightarrow \; \alpha^{k}_{\ell q} = \alpha_1\;,
$$
where :math:`\tilde{V}_m` is the membrane voltage threshold that controls the gate. Mass conservation dictates that
.. math::
:nowrap:
$$
\alpha_1 \ge 0\;, \quad \alpha_2 \ge 0\;, \quad \alpha_1 + \alpha_2 \le 1\;.
$$
The rule is enforced based on the Heaviside function, as follows
.. math::
:nowrap:
$$
\alpha_{\ell q}^{k} (\Delta \psi_\ell) = \alpha_1 + \alpha_2 H\big(\Delta \psi_\ell - \tilde{V}_m \big)\;.
$$
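The Heaviside rule translates directly into a small function. The sketch below uses a strict inequality so that the gate is closed when the membrane potential equals the threshold, matching the open and closed conditions stated above; the function name and the numeric values are illustrative assumptions.

```python
def gate_coefficient(delta_psi, v_threshold, alpha_1, alpha_2):
    """alpha = alpha_1 + alpha_2 * H(delta_psi - v_threshold).

    H is taken as 0 at zero, so the gate opens only when
    delta_psi is strictly greater than the threshold.
    """
    if alpha_1 < 0 or alpha_2 < 0 or alpha_1 + alpha_2 > 1:
        raise ValueError("need alpha_1 >= 0, alpha_2 >= 0, alpha_1 + alpha_2 <= 1")
    heaviside = 1.0 if delta_psi > v_threshold else 0.0
    return alpha_1 + alpha_2 * heaviside


closed = gate_coefficient(delta_psi=-0.02, v_threshold=0.0, alpha_1=0.005, alpha_2=0.1)
opened = gate_coefficient(delta_psi=0.03, v_threshold=0.0, alpha_1=0.005, alpha_2=0.1)
```

Here ``alpha_1`` acts as a leak coefficient that applies regardless of voltage, and ``alpha_2`` is the additional permeability contributed by the open gate.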
Note that different coefficients are specified for each ion in the model.
Keys for the membrane model are set in the ``Membrane`` section of the input file database. Supported keys are
- ``VoltageThreshold`` -- voltage threshold (may be different for each ion)
- ``MassFractionIn`` -- value of :math:`\alpha^k_{\ell p}` when the voltage threshold is not met
- ``MassFractionOut`` -- value of :math:`\alpha^k_{\ell q}` when the voltage threshold is not met
- ``ThresholdMassFractionIn`` -- value of :math:`\alpha^k_{\ell p}` when the voltage threshold is met
- ``ThresholdMassFractionOut`` -- value of :math:`\alpha^k_{\ell q}` when the voltage threshold is met
****************************
Example Input File
****************************
.. code-block:: c
MultiphysController {
timestepMax = 25000
num_iter_Ion_List = 4
analysis_interval = 100
tolerance = 1.0e-9
visualization_interval = 1000 // Frequency to write visualization data
}
Ions {
use_membrane = true
Restart = false
MembraneIonConcentrationList = 150.0e-3, 10.0e-3, 15.0e-3, 155.0e-3 //user-input unit: [mol/m^3]
temperature = 293.15 //unit [K]
number_ion_species = 4 //number of ions
tauList = 1.0, 1.0, 1.0, 1.0
IonDiffusivityList = 1.0e-9, 1.0e-9, 1.0e-9, 1.0e-9 //user-input unit: [m^2/sec]
IonValenceList = 1, -1, 1, -1 //valence charge of ions; dimensionless; positive/negative integer
IonConcentrationList = 4.0e-3, 20.0e-3, 16.0e-3, 0.0e-3 //user-input unit: [mol/m^3]
BC_Solid = 0 //solid boundary condition; 0=non-flux BC; 1=surface ion concentration
//SolidLabels = 0 //solid labels for assigning solid boundary condition; ONLY for BC_Solid=1
//SolidValues = 1.0e-5 // user-input surface ion concentration unit: [mol/m^2]; ONLY for BC_Solid=1
FluidVelDummy = 0.0, 0.0, 0.0 // dummy fluid velocity for debugging
BC_InletList = 0, 0, 0, 0
BC_OutletList = 0, 0, 0, 0
}
Poisson {
lattice_scheme = "D3Q19"
epsilonR = 78.5 //fluid dielectric constant [dimensionless]
BC_Inlet = 0 // ->1: fixed electric potential; ->2: sine/cosine periodic electric potential
BC_Outlet = 0 // ->1: fixed electric potential; ->2: sine/cosine periodic electric potential
//--------------------------------------------------------------------------
//--------------------------------------------------------------------------
BC_Solid = 2 //solid boundary condition; 1=surface potential; 2=surface charge density
SolidLabels = 0 //solid labels for assigning solid boundary condition
SolidValues = 0 //if surface potential, unit=[V]; if surface charge density, unit=[C/m^2]
WriteLog = true //write convergence log for LB-Poisson solver
// ------------------------------- Testing Utilities ----------------------------------------
// ONLY for code debugging; the following settings test sine/cosine voltage BCs; disabled by default
TestPeriodic = false
TestPeriodicTime = 1.0 //unit:[sec]
TestPeriodicTimeConv = 0.01 //unit:[sec]
TestPeriodicSaveInterval = 0.2 //unit:[sec]
//------------------------------ advanced setting ------------------------------------
timestepMax = 4000 //max timestep for obtaining steady-state electrical potential
analysis_interval = 25 //timestep checking steady-state convergence
tolerance = 1.0e-10 //stopping criterion for steady-state solution
InitialValueLabels = 1, 2
InitialValues = 0.0, 0.0
}
Domain {
Filename = "Bacterium.swc"
nproc = 2, 1, 1 // Number of processors (Npx,Npy,Npz)
n = 64, 64, 64 // Size of local domain (Nx,Ny,Nz)
N = 128, 64, 64 // size of the input image
voxel_length = 0.01 //resolution; user-input unit: [um]
BC = 0 // Boundary condition type
ReadType = "swc"
ReadValues = 0, 1, 2
WriteValues = 0, 1, 2
}
Analysis {
analysis_interval = 100
subphase_analysis_interval = 50 // Frequency to perform analysis
restart_interval = 5000 // Frequency to write restart data
restart_file = "Restart" // Filename to use for restart file (will append rank)
N_threads = 4 // Number of threads to use
load_balance = "independent" // Load balance method to use: "none", "default", "independent"
}
Visualization {
save_electric_potential = true
save_concentration = true
save_velocity = false
}
Membrane {
MembraneLabels = 2
VoltageThreshold = 0.0, 0.0, 0.0, 0.0
MassFractionIn = 1e-1, 1.0, 5e-3, 0.0
MassFractionOut = 1e-1, 1.0, 5e-3, 0.0
ThresholdMassFractionIn = 1e-1, 1.0, 5e-3, 0.0
ThresholdMassFractionOut = 1e-1, 1.0, 5e-3, 0.0
}


@@ -176,42 +176,70 @@ The non-zero equilibrium moments are defined as
:nowrap:
$$
m_1^{eq} = (j_x^2+j_y^2+j_z^2) - \alpha |\textbf{C}|, \\
m_1^{eq} = 19\frac{ j_x^2+j_y^2+j_z^2}{\rho_0} - 11\rho - 19 \alpha |\textbf{C}|, \\
$$
.. math::
:nowrap:
$$
m_2^{eq} = 3\rho - \frac{11( j_x^2+j_y^2+j_z^2)}{2\rho_0}, \\
$$
.. math::
:nowrap:
$$
m_4^{eq} = -\frac{2 j_x}{3}, \\
$$
.. math::
:nowrap:
$$
m_6^{eq} = -\frac{2 j_y}{3}, \\
$$
.. math::
:nowrap:
$$
m_8^{eq} = -\frac{2 j_z}{3}, \\
$$
.. math::
:nowrap:
$$
m_9^{eq} = (2j_x^2-j_y^2-j_z^2)+ \alpha \frac{|\textbf{C}|}{2}(2n_x^2-n_y^2-n_z^2), \\
m_9^{eq} = \frac{2j_x^2-j_y^2-j_z^2}{\rho_0}+ \alpha \frac{|\textbf{C}|}{2}(2n_x^2-n_y^2-n_z^2), \\
$$
.. math::
:nowrap:
$$
m_{11}^{eq} = (j_y^2-j_z^2) + \alpha \frac{|\textbf{C}|}{2}(n_y^2-n_z^2), \\
m_{11}^{eq} = \frac{j_y^2-j_z^2}{\rho_0} + \alpha \frac{|\textbf{C}|}{2}(n_y^2-n_z^2), \\
$$
.. math::
:nowrap:
$$
m_{13}^{eq} = j_x j_y + \alpha \frac{|\textbf{C}|}{2} n_x n_y\;, \\
m_{13}^{eq} = \frac{j_x j_y}{\rho_0} + \alpha \frac{|\textbf{C}|}{2} n_x n_y\;, \\
$$
.. math::
:nowrap:
$$
m_{14}^{eq} = j_y j_z + \alpha \frac{|\textbf{C}|}{2} n_y n_z\;, \\
m_{14}^{eq} = \frac{j_y j_z}{\rho_0} + \alpha \frac{|\textbf{C}|}{2} n_y n_z\;, \\
$$
.. math::
:nowrap:
$$
m_{15}^{eq} = j_x j_z + \alpha \frac{|\textbf{C}|}{2} n_x n_z\;,
m_{15}^{eq} = \frac{j_x j_z}{\rho_0} + \alpha \frac{|\textbf{C}|}{2} n_x n_z\;.
$$
where the color gradient is determined from the phase indicator field


@@ -120,7 +120,7 @@ two fluids are permitted to freely mix between the endpoints. Beyond the endpoin
term is used to drive spontaneous imbibition into the grey voxels
..math::
.. math::
:nowrap:
$$
@@ -241,46 +241,75 @@ The relaxation parameters are determined from the relaxation time:
The non-zero equilibrium moments are defined as
.. math::
:nowrap:
$$
m_1^{eq} = (j_x^2+j_y^2+j_z^2) - \alpha |\textbf{C}|, \\
m_1^{eq} = 19\frac{ j_x^2+j_y^2+j_z^2}{\rho_0} - 11\rho - 19 \alpha |\textbf{C}|, \\
$$
.. math::
:nowrap:
$$
m_2^{eq} = 3\rho - \frac{11( j_x^2+j_y^2+j_z^2)}{2\rho_0}, \\
$$
.. math::
:nowrap:
$$
m_4^{eq} = -\frac{2 j_x}{3}, \\
$$
.. math::
:nowrap:
$$
m_6^{eq} = -\frac{2 j_y}{3}, \\
$$
.. math::
:nowrap:
$$
m_8^{eq} = -\frac{2 j_z}{3}, \\
$$
.. math::
:nowrap:
$$
m_9^{eq} = (2j_x^2-j_y^2-j_z^2)+ \alpha \frac{|\textbf{C}|}{2}(2n_x^2-n_y^2-n_z^2), \\
m_9^{eq} = \frac{2j_x^2-j_y^2-j_z^2}{\rho_0}+ \alpha \frac{|\textbf{C}|}{2}(2n_x^2-n_y^2-n_z^2), \\
$$
.. math::
:nowrap:
$$
m_{11}^{eq} = (j_y^2-j_z^2) + \alpha \frac{|\textbf{C}|}{2}(n_y^2-n_z^2), \\
m_{11}^{eq} = \frac{j_y^2-j_z^2}{\rho_0} + \alpha \frac{|\textbf{C}|}{2}(n_y^2-n_z^2), \\
$$
.. math::
:nowrap:
$$
m_{13}^{eq} = j_x j_y + \alpha \frac{|\textbf{C}|}{2} n_x n_y\;, \\
m_{13}^{eq} = \frac{j_x j_y}{\rho_0} + \alpha \frac{|\textbf{C}|}{2} n_x n_y\;, \\
$$
.. math::
:nowrap:
$$
m_{14}^{eq} = j_y j_z + \alpha \frac{|\textbf{C}|}{2} n_y n_z\;, \\
m_{14}^{eq} = \frac{j_y j_z}{\rho_0} + \alpha \frac{|\textbf{C}|}{2} n_y n_z\;, \\
$$
.. math::
:nowrap:
$$
m_{15}^{eq} = j_x j_z + \alpha \frac{|\textbf{C}|}{2} n_x n_z\;,
m_{15}^{eq} = \frac{j_x j_z}{\rho_0} + \alpha \frac{|\textbf{C}|}{2} n_x n_z\;.
$$
where the color gradient is determined from the phase indicator field


@@ -12,9 +12,7 @@ Currently supported lattice Boltzmann models
mrt/*
nernstPlanck/*
PoissonBoltzmann/*
cell/*
greyscale/*


@@ -225,4 +225,5 @@ Example Input File
InletLayers = 0, 0, 10 // specify 10 layers along the z-inlet
BC = 0 // boundary condition type (0 for periodic)
}
Visualization {
}


@@ -1,6 +0,0 @@
=============================================
Nernst-Planck model
=============================================
The Nernst-Planck model is designed to model ion transport based on the
Nernst-Planck equation.


@@ -0,0 +1,18 @@
import numpy as np
import matplotlib.pylab as plt
# Fill the domain with label 1, then assign label 2 inside a sphere of radius 12.5
D=np.ones((40,40,40),dtype="uint8")
cx = 20
cy = 20
cz = 20
for i in range(0, 40):
    for j in range (0, 40):
        for k in range (0,40):
            dist = np.sqrt((i-cx)*(i-cx) + (j-cy)*(j-cy) + (k-cz)*(k-cz))
            if (dist < 12.5 ) :
                D[i,j,k] = 2
D.tofile("bubble_40x40x40.raw")


@@ -0,0 +1,77 @@
import numpy as np
import matplotlib.pylab as plt
# Fill the domain with label 1, then assign label 2 inside the cell (radius 15.5)
D=np.ones((40,40,40),dtype="uint8")
cx = 20
cy = 20
cz = 20
for i in range(0, 40):
    for j in range (0, 40):
        for k in range (0,40):
            dist = np.sqrt((i-cx)*(i-cx) + (j-cy)*(j-cy) + (k-cz)*(k-cz))
            if (dist < 15.5 ) :
                D[i,j,k] = 2
D.tofile("cell_40x40x40.raw")
C1=np.zeros((40,40,40),dtype="double")
C2=np.zeros((40,40,40),dtype="double")
C3=np.zeros((40,40,40),dtype="double")
C4=np.zeros((40,40,40),dtype="double")
C5=np.zeros((40,40,40),dtype="double")
C6=np.zeros((40,40,40),dtype="double")
for i in range(0, 40):
    for j in range (0, 40):
        for k in range (0,40):
            #outside the cell
            C1[i,j,k] = 4.0e-6 # K
            C2[i,j,k] = 150.0e-6 # Na
            C3[i,j,k] = 116.0e-6 # Cl
            C4[i,j,k] = 29.0e-6 # HCO3
            #C5[i,j,k] = 2.4e-6 # Ca
            dist = np.sqrt((i-cx)*(i-cx) + (j-cy)*(j-cy) + (k-cz)*(k-cz))
            # inside the cell
            if (dist < 15.5 ) :
                C1[i,j,k] = 145.0e-6
                C2[i,j,k] = 12.0e-6
                C3[i,j,k] = 4.0e-6
                C4[i,j,k] = 12.0e-6 # 12 mmol / L
                #C5[i,j,k] = 0.10e-6 # 100 nmol / L
# add up the total charge to make sure it is zero
TotalCharge = 0
for i in range(0, 40):
    for j in range (0, 40):
        for k in range (0,40):
            TotalCharge += C1[i,j,k] + C2[i,j,k] - C3[i,j,k] - C4[i,j,k]
TotalCharge /= (40*40*40)
print("Total charge " + str(TotalCharge))
for i in range(0, 40):
    for j in range (0, 40):
        for k in range (0,40):
            if TotalCharge < 0 :
                # need more cation
                C5[i,j,k] = abs(TotalCharge)
                C6[i,j,k] = 0.0
            else :
                # need more anion
                C5[i,j,k] = 0.0
                C6[i,j,k] = abs(TotalCharge)
C1.tofile("cell_concentration_K_40x40x40.raw")
C2.tofile("cell_concentration_Na_40x40x40.raw")
C3.tofile("cell_concentration_Cl_40x40x40.raw")
C4.tofile("cell_concentration_HCO3_40x40x40.raw")
C5.tofile("cell_concentration_cation_40x40x40.raw")
C6.tofile("cell_concentration_anion_40x40x40.raw")
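
The charge-balancing step in this script can also be written without explicit loops. The following is a minimal NumPy sketch of that step only. The concentration values are copied from the script, the array names are chosen here for illustration, and no files are written.

```python
import numpy as np

shape = (40, 40, 40)
i, j, k = np.indices(shape)
# Same sphere as the script: centre (20, 20, 20), radius 15.5
inside = np.sqrt((i - 20) ** 2 + (j - 20) ** 2 + (k - 20) ** 2) < 15.5

# np.where(mask, value_inside, value_outside), values taken from the script
K = np.where(inside, 145.0e-6, 4.0e-6)
Na = np.where(inside, 12.0e-6, 150.0e-6)
Cl = np.where(inside, 4.0e-6, 116.0e-6)
HCO3 = np.where(inside, 12.0e-6, 29.0e-6)

# Mean net charge of the four monovalent species
net = (K + Na - Cl - HCO3).mean()

# A uniform background ion of the opposite sign cancels the mean net charge
cation = np.full(shape, max(-net, 0.0))
anion = np.full(shape, max(net, 0.0))

residual = (K + Na + cation - Cl - HCO3 - anion).mean()
print("net charge before balancing:", net)
print("net charge after balancing:", residual)
```

As in the script, the balancing ion is distributed uniformly, so only the domain-averaged charge is cancelled; the local charge inside and outside the cell is left non-zero.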

75
example/Bubble/cell.db Normal file

@@ -0,0 +1,75 @@
MultiphysController {
timestepMax = 60
num_iter_Ion_List = 2
analysis_interval = 50
tolerance = 1.0e-9
visualization_interval = 100 // Frequency to write visualization data
analysis_interval = 50 // Frequency to perform analysis
}
Stokes {
tau = 1.0
F = 0, 0, 0
ElectricField = 0, 0, 0 //body electric field; user-input unit: [V/m]
nu_phys = 0.889e-6 //fluid kinematic viscosity; user-input unit: [m^2/sec]
}
Ions {
IonConcentrationFile = "cell_concentration_K_40x40x40.raw", "double", "cell_concentration_Na_40x40x40.raw", "double", "cell_concentration_Cl_40x40x40.raw", "double", "cell_concentration_HCO3_40x40x40.raw", "double", "cell_concentration_cation_40x40x40.raw", "double", "cell_concentration_anion_40x40x40.raw", "double"
temperature = 293.15 //unit [K]
number_ion_species = 6 //number of ions
tauList = 1.0, 1.0, 1.0, 1.0, 1.0, 1.0
IonDiffusivityList = 1.0e-9, 1.0e-9, 1.0e-9, 1.0e-9, 1.0e-9, 1.0e-9 //user-input unit: [m^2/sec]
IonValenceList = 1, 1, -1, -1, 1, -1 //valence charge of ions; dimensionless; positive/negative integer
IonConcentrationList = 1.0e-6, 1.0e-6, 1.0e-6, 1.0e-6, 1.0e-6, 1.0e-6 //user-input unit: [mol/m^3]
BC_Solid = 0 //solid boundary condition; 0=non-flux BC; 1=surface ion concentration
//SolidLabels = 0 //solid labels for assigning solid boundary condition; ONLY for BC_Solid=1
//SolidValues = 1.0e-5 // user-input surface ion concentration unit: [mol/m^2]; ONLY for BC_Solid=1
FluidVelDummy = 0.0, 0.0, 1.0e-2 // dummy fluid velocity for debugging
}
Poisson {
epsilonR = 78.5 //fluid dielectric constant [dimensionless]
BC_Inlet = 0 // ->1: fixed electric potential; ->2: sine/cosine periodic electric potential
BC_Outlet = 0 // ->1: fixed electric potential; ->2: sine/cosine periodic electric potential
//--------------------------------------------------------------------------
//--------------------------------------------------------------------------
BC_Solid = 2 //solid boundary condition; 1=surface potential; 2=surface charge density
SolidLabels = 0 //solid labels for assigning solid boundary condition
SolidValues = 0 //if surface potential, unit=[V]; if surface charge density, unit=[C/m^2]
WriteLog = true //write convergence log for LB-Poisson solver
// ------------------------------- Testing Utilities ----------------------------------------
// ONLY for code debugging; the following settings test sine/cosine voltage BCs; disabled by default
TestPeriodic = false
TestPeriodicTime = 1.0 //unit:[sec]
TestPeriodicTimeConv = 0.01 //unit:[sec]
TestPeriodicSaveInterval = 0.2 //unit:[sec]
//------------------------------ advanced setting ------------------------------------
timestepMax = 100000 //max timestep for obtaining steady-state electrical potential
analysis_interval = 200 //timestep checking steady-state convergence
tolerance = 1.0e-6 //stopping criterion for steady-state solution
}
Domain {
Filename = "cell_40x40x40.raw"
nproc = 1, 1, 1 // Number of processors (Npx,Npy,Npz)
n = 40, 40, 40 // Size of local domain (Nx,Ny,Nz)
N = 40, 40, 40 // size of the input image
voxel_length = 1.0 //resolution; user-input unit: [um]
BC = 0 // Boundary condition type
ReadType = "8bit"
ReadValues = 0, 1, 2
WriteValues = 0, 1, 2
}
Analysis {
analysis_interval = 100
subphase_analysis_interval = 50 // Frequency to perform analysis
restart_interval = 5000 // Frequency to write restart data
restart_file = "Restart" // Filename to use for restart file (will append rank)
N_threads = 4 // Number of threads to use
load_balance = "independent" // Load balance method to use: "none", "default", "independent"
}
Visualization {
save_electric_potential = true
save_concentration = true
save_velocity = true
}
Membrane {
MembraneLabels = 2
}


@@ -3,6 +3,8 @@
INSTALL_EXAMPLE( Bubble )
INSTALL_EXAMPLE( ConstrainedBubble )
INSTALL_EXAMPLE( Piston )
INSTALL_EXAMPLE( Droplet )
INSTALL_EXAMPLE( DropletCoalescence )
INSTALL_EXAMPLE( Plates )
INSTALL_EXAMPLE( SquareTube )
INSTALL_EXAMPLE( InkBottle )

File diff suppressed because one or more lines are too long


@@ -0,0 +1,20 @@
import numpy as np
import matplotlib.pylab as plt
# Fill the domain with label 1, then assign label 2 inside the droplet (radius 25, centred at k = 0)
D=np.ones((80,80,40),dtype="uint8")
cx = 40
cy = 40
cz = 0
for i in range(0,80):
    for j in range (0, 80):
        for k in range (0,40):
            dist = np.sqrt((i-cx)*(i-cx) + (j-cy)*(j-cy) + (k-cz)*(k-cz))
            if (dist < 25) :
                D[i,j,k] = 2
            # label 0 for the bottom four k-layers
            if k < 4 :
                D[i,j,k] = 0
D.tofile("droplet_40x80x80.raw")

14
example/Droplet/input.db Normal file

@@ -0,0 +1,14 @@
Domain {
Filename = "droplet_40x80x80.raw"
nproc = 1, 1, 1 // Number of processors (Npx,Npy,Npz)
n = 40, 80, 80 // Size of local domain (Nx,Ny,Nz)
N = 40, 80, 80 // size of the input image
voxel_length = 1.0
BC = 0 // Boundary condition type
ReadType = "8bit"
ReadValues = 0, 2, 1
WriteValues = 0, 2, 1
}
Visualization {
}
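
The generator script allocates the array as `(80,80,40)`, while the file name and `n = 40, 80, 80` list the 40-voxel extent first. The two agree if the raw file is read with x as the fastest-varying index; that ordering is an assumption about the reader made here for illustration, not something stated in this diff. The following sketch checks the layout with NumPy alone, using `tobytes()` in place of writing a file (it yields the same C-ordered bytes as `tofile`).

```python
import numpy as np

# Same shape and index order (i, j, k) as the generator script
D = np.ones((80, 80, 40), dtype="uint8")
D[:, :, :4] = 0  # label 0 for k < 4, as in the script

# C order: the last index (k) varies fastest in the byte stream
raw = D.tobytes()

# Assumption: the reader treats x as the fastest index, so (Nx, Ny, Nz) = (40, 80, 80)
Nx, Ny, Nz = 40, 80, 80
image = np.frombuffer(raw, dtype=np.uint8).reshape((Nz, Ny, Nx))

# Under this assumption the label-0 layers sit at the low end of x
zero_layers = np.unique(np.argwhere(image == 0)[:, 2])
print(image.shape, zero_layers)
```

Under this reading, the script's `k` index corresponds to x and the label-0 layers lie at x < 4.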


@@ -0,0 +1,117 @@
MultiphysController {
timestepMax = 20000
visualization_interval = 1000 // Frequency to write visualization data
analysis_interval = 20 // Frequency to perform analysis
}
Stokes {
epsilonR = 78.5 //fluid dielectric constant [dimensionless]
tau = 1.0
F = 0, 0, 0
rho_phys = 998.2
nu_phys = 1.003e-6 //fluid kinematic viscosity; user-input unit: [m^2/sec]
BC = 3 // Pressure constant BC
din = 1.0 // Inlet pressure
dout = 1.0 // Outlet pressure
UseElectroosmoticVelocityBC = true
SolidLabels = 0, -1
ZetaPotentialSolidList = -0.005, -0.03 // unit [V]
}
Ions {
temperature = 310.15 //unit [K]
//number_ion_species = 5 //number of ions
//tauList = 1.0, 1.0, 1.0, 1.0, 1.0 // H+, OH-, Na+, Cl-, Fe3+
//IonDiffusivityList = 9.3e-9, 5.3e-9, 1.3e-9, 2.0e-9, 0.604e-9 //user-input unit: [m^2/sec]
//IonValenceList = 1, -1, 1, -1, 3 //valence charge of ions; dimensionless; positive/negative integer
//IonConcentrationList = 1.0e-4, 1.0e-4, 100, 100, 0 //user-input unit: [mol/m^3]
number_ion_species = 2 //number of ions
//IonConcentrationFile = "Pseudo3D_plane_membrane_concentration_Na_z192_xy64.raw", "double", "Pseudo3D_plane_membrane_concentration_Na_z192_xy64.raw", "double"
tauList = 1.0,1.0 // Na+, anion
IonDiffusivityList = 1e-9,1e-9 //user-input unit: [m^2/sec]
IonValenceList = 1,-1 //valence charge of ions; dimensionless; positive/negative integer
IonConcentrationList = 145e-3,145e-3 //user-input unit: [mol/m^3]
MembraneIonConcentrationList = 15e-3, 15e-3
BC_InletList = 0,0 //boundary condition for inlet; 0=periodic; 1=ion concentration; 2=ion flux
BC_OutletList = 0,0 //boundary condition for outlet; 0=periodic; 1=ion concentration; 2=ion flux
InletValueList = 15e-3, 15e-3 //if ion concentration unit=[mol/m^3]; if flux (inward) unit=[mol/m^2/sec]
OutletValueList = 145e-3, 145e-3 //if ion concentration unit=[mol/m^3]; if flux (inward) unit=[mol/m^2/sec]
BC_Solid = 0 //solid boundary condition; 0=non-flux BC; 1=surface ion concentration
//SolidLabels = 0 //solid labels for assigning solid boundary condition; ONLY for BC_Solid=1
//SolidValues = 1.0e-5 // user-input surface ion concentration unit: [mol/m^2]; ONLY for BC_Solid=1
FluidVelDummy = 0.0, 0.0, 0.0 // dummy fluid velocity for debugging
}
Poisson {
epsilonR = 80.4 //fluid dielectric constant [dimensionless]
tau = 4.5
BC_Inlet = 0 // ->1: fixed electric potential; ->2: sine/cosine periodic electric potential
BC_Outlet = 0 // ->1: fixed electric potential; ->2: sine/cosine periodic electric potential
InitialValueLabels = 1,2//a list of labels of fluid nodes
InitialValues = 60.6e-3, 0 //unit: [V]
//------- Boundary Voltage for BC = 1 (Inlet & Outlet) ---------------------
Vin = 60.6e-3 //ONLY for BC_Inlet = 1; electrical potential at inlet
Vout = 0 //ONLY for BC_Outlet = 1; electrical potential at outlet
//--------------------------------------------------------------------------
//------- Boundary Voltage for BC = 2 (Inlet & Outlet) ---------------------
//Vin0 = 0.01 //(ONLY for BC_Inlet = 2); unit:[Volt]
//freqIn = 1.0 //(ONLY for BC_Inlet = 2); unit:[Hz]
//t0_In = 0.0 //(ONLY for BC_Inlet = 2); unit:[sec]
//Vin_Type = 1 //(ONLY for BC_Inlet = 2); 1->sin(); 2->cos()
//Vout0 = 0.01 //(ONLY for BC_Outlet = 2); unit:[Volt]
//freqOut = 1.0 //(ONLY for BC_Outlet = 2); unit:[Hz]
//t0_Out = 0.0 //(ONLY for BC_Outlet = 2); unit:[sec]
//Vout_Type = 1 //(ONLY for BC_Outlet = 2); 1->sin(); 2->cos()
//--------------------------------------------------------------------------
BC_SolidList = 1 //solid boundary condition; 1=surface potential; 2=surface charge density
SolidLabels = 0 //solid labels for assigning solid boundary condition
SolidValues = -0.001 //if surface potential, unit=[V]; if surface charge density, unit=[C/m^2]
WriteLog = true //write convergence log for LB-Poisson solver
// ------------------------------- Testing Utilities ----------------------------------------
// ONLY for code debugging; the following settings test sine/cosine voltage BCs; disabled by default
TestPeriodic = false
TestPeriodicTime = 1.0 //unit:[sec]
TestPeriodicTimeConv = 0.01 //unit:[sec]
TestPeriodicSaveInterval = 0.2 //unit:[sec]
//------------------------------ advanced setting ------------------------------------
timestepMax = 10000 //max timestep for obtaining steady-state electrical potential
analysis_interval = 200 //timestep checking steady-state convergence
tolerance = 1.0e-6 //stopping criterion for steady-state solution
}
Membrane {
MembraneLabels = 1
VoltageThreshold = 100.0, 100.0
MassFractionIn = 1,0
MassFractionOut = 1,0
ThresholdMassFractionIn = 1, 0
ThresholdMassFractionOut = 1, 0
}
Domain {
Filename = "Pseudo3D_double_plane_membrane_z192_xy64_InsideLabel1_OutsideLabel2.raw"
nproc = 1, 1, 3 // Number of processors (Npx,Npy,Npz)
n = 64, 64, 64 // Size of local domain (Nx,Ny,Nz)
N = 64, 64, 192 // size of the input image
voxel_length = 0.01 //resolution; user-input unit: [um]
BC = 0 // Boundary condition type
ReadType = "8bit"
ReadValues = 2, 1
WriteValues = 2, 1
//InletLayers = 0, 0, 1
//OutletLayers = 0, 0, 1
//InletLayersPhase = 1
//OutletLayersPhase = 1
//checkerSize = 3 // size of the checker to use
}
Analysis {
}
Visualization {
save_electric_potential = true
save_concentration = true
#save_velocity = true
#save_pressure = true
save_8bit_raw = true
}


@@ -0,0 +1,90 @@
import numpy as np
import math
import matplotlib.pyplot as plt
#physical constant
k_B_const = 1.380649e-23 #[J/K]
N_A_const = 6.02214076e23 #[1/mol]
e_const = 1.602176634e-19 #[C]
epsilon0_const = 8.85418782e-12 #[C/V/m]
#other material property parameters
epsilonr_water = 80.4
T=310.15 #[K]
#input ion concentration
C_Na_in = 15e-3 #[mol/m^3]
C_Na_out = 145e-3 #[mol/m^3]
C_K_in = 150e-3 #[mol/m^3]
C_K_out = 4e-3 #[mol/m^3]
C_Cl_in = 10e-3 #[mol/m^3]
C_Cl_out = 110e-3 #[mol/m^3]
#calculating Debye length
#For the definition of Debye length in electrolyte solution, see:
#DOI:10.1016/j.cnsns.2014.03.005
#Eq(42) in Yoshida et al., Coupled LB method for simulating electrokinetic flows
prefactor= math.sqrt(epsilonr_water*epsilon0_const*k_B_const*T/2.0/N_A_const/e_const**2)
debye_length_in = prefactor*np.sqrt(np.array([1.0/C_Na_in,1.0/C_K_in,1.0/C_Cl_in]))
debye_length_out = prefactor*np.sqrt(np.array([1.0/C_Na_out,1.0/C_K_out,1.0/C_Cl_out]))
print("Debye length inside membrane in [m]")
print(debye_length_in)
print("Debye length outside membrane in [m]")
print(debye_length_out)
#setup domain
cube_length_z = 192
cube_length_xy = 64
#set LBPM domain resolution
h=0.01 #[um]
print("Image resolution = %.6g [um] (= %.6g [m])"%(h,h*1e-6))
domain=2*np.ones((cube_length_z,cube_length_xy,cube_length_xy),dtype=np.int8)
zgrid,ygrid,xgrid=np.meshgrid(np.arange(cube_length_z),np.arange(cube_length_xy),np.arange(cube_length_xy),indexing='ij')
domain_centre=cube_length_xy/2
make_bubble = np.logical_and(zgrid>=cube_length_z/4,zgrid<=cube_length_z*0.75)
domain[make_bubble]=1
##save domain
file_name= "Pseudo3D_double_plane_membrane_z192_xy64_InsideLabel1_OutsideLabel2.raw"
domain.tofile(file_name)
print("save file: "+file_name)
#debug plot
#plt.figure(1)
#plt.pcolormesh(domain[:,int(domain_centre),:])
#plt.colorbar()
#plt.axis("equal")
#plt.show()
##generate initial ion concentration - 3D
#domain_Na = C_Na_out*np.ones_like(domain,dtype=np.float64)
#domain_Na[make_bubble] = C_Na_in
#domain_K = C_K_out*np.ones_like(domain,dtype=np.float64)
#domain_K[make_bubble] = C_K_in
#domain_Cl = C_Cl_out*np.ones_like(domain,dtype=np.float64)
#domain_Cl[make_bubble] = C_Cl_in
#
#domain_Na.tofile("Pseudo3D_plane_membrane_concentration_Na_z192_xy64.raw")
#domain_K.tofile("Pseudo3D_plane_membrane_concentration_K_z192_xy64.raw")
#domain_Cl.tofile("Pseudo3D_plane_membrane_concentration_Cl_z192_xy64.raw")
##debug plot
#plt.figure(2)
#plt.subplot(1,3,1)
#plt.title("Na concentration")
#plt.pcolormesh(domain_Na[:,int(domain_centre),:])
#plt.colorbar()
#plt.axis("equal")
#plt.subplot(1,3,2)
#plt.title("K concentration")
#plt.pcolormesh(domain_K[:,int(domain_centre),:])
#plt.colorbar()
#plt.axis("equal")
#plt.subplot(1,3,3)
#plt.title("Cl concentration")
#plt.pcolormesh(domain_Cl[:,int(domain_centre),:])
#plt.colorbar()
#plt.axis("equal")
#plt.show()


@@ -0,0 +1,86 @@
MultiphysController {
timestepMax = 25000
num_iter_Ion_List = 4
analysis_interval = 100
tolerance = 1.0e-9
visualization_interval = 1000 // Frequency to write visualization data
}
Stokes {
tau = 1.0
F = 0, 0, 0
ElectricField = 0, 0, 0 //body electric field; user-input unit: [V/m]
nu_phys = 0.889e-6 //fluid kinematic viscosity; user-input unit: [m^2/sec]
}
Ions {
MembraneIonConcentrationList = 150.0e-3, 10.0e-3, 15.0e-3, 155.0e-3 //user-input unit: [mol/m^3]
temperature = 293.15 //unit [K]
number_ion_species = 4 //number of ions
tauList = 1.0, 1.0, 1.0, 1.0
IonDiffusivityList = 1.0e-9, 1.0e-9, 1.0e-9, 1.0e-9 //user-input unit: [m^2/sec]
IonValenceList = 1, -1, 1, -1 //valence charge of ions; dimensionless; positive/negative integer
IonConcentrationList = 4.0e-3, 20.0e-3, 16.0e-3, 0.0e-3 //user-input unit: [mol/m^3]
BC_Solid = 0 //solid boundary condition; 0=non-flux BC; 1=surface ion concentration
//SolidLabels = 0 //solid labels for assigning solid boundary condition; ONLY for BC_Solid=1
//SolidValues = 1.0e-5 // user-input surface ion concentration unit: [mol/m^2]; ONLY for BC_Solid=1
FluidVelDummy = 0.0, 0.0, 0.0 // dummy fluid velocity for debugging
BC_InletList = 0, 0, 0, 0
BC_OutletList = 0, 0, 0, 0
}
Poisson {
lattice_scheme = "D3Q19"
epsilonR = 78.5 //fluid dielectric constant [dimensionless]
BC_Inlet = 0 // ->1: fixed electric potential; ->2: sine/cosine periodic electric potential
BC_Outlet = 0 // ->1: fixed electric potential; ->2: sine/cosine periodic electric potential
//--------------------------------------------------------------------------
//--------------------------------------------------------------------------
BC_Solid = 2 //solid boundary condition; 1=surface potential; 2=surface charge density
SolidLabels = 0 //solid labels for assigning solid boundary condition
SolidValues = 0 //if surface potential, unit=[V]; if surface charge density, unit=[C/m^2]
WriteLog = true //write convergence log for LB-Poisson solver
// ------------------------------- Testing Utilities ----------------------------------------
// ONLY for code debugging; the following settings test sine/cosine voltage BCs; disabled by default
TestPeriodic = false
TestPeriodicTime = 1.0 //unit:[sec]
TestPeriodicTimeConv = 0.01 //unit:[sec]
TestPeriodicSaveInterval = 0.2 //unit:[sec]
//------------------------------ advanced setting ------------------------------------
timestepMax = 4000 //max timestep for obtaining steady-state electrical potential
analysis_interval = 25 //timestep checking steady-state convergence
tolerance = 1.0e-10 //stopping criterion for steady-state solution
InitialValueLabels = 1, 2
InitialValues = 0.0, 0.0
}
Domain {
Filename = "Bacterium.swc"
nproc = 2, 1, 1 // Number of processors (Npx,Npy,Npz)
n = 64, 64, 64 // Size of local domain (Nx,Ny,Nz)
N = 128, 64, 64 // size of the input image
voxel_length = 0.01 //resolution; user-input unit: [um]
BC = 0 // Boundary condition type
ReadType = "swc"
ReadValues = 0, 1, 2
WriteValues = 0, 1, 2
}
Analysis {
analysis_interval = 100
subphase_analysis_interval = 50 // Frequency to perform analysis
restart_interval = 5000 // Frequency to write restart data
restart_file = "Restart" // Filename to use for restart file (will append rank)
N_threads = 4 // Number of threads to use
load_balance = "independent" // Load balance method to use: "none", "default", "independent"
}
Visualization {
save_electric_potential = true
save_concentration = true
save_velocity = false
}
Membrane {
MembraneLabels = 2
VoltageThreshold = 0.0, 0.0, 0.0, 0.0
MassFractionIn = 1e-1, 1.0, 5e-3, 0.0
MassFractionOut = 1e-1, 1.0, 5e-3, 0.0
ThresholdMassFractionIn = 1e-1, 1.0, 5e-3, 0.0
ThresholdMassFractionOut = 1e-1, 1.0, 5e-3, 0.0
}


@@ -0,0 +1,8 @@
# id,type,x,y,z,r,pid
1 1 0.30 0.32 0.32 0.15 -1
2 1 0.35 0.32 0.32 0.16 1
3 1 0.43 0.32 0.32 0.17 2
4 1 0.60 0.32 0.32 0.18 3
5 1 0.77 0.32 0.32 0.17 4
6 1 0.85 0.32 0.32 0.16 5
7 1 0.90 0.32 0.32 0.15 6
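
The header comment names the columns as `id,type,x,y,z,r,pid`. As a minimal sketch of how such a table can be inspected, the snippet below loads the seven rows with NumPy and follows the parent links to sum the centreline length. Treating `pid = -1` as the root node follows the usual SWC convention; the dictionary layout and variable names are chosen here for illustration, and the snippet makes no claim about how LBPM itself rasterizes the file.

```python
import io
import numpy as np

SWC_TEXT = """# id,type,x,y,z,r,pid
1 1 0.30 0.32 0.32 0.15 -1
2 1 0.35 0.32 0.32 0.16 1
3 1 0.43 0.32 0.32 0.17 2
4 1 0.60 0.32 0.32 0.18 3
5 1 0.77 0.32 0.32 0.17 4
6 1 0.85 0.32 0.32 0.16 5
7 1 0.90 0.32 0.32 0.15 6
"""

# Lines starting with '#' are skipped; each remaining row has seven columns
rows = np.loadtxt(io.StringIO(SWC_TEXT), comments="#")
nodes = {
    int(r[0]): {"xyz": r[2:5], "radius": r[5], "parent": int(r[6])}
    for r in rows
}

# Sum the distance from every non-root node to its parent
total_length = 0.0
for node in nodes.values():
    if node["parent"] == -1:  # root node has no parent segment
        continue
    parent = nodes[node["parent"]]
    total_length += float(np.linalg.norm(node["xyz"] - parent["xyz"]))

max_radius = max(n["radius"] for n in nodes.values())
print(f"{len(nodes)} nodes, centreline length {total_length:.2f}, max radius {max_radius:.2f}")
```

The coordinates are reported in whatever unit the file uses; that unit is not stated in this diff.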


@@ -0,0 +1,38 @@
import numpy as np
import matplotlib.pylab as plt
# Fill the domain with label 1, then assign label 2 inside the cell
D=np.ones((40,40,40),dtype="uint8")
cx = 20
cy = 20
cz = 20
radius = 8
for i in range(0, 40):
    for j in range (0, 40):
        for k in range (0,40):
            dist = np.sqrt((i-cx)*(i-cx) + (j-cy)*(j-cy) + (k-cz)*(k-cz))
            if (dist < radius ) :
                D[i,j,k] = 2
D.tofile("cell_40x40x40.raw")
C1=np.zeros((40,40,40),dtype="double")
C2=np.zeros((40,40,40),dtype="double")
for i in range(0, 40):
    for j in range (0, 40):
        for k in range (0,40):
            #outside the cell
            C1[i,j,k] = 125.0e-6 # Na
            C2[i,j,k] = 125.0e-6 # Cl
            dist = np.sqrt((i-cx)*(i-cx) + (j-cy)*(j-cy) + (k-cz)*(k-cz))
            # inside the cell
            if (dist < radius ) :
                C1[i,j,k] = 5.0e-6
                C2[i,j,k] = 5.0e-6
C1.tofile("cell_concentration_Na_40x40x40.raw")
C2.tofile("cell_concentration_Cl_40x40x40.raw")


@@ -0,0 +1,88 @@
MultiphysController {
timestepMax = 20
num_iter_Ion_List = 2
analysis_interval = 50
tolerance = 1.0e-9
visualization_interval = 100 // Frequency to write visualization data
analysis_interval = 50 // Frequency to perform analysis
}
Stokes {
tau = 1.0
F = 0, 0, 0
ElectricField = 0, 0, 0 //body electric field; user-input unit: [V/m]
nu_phys = 0.889e-6 //fluid kinematic viscosity; user-input unit: [m^2/sec]
}
Ions {
use_membrane = true
Restart = false
IonConcentrationFile = "cell_concentration_Na_40x40x40.raw", "double", "cell_concentration_Cl_40x40x40.raw", "double"
temperature = 293.15 //unit [K]
number_ion_species = 2 //number of ions
tauList = 1.0, 1.0
IonDiffusivityList = 1.0e-9, 1.0e-9 //user-input unit: [m^2/sec]
IonValenceList = 1, -1 //valence charge of ions; dimensionless; positive/negative integer
IonConcentrationList = 1.0e-6, 1.0e-6 //user-input unit: [mol/m^3]
BC_Solid = 0 //solid boundary condition; 0=non-flux BC; 1=surface ion concentration
//SolidLabels = 0 //solid labels for assigning solid boundary condition; ONLY for BC_Solid=1
//SolidValues = 1.0e-5 // user-input surface ion concentration unit: [mol/m^2]; ONLY for BC_Solid=1
FluidVelDummy = 0.0, 0.0, 0.0 // dummy fluid velocity for debugging
}
Poisson {
lattice_scheme = "D3Q19"
Restart = false
epsilonR = 78.5 //fluid dielectric constant [dimensionless]
BC_Inlet = 0 // ->1: fixed electric potential; ->2: sine/cosine periodic electric potential
BC_Outlet = 0 // ->1: fixed electric potential; ->2: sine/cosine periodic electric potential
//--------------------------------------------------------------------------
//--------------------------------------------------------------------------
BC_Solid = 2 //solid boundary condition; 1=surface potential; 2=surface charge density
SolidLabels = 0 //solid labels for assigning solid boundary condition
SolidValues = 0 //if surface potential, unit=[V]; if surface charge density, unit=[C/m^2]
WriteLog = true //write convergence log for LB-Poisson solver
// ------------------------------- Testing Utilities ----------------------------------------
// ONLY for code debugging; the following settings test sine/cosine voltage BCs; disabled by default
TestPeriodic = false
TestPeriodicTime = 1.0 //unit:[sec]
TestPeriodicTimeConv = 0.01 //unit:[sec]
TestPeriodicSaveInterval = 0.2 //unit:[sec]
//------------------------------ advanced setting ------------------------------------
timestepMax = 100000 //max timestep for obtaining steady-state electrical potential
analysis_interval = 200 //timestep checking steady-state convergence
tolerance = 1.0e-6 //stopping criterion for steady-state solution
InitialValueLabels = 1, 2
InitialValues = 0.0, 0.0
}
Domain {
Filename = "cell_40x40x40.raw"
nproc = 1, 1, 1 // Number of processors (Npx,Npy,Npz)
n = 40, 40, 40 // Size of local domain (Nx,Ny,Nz)
N = 40, 40, 40 // size of the input image
voxel_length = 1.0 //resolution; user-input unit: [um]
BC = 0 // Boundary condition type
ReadType = "8bit"
ReadValues = 0, 1, 2
WriteValues = 0, 1, 2
}
Analysis {
analysis_interval = 100
subphase_analysis_interval = 50 // Frequency to perform analysis
restart_interval = 5000 // Frequency to write restart data
restart_file = "Restart" // Filename to use for restart file (will append rank)
N_threads = 4 // Number of threads to use
load_balance = "independent" // Load balance method to use: "none", "default", "independent"
}
Visualization {
save_electric_potential = true
save_concentration = true
save_velocity = true
}
Membrane {
MembraneLabels = 2
VoltageThreshold = 0.0, 0.0
MassFractionIn = 0e-2, 5e-2
MassFractionOut = 0e-2, 5e-2
ThresholdMassFractionIn = 0e-2, 5e-2
ThresholdMassFractionOut = 0e-2, 5e-2
}


@@ -0,0 +1,36 @@
#!/bin/bash
#SBATCH -A CSC380
#SBATCH -J Color-dense
#SBATCH -o %x-%j.out
#SBATCH -t 0:10:00
#SBATCH -p batch
#SBATCH -N 1
#SBATCH --exclusive
# MODULE ENVIRONMENT
module load PrgEnv-amd
module load rocm/4.5.0
module load cray-mpich
module load cray-hdf5-parallel
#module load craype-accel-amd-gfx908
## These must be set before compiling so the executable picks up GTL
export PE_MPICH_GTL_DIR_amd_gfx90a="-L${CRAY_MPICH_ROOTDIR}/gtl/lib"
export PE_MPICH_GTL_LIBS_amd_gfx90a="-lmpi_gtl_hsa"
export MPICH_GPU_SUPPORT_ENABLED=1
#export MPL_MBX_SIZE=1024000000
export LD_LIBRARY_PATH=${CRAY_LD_LIBRARY_PATH}:${LD_LIBRARY_PATH}
export LBPM_BIN=/ccs/proj/csc380/mcclurej/crusher/LBPM/bin
echo "Running Color LBM"
MYCPUBIND="--cpu-bind=verbose,map_cpu:57,33,25,1,9,17,41,49"
srun --verbose -N1 -n8 --cpus-per-gpu=8 --gpus-per-task=1 --gpu-bind=closest ${MYCPUBIND} $LBPM_BIN/lbpm_color_simulator input.db
#srun --verbose -N1 -n2 --mem-per-gpu=8g --cpus-per-gpu=1 --gpus-per-node=2 --gpu-bind=closest $LBPM_BIN/lbpm_permeability_simulator input.db
exit;


@@ -0,0 +1,36 @@
#!/bin/bash
#SBATCH -A CSC380
#SBATCH -J MRT-a2
#SBATCH -o %x-%j.out
#SBATCH -t 0:10:00
#SBATCH -p batch
#SBATCH -N 1
#SBATCH --exclusive
# MODULE ENVIRONMENT
module load PrgEnv-amd
module load rocm/4.5.0
module load cray-mpich
module load cray-hdf5-parallel
#module load craype-accel-amd-gfx908
## These must be set before compiling so the executable picks up GTL
export PE_MPICH_GTL_DIR_amd_gfx90a="-L${CRAY_MPICH_ROOTDIR}/gtl/lib"
export PE_MPICH_GTL_LIBS_amd_gfx90a="-lmpi_gtl_hsa"
export MPICH_GPU_SUPPORT_ENABLED=1
#export MPL_MBX_SIZE=1024000000
export LD_LIBRARY_PATH=${CRAY_LD_LIBRARY_PATH}:${LD_LIBRARY_PATH}
export LBPM_BIN=/ccs/proj/csc380/mcclurej/crusher/LBPM/bin
echo "Running MRT LBM"
MYCPUBIND="--cpu-bind=verbose,map_cpu:57,33,25,1,9,17,41,49"
srun --verbose -N1 -n8 --cpus-per-gpu=8 --gpus-per-task=1 --gpu-bind=closest ${MYCPUBIND} $LBPM_BIN/lbpm_permeability_simulator input.db
#srun --verbose -N1 -n2 --mem-per-gpu=8g --cpus-per-gpu=1 --gpus-per-node=2 --gpu-bind=closest $LBPM_BIN/lbpm_permeability_simulator input.db
exit;

View File

@@ -0,0 +1,69 @@
MRT {
timestepMax = 10000
analysis_interval = 20000
tau = 0.7
F = 0, 0, 5.0e-5
Restart = false
din = 1.0
dout = 1.0
flux = 0.0
}
Color {
tauA = 0.7;
tauB = 0.7;
rhoA = 1.0;
rhoB = 1.0;
alpha = 1e-2;
beta = 0.95;
F = 0, 0, 1.0e-5
Restart = false
flux = 0.0 // voxels per timestep
timestepMax = 10000
// rescale_force_after_timestep = 100000
ComponentLabels = 0, -1, -2
ComponentAffinity = -1.0, -1.0, -0.9
// protocol = "image sequence"
// capillary_number = 1e-5
}
Domain {
Filename = "a2_2048x2048x8192.raw"
nproc = 2, 2, 2 // Number of processors (Npx,Npy,Npz)
offset = 0, 0, 0
n = 382, 382, 382 // Size of local domain (Nx,Ny,Nz)
N = 2048, 2048, 1024 // size of the input image
voxel_length = 1.0 // Physical length of one voxel
BC = 0 // Boundary condition type
//Sw = 0.2
ReadType = "8bit"
ReadValues = 0, 1, 2, -1, -2
WriteValues = 0, 1, 2, -1, -2
ComponentLabels = 0, -1, -2
InletLayers = 0, 0, 5
OutletLayers = 0, 0, 5
}
Analysis {
visualization_interval = 1000000
//morph_interval = 100000
//morph_delta = -0.08
analysis_interval = 20000 // Frequency to perform analysis
min_steady_timesteps = 15000000
max_steady_timesteps = 15000000
restart_interval = 500000 // Frequency to write restart data
restart_file = "Restart" // Filename to use for restart file (will append rank)
N_threads = 0 // Number of threads to use
load_balance = "default" // Load balance method to use: "none", "default", "independent"
}
Visualization {
save_8bit_raw = true
write_silo = true
}
FlowAdaptor {
}

View File

@@ -0,0 +1,36 @@
#!/bin/bash
#SBATCH -A CSC380
#SBATCH -J MPI-multinode
#SBATCH -o %x-%j.out
#SBATCH -t 6:00:00
#SBATCH -p batch
#SBATCH -N 8
#SBATCH --exclusive
# MODULE ENVIRONMENT
module load PrgEnv-amd
module load rocm/4.5.0
module load cray-mpich
module load cray-hdf5-parallel
#module load craype-accel-amd-gfx908
## These must be set before compiling so the executable picks up GTL
export PE_MPICH_GTL_DIR_amd_gfx90a="-L${CRAY_MPICH_ROOTDIR}/gtl/lib"
export PE_MPICH_GTL_LIBS_amd_gfx90a="-lmpi_gtl_hsa"
export MPICH_GPU_SUPPORT_ENABLED=1
#export MPL_MBX_SIZE=1024000000
export LD_LIBRARY_PATH=${CRAY_LD_LIBRARY_PATH}:${LD_LIBRARY_PATH}
export LBPM_BIN=/ccs/proj/csc380/mcclurej/crusher/LBPM/tests
echo "Running Color LBM"
MYCPUBIND="--cpu-bind=verbose,map_cpu:57"
srun --verbose -N8 -n8 --cpus-per-gpu=8 --gpus-per-task=1 --gpu-bind=closest ${MYCPUBIND} $LBPM_BIN/TestCommD3Q19 multinode.db
#srun --verbose -N1 -n2 --mem-per-gpu=8g --cpus-per-gpu=1 --gpus-per-node=2 --gpu-bind=closest $LBPM_BIN/lbpm_permeability_simulator input.db
exit;

View File

@@ -0,0 +1,36 @@
#!/bin/bash
#SBATCH -A CSC380
#SBATCH -J MPI-singlenode
#SBATCH -o %x-%j.out
#SBATCH -t 0:10:00
#SBATCH -p batch
#SBATCH -N 1
#SBATCH --exclusive
# MODULE ENVIRONMENT
module load PrgEnv-amd
module load rocm/4.5.0
module load cray-mpich
module load cray-hdf5-parallel
#module load craype-accel-amd-gfx908
## These must be set before compiling so the executable picks up GTL
export PE_MPICH_GTL_DIR_amd_gfx90a="-L${CRAY_MPICH_ROOTDIR}/gtl/lib"
export PE_MPICH_GTL_LIBS_amd_gfx90a="-lmpi_gtl_hsa"
export MPICH_GPU_SUPPORT_ENABLED=1
#export MPL_MBX_SIZE=1024000000
export LD_LIBRARY_PATH=${CRAY_LD_LIBRARY_PATH}:${LD_LIBRARY_PATH}
export LBPM_BIN=/ccs/proj/csc380/mcclurej/crusher/LBPM/tests
echo "Running Color LBM"
MYCPUBIND="--cpu-bind=verbose,map_cpu:57,33,25,1,9,17,41,49"
srun --verbose -N1 -n8 --cpus-per-gpu=8 --gpus-per-task=1 --gpu-bind=closest ${MYCPUBIND} $LBPM_BIN/TestCommD3Q19 multinode.db
#srun --verbose -N1 -n2 --mem-per-gpu=8g --cpus-per-gpu=1 --gpus-per-node=2 --gpu-bind=closest $LBPM_BIN/lbpm_permeability_simulator input.db
exit;

View File

@@ -0,0 +1,9 @@
import numpy as np
N = 1024
data = np.random.randint(low=1,high=3,size=(N,N,N),dtype=np.uint8)
data.tofile("dense_1024x1024x1024.raw")
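A scaled-down check of the generator above, as a minimal sketch. The edge length `N_small = 16` and the temporary file name are assumptions made for illustration and are not part of the original script. It confirms that `randint(low=1, high=3)` only produces the labels 1 and 2 (the upper bound is exclusive) and that the raw file holds exactly one byte per voxel.

```python
import os
import tempfile

import numpy as np

N_small = 16  # hypothetical small edge length; the original script uses N = 1024

# `high` is exclusive, so labels are drawn from {1, 2}
data = np.random.randint(low=1, high=3, size=(N_small, N_small, N_small), dtype=np.uint8)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "dense_small.raw")  # hypothetical file name
    data.tofile(path)
    file_bytes = os.path.getsize(path)
    # Read the raw bytes back and restore the 3D shape
    reread = np.fromfile(path, dtype=np.uint8).reshape(data.shape)

labels = sorted(np.unique(data).tolist())
print(labels, file_bytes)
```

At the original size of N = 1024 the same reasoning gives a file of 1024³ bytes, which matches the `N = 1024, 1024, 1024` and `ReadType = "8bit"` settings used with `dense_1024x1024x1024.raw`.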

View File

@@ -0,0 +1,69 @@
MRT {
timestepMax = 100
analysis_interval = 20000
tau = 0.7
F = 0, 0, 5.0e-5
Restart = false
din = 1.0
dout = 1.0
flux = 0.0
}
Color {
tauA = 0.7;
tauB = 0.7;
rhoA = 1.0;
rhoB = 1.0;
alpha = 1e-2;
beta = 0.95;
F = 0, 0, 0.0
Restart = false
flux = 0.0 // voxels per timestep
timestepMax = 10
// rescale_force_after_timestep = 100000
ComponentLabels = 0, -1, -2
ComponentAffinity = -1.0, -1.0, -0.9
// protocol = "image sequence"
// capillary_number = 1e-5
}
Domain {
Filename = "dense_1024x1024x1024.raw"
nproc = 2, 2, 2 // Number of processors (Npx,Npy,Npz)
offset = 0, 0, 0
n = 222, 222, 222 // Size of local domain (Nx,Ny,Nz)
N = 1024, 1024, 1024 // size of the input image
voxel_length = 1.0 // Physical length of one voxel
BC = 0 // Boundary condition type
//Sw = 0.2
ReadType = "8bit"
ReadValues = 0, 1, 2, -1, -2
WriteValues = 0, 1, 2, -1, -2
ComponentLabels = 0, -1, -2
InletLayers = 0, 0, 5
OutletLayers = 0, 0, 5
}
Analysis {
visualization_interval = 1000000
//morph_interval = 100000
//morph_delta = -0.08
analysis_interval = 20000 // Frequency to perform analysis
min_steady_timesteps = 15000000
max_steady_timesteps = 15000000
restart_interval = 500000 // Frequency to write restart data
restart_file = "Restart" // Filename to use for restart file (will append rank)
N_threads = 0 // Number of threads to use
load_balance = "default" // Load balance method to use: "none", "default", "independent"
}
Visualization {
save_8bit_raw = true
write_silo = true
}
FlowAdaptor {
}
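The Domain settings above can be sanity-checked before submitting a job. The sketch below is illustrative only: it assumes that the simulated region spans `offset + nproc * n` voxels along each axis and has to fit inside the input image `N`, which is a reading of the parameter comments rather than something this diff states. The variable names mirror the database keys; the helper logic is hypothetical.

```python
from math import prod

# Values copied from the Domain block above
nproc = (2, 2, 2)        # process grid (Npx, Npy, Npz)
n = (222, 222, 222)      # local subdomain size per rank (Nx, Ny, Nz)
N = (1024, 1024, 1024)   # size of the input image
offset = (0, 0, 0)

# One MPI rank per subdomain
ranks = prod(nproc)

# Assumed extent of the simulated region along each axis
extent = tuple(o + p * s for o, p, s in zip(offset, nproc, n))
fits = all(e <= full for e, full in zip(extent, N))

# ReadType = "8bit" means one byte per voxel in the raw image
image_bytes = prod(N)

print(ranks, extent, fits, image_bytes)
```

With these values the rank count is 8, which agrees with the `-n8` passed to `srun` in the job scripts.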

View File

@@ -0,0 +1,14 @@
#!/bin/bash
#SBATCH -A CSC380
#SBATCH -J sphere_test
#SBATCH -o %x-%j.out
#SBATCH -t 00:05:00
#SBATCH -p caar
#SBATCH -N 1
module load rocm/4.2.0
export LBPM_DIR=/ccs/proj/csc380/mcclurej/spock/install/lbpm/tests
srun -n1 --ntasks-per-node=1 $LBPM_DIR/GenerateSphereTest input.db

View File

@@ -0,0 +1,17 @@
#!/bin/bash
#SBATCH -A CSC380
#SBATCH -J sphere_test
#SBATCH -o %x-%j.out
#SBATCH -t 00:05:00
#SBATCH -p caar
#SBATCH -N 1
module load rocm/4.2.0
export LBPM_DIR=/ccs/proj/csc380/mcclurej/spock/install/lbpm/tests
export MPICH_SMP_SINGLE_COPY_MODE=CMA
#srun -n1 --ntasks-per-node=1 --accel-bind=g --gpus-per-task=1 $LBPM_DIR/lbpm_color_simulator spheres322.db
srun -n1 --ntasks-per-node=1 --accel-bind=g --gpus-per-task=1 $LBPM_DIR/TestCommD3Q19 spheres322.db

View File

@@ -0,0 +1,32 @@
#!/bin/bash
#SBATCH -A CSC380
#SBATCH -J sphere_test
#SBATCH -o %x-%j.out
#SBATCH -e %x-%j.err
#SBATCH -t 00:05:00
#SBATCH -p caar
#SBATCH -N 1
module load craype-accel-amd-gfx908
module load PrgEnv-cray
#module load rocm
module load rocm/4.2.0
export LBPM_DIR=/ccs/proj/csc380/mcclurej/spock/install/lbpm/tests
#export MPICH_RDMA_ENABLED_CUDA=1
#export MPICH_ENV_DISPLAY=1
#export MPICH_GPU_SUPPORT_ENABLED=1
export MPICH_GPU_NO_ASYNC_MEMCPY=0
export MPICH_SMP_SINGLE_COPY_MODE=CMA
#export MPICH_DBG_FILENAME="./mpich-dbg.log"
export MPICH_DBG_CLASS=ALL
export MPICH_DBG_LEVEL=VERBOSE
export MPICH_DBG=yes
#export PMI_DEBUG=1
export MPIR_CVAR_GPU_EAGER_DEVICE_MEM=0
export MPICH_GPU_SUPPORT_ENABLED=1
#srun -n1 --ntasks-per-node=1 --accel-bind=g --gpus-per-task=1 $LBPM_DIR/lbpm_color_simulator spheres322.db
srun -n1 --ntasks-per-node=1 --accel-bind=g --gpus-per-task=1 --verbose --export=ALL $LBPM_DIR/TestCommD3Q19 test.db

View File

@@ -0,0 +1,54 @@
MRT {
tau = 1.0 // relaxation time
F = 0, 0, 1e-4 // external body force applied to system
timestepMax = 1000 // max number of timesteps
din = 1.0
dout = 1.0
Restart = false
flux = 0.0
}
Color {
tauA = 0.7; // relaxation time for fluid A
tauB = 0.7; // relaxation time for fluid B
rhoA = 1.0; // mass density for fluid A
rhoB = 1.0; // mass density for fluid B
alpha = 1e-3; // controls interfacial tension between fluids
beta = 0.95; // controls interface width
F = 0, 0, 1.0e-5 // external body force applied to the system
Restart = false // restart from checkpoint file?
din = 1.0 // density at inlet (if external BC is applied)
dout = 1.0 // density at outlet (if external BC is applied)
timestepMax = 10 // maximum number of timesteps to simulate
flux = 0.0 // volumetric flux in voxels per timestep (if flux BC is applied)
ComponentLabels = 0 // comma separated list of solid mineral labels
ComponentAffinity = -1.0 // comma separated list of phase indicator field values to assign for each mineral label
}
Domain {
nproc = 1, 1, 1 // Number of processors (Npx,Npy,Npz)
n = 318, 320, 320 // Size of local domain (Nx,Ny,Nz)
N = 320, 320, 320
nspheres = 1896 // Number of spheres (only needed if using a sphere packing)
L = 1, 1, 1 // Length of domain (x,y,z)
BC = 0 // Boundary condition type
// BC = 0 for periodic BC
// BC = 1 for pressure BC (applied in z direction)
// BC = 4 for flux BC (applied in z direction)
ReadType = "8bit"
ReadValues = 0, 1, 2 // list of labels within the binary file (read)
WriteValues = 0, 1, 2 // list of labels within the output files (write)
}
Analysis {
analysis_interval = 1000 // Frequency to perform analysis
restart_interval = 50000 // Frequency to write restart data
visualization_interval = 50000 // Frequency to write visualization data
restart_file = "Restart" // Filename to use for restart file (will append rank)
N_threads = 4 // Number of threads to use
load_balance = "independent" // Load balance method to use: "none", "default", "independent"
}
Visualization {
}

View File

@@ -0,0 +1,54 @@
MRT {
tau = 1.0 // relaxation time
F = 0, 0, 1e-4 // external body force applied to system
timestepMax = 1000 // max number of timesteps
din = 1.0
dout = 1.0
Restart = false
flux = 0.0
}
Color {
tauA = 0.7; // relaxation time for fluid A
tauB = 0.7; // relaxation time for fluid B
rhoA = 1.0; // mass density for fluid A
rhoB = 1.0; // mass density for fluid B
alpha = 1e-3; // controls interfacial tension between fluids
beta = 0.95; // controls interface width
F = 0, 0, 1.0e-5 // external body force applied to the system
Restart = false // restart from checkpoint file?
din = 1.0 // density at inlet (if external BC is applied)
dout = 1.0 // density at outlet (if external BC is applied)
timestepMax = 10 // maximum number of timesteps to simulate
flux = 0.0 // volumetric flux in voxels per timestep (if flux BC is applied)
ComponentLabels = 0 // comma separated list of solid mineral labels
ComponentAffinity = -1.0 // comma separated list of phase indicator field values to assign for each mineral label
}
Domain {
nproc = 1, 1, 1 // Number of processors (Npx,Npy,Npz)
n = 240, 240, 240 // Size of local domain (Nx,Ny,Nz)
N = 320, 320, 320
nspheres = 1896 // Number of spheres (only needed if using a sphere packing)
L = 1, 1, 1 // Length of domain (x,y,z)
BC = 0 // Boundary condition type
// BC = 0 for periodic BC
// BC = 1 for pressure BC (applied in z direction)
// BC = 4 for flux BC (applied in z direction)
ReadType = "8bit"
ReadValues = 0, 1, 2 // list of labels within the binary file (read)
WriteValues = 0, 1, 2 // list of labels within the output files (write)
}
Analysis {
analysis_interval = 1000 // Frequency to perform analysis
restart_interval = 50000 // Frequency to write restart data
visualization_interval = 50000 // Frequency to write visualization data
restart_file = "Restart" // Filename to use for restart file (will append rank)
N_threads = 4 // Number of threads to use
load_balance = "independent" // Load balance method to use: "none", "default", "independent"
}
Visualization {
}

View File

@@ -18,9 +18,11 @@
#include "hip/hip_runtime.h"
#define NBLOCKS 1024
#define NTHREADS 256
#define NTHREADS 512
__global__ void dvc_ScaLBL_D3Q19_AAeven_BGK(double *dist, int start, int finish, int Np, double rlx, double Fx, double Fy, double Fz){
__global__ void
__launch_bounds__(512,1) dvc_ScaLBL_D3Q19_AAeven_BGK(double *dist, int start, int finish, int Np, double rlx, double Fx, double Fy, double Fz){
int n;
// conserved moments
double rho,ux,uy,uz,uu;
@@ -138,7 +140,8 @@ __global__ void dvc_ScaLBL_D3Q19_AAeven_BGK(double *dist, int start, int finish,
}
}
__global__ void dvc_ScaLBL_D3Q19_AAodd_BGK(int *neighborList, double *dist, int start, int finish, int Np, double rlx, double Fx, double Fy, double Fz){
__global__ void
__launch_bounds__(512,1) dvc_ScaLBL_D3Q19_AAodd_BGK(int *neighborList, double *dist, int start, int finish, int Np, double rlx, double Fx, double Fy, double Fz){
int n;
// conserved moments
double rho,ux,uy,uz,uu;

View File

@@ -1,10 +0,0 @@
SET( HIP_SEPERABLE_COMPILATION ON )
FILE( GLOB HIP_SOURCES "*.cu" )
SET_SOURCE_FILES_PROPERTIES( ${HIP_SOURCES} PROPERTIES HIP_SOURCE_PROPERTY_FORMAT 1 )
HIP_ADD_LIBRARY( lbpm-hip ${HIP_SOURCES} SHARED HIPCC_OPTIONS ${HIP_HIPCC_OPTIONS} HCC_OPTIONS ${HIP_HCC_OPTIONS} NVCC_OPTIONS ${HIP_NVCC_OPTIONS} ${HIP_NVCC_FLAGS} )
#TARGET_LINK_LIBRARIES( lbpm-hip /opt/rocm-3.3.0/lib/libhip_hcc.so )
#TARGET_LINK_LIBRARIES( lbpm-wia lbpm-hip )
#ADD_DEPENDENCIES( lbpm-hip copy-include )

View File

@@ -21,6 +21,21 @@
#define NBLOCKS 1024
#define NTHREADS 256
__device__ __constant__ double mrt_V1=0.05263157894736842;
__device__ __constant__ double mrt_V2=0.012531328320802;
__device__ __constant__ double mrt_V3=0.04761904761904762;
__device__ __constant__ double mrt_V4=0.004594820384294068;
__device__ __constant__ double mrt_V5=0.01587301587301587;
__device__ __constant__ double mrt_V6=0.0555555555555555555555555;
__device__ __constant__ double mrt_V7=0.02777777777777778;
__device__ __constant__ double mrt_V8=0.08333333333333333;
__device__ __constant__ double mrt_V9=0.003341687552213868;
__device__ __constant__ double mrt_V10=0.003968253968253968;
__device__ __constant__ double mrt_V11=0.01388888888888889;
__device__ __constant__ double mrt_V12=0.04166666666666666;
__global__ void dvc_ScaLBL_Color_Init(char *ID, double *Den, double *Phi, double das, double dbs, int Nx, int Ny, int Nz)
{
//int i,j,k;
@@ -541,7 +556,7 @@ __global__ void dvc_ColorCollide( char *ID, double *disteven, double *distodd,
}
__global__ void
__launch_bounds__(512,2)
__launch_bounds__(256,1)
dvc_ScaLBL_D3Q19_ColorCollide( char *ID, double *disteven, double *distodd, double *phi, double *ColorGrad,
double *Velocity, int Nx, int Ny, int Nz, double rlx_setA, double rlx_setB,
double alpha, double beta, double Fx, double Fy, double Fz)
@@ -1257,7 +1272,8 @@ __global__ void dvc_ScaLBL_SetSlice_z(double *Phi, double value, int Nx, int Ny
__global__ void dvc_ScaLBL_D3Q19_AAeven_Color(int *Map, double *dist, double *Aq, double *Bq, double *Den, double *Phi,
__global__ void
__launch_bounds__(256,1) dvc_ScaLBL_D3Q19_AAeven_Color(int *Map, double *dist, double *Aq, double *Bq, double *Den, double *Phi,
double *Velocity, double rhoA, double rhoB, double tauA, double tauB, double alpha, double beta,
double Fx, double Fy, double Fz, int strideY, int strideZ, int start, int finish, int Np){
int ijk,nn,n;
@@ -1273,19 +1289,6 @@ __global__ void dvc_ScaLBL_D3Q19_AAeven_Color(int *Map, double *dist, double *A
double ux,uy,uz;
double phi,tau,rho0,rlx_setA,rlx_setB;
const double mrt_V1=0.05263157894736842;
const double mrt_V2=0.012531328320802;
const double mrt_V3=0.04761904761904762;
const double mrt_V4=0.004594820384294068;
const double mrt_V5=0.01587301587301587;
const double mrt_V6=0.0555555555555555555555555;
const double mrt_V7=0.02777777777777778;
const double mrt_V8=0.08333333333333333;
const double mrt_V9=0.003341687552213868;
const double mrt_V10=0.003968253968253968;
const double mrt_V11=0.01388888888888889;
const double mrt_V12=0.04166666666666666;
int S = Np/NBLOCKS/NTHREADS + 1;
for (int s=0; s<S; s++){
//........Get 1-D index for this thread....................
@@ -1295,9 +1298,10 @@ __global__ void dvc_ScaLBL_D3Q19_AAeven_Color(int *Map, double *dist, double *A
// read the component number densities
nA = Den[n];
nB = Den[Np + n];
nAB = 1.0/(nA+nB);
// compute phase indicator field
phi=(nA-nB)/(nA+nB);
phi=(nA-nB)*nAB;
// local density
rho0=rhoA + 0.5*(1.0-phi)*(rhoB-rhoA);
@@ -1372,11 +1376,11 @@ __global__ void dvc_ScaLBL_D3Q19_AAeven_Color(int *Map, double *dist, double *A
//...........Normalize the Color Gradient.................................
C = sqrt(nx*nx+ny*ny+nz*nz);
double ColorMag = C;
if (C==0.0) ColorMag=1.0;
nx = nx/ColorMag;
ny = ny/ColorMag;
nz = nz/ColorMag;
double iColorMag = 1.0/C;
if (C==0.0) iColorMag=1.0;
nx = nx*iColorMag;
ny = ny*iColorMag;
nz = nz*iColorMag;
// q=0
fq = dist[n];
@@ -1651,19 +1655,20 @@ __global__ void dvc_ScaLBL_D3Q19_AAeven_Color(int *Map, double *dist, double *A
//........................................................................
//..............carry out relaxation process..............................
//..........Toelke, Freudiger et al. 2006.................................
double irho0 = 1.0/rho0;
if (C == 0.0) nx = ny = nz = 0.0;
m1 = m1 + rlx_setA*((19*(jx*jx+jy*jy+jz*jz)/rho0 - 11*rho) -19*alpha*C - m1);
m2 = m2 + rlx_setA*((3*rho - 5.5*(jx*jx+jy*jy+jz*jz)/rho0)- m2);
m1 = m1 + rlx_setA*((19*(jx*jx+jy*jy+jz*jz)*irho0 - 11*rho) -19*alpha*C - m1);
m2 = m2 + rlx_setA*((3*rho - 5.5*(jx*jx+jy*jy+jz*jz)*irho0)- m2);
m4 = m4 + rlx_setB*((-0.6666666666666666*jx)- m4);
m6 = m6 + rlx_setB*((-0.6666666666666666*jy)- m6);
m8 = m8 + rlx_setB*((-0.6666666666666666*jz)- m8);
m9 = m9 + rlx_setA*(((2*jx*jx-jy*jy-jz*jz)/rho0) + 0.5*alpha*C*(2*nx*nx-ny*ny-nz*nz) - m9);
m9 = m9 + rlx_setA*(((2*jx*jx-jy*jy-jz*jz)*irho0) + 0.5*alpha*C*(2*nx*nx-ny*ny-nz*nz) - m9);
m10 = m10 + rlx_setA*( - m10);
m11 = m11 + rlx_setA*(((jy*jy-jz*jz)/rho0) + 0.5*alpha*C*(ny*ny-nz*nz)- m11);
m11 = m11 + rlx_setA*(((jy*jy-jz*jz)*irho0) + 0.5*alpha*C*(ny*ny-nz*nz)- m11);
m12 = m12 + rlx_setA*( - m12);
m13 = m13 + rlx_setA*( (jx*jy/rho0) + 0.5*alpha*C*nx*ny - m13);
m14 = m14 + rlx_setA*( (jy*jz/rho0) + 0.5*alpha*C*ny*nz - m14);
m15 = m15 + rlx_setA*( (jx*jz/rho0) + 0.5*alpha*C*nx*nz - m15);
m13 = m13 + rlx_setA*( (jx*jy*irho0) + 0.5*alpha*C*nx*ny - m13);
m14 = m14 + rlx_setA*( (jy*jz*irho0) + 0.5*alpha*C*ny*nz - m14);
m15 = m15 + rlx_setA*( (jx*jz*irho0) + 0.5*alpha*C*nx*nz - m15);
m16 = m16 + rlx_setB*( - m16);
m17 = m17 + rlx_setB*( - m17);
m18 = m18 + rlx_setB*( - m18);
@@ -1776,9 +1781,9 @@ __global__ void dvc_ScaLBL_D3Q19_AAeven_Color(int *Map, double *dist, double *A
//........................................................................
// write the velocity
ux = jx / rho0;
uy = jy / rho0;
uz = jz / rho0;
ux = jx*irho0;
uy = jy*irho0;
uz = jz*irho0;
Velocity[n] = ux;
Velocity[Np+n] = uy;
Velocity[2*Np+n] = uz;
@@ -1786,7 +1791,6 @@ __global__ void dvc_ScaLBL_D3Q19_AAeven_Color(int *Map, double *dist, double *A
// Instantiate mass transport distributions
// Stationary value - distribution 0
nAB = 1.0/(nA+nB);
Aq[n] = 0.3333333333333333*nA;
Bq[n] = 0.3333333333333333*nB;
@@ -1839,8 +1843,8 @@ __global__ void dvc_ScaLBL_D3Q19_AAeven_Color(int *Map, double *dist, double *A
}
}
__global__ void dvc_ScaLBL_D3Q19_AAodd_Color(int *neighborList, int *Map, double *dist, double *Aq, double *Bq, double *Den,
__global__ void
__launch_bounds__(256,1) dvc_ScaLBL_D3Q19_AAodd_Color(int *neighborList, int *Map, double *dist, double *Aq, double *Bq, double *Den,
double *Phi, double *Velocity, double rhoA, double rhoB, double tauA, double tauB, double alpha, double beta,
double Fx, double Fy, double Fz, int strideY, int strideZ, int start, int finish, int Np){
@@ -1861,19 +1865,6 @@ __global__ void dvc_ScaLBL_D3Q19_AAodd_Color(int *neighborList, int *Map, double
double ux,uy,uz;
double phi,tau,rho0,rlx_setA,rlx_setB;
const double mrt_V1=0.05263157894736842;
const double mrt_V2=0.012531328320802;
const double mrt_V3=0.04761904761904762;
const double mrt_V4=0.004594820384294068;
const double mrt_V5=0.01587301587301587;
const double mrt_V6=0.0555555555555555555555555;
const double mrt_V7=0.02777777777777778;
const double mrt_V8=0.08333333333333333;
const double mrt_V9=0.003341687552213868;
const double mrt_V10=0.003968253968253968;
const double mrt_V11=0.01388888888888889;
const double mrt_V12=0.04166666666666666;
int S = Np/NBLOCKS/NTHREADS + 1;
for (int s=0; s<S; s++){
//........Get 1-D index for this thread....................
@@ -1882,9 +1873,10 @@ __global__ void dvc_ScaLBL_D3Q19_AAodd_Color(int *neighborList, int *Map, double
// read the component number densities
nA = Den[n];
nB = Den[Np + n];
nAB = 1.0/(nA+nB);
// compute phase indicator field
phi=(nA-nB)/(nA+nB);
phi=(nA-nB)*nAB;
// local density
rho0=rhoA + 0.5*(1.0-phi)*(rhoB-rhoA);
@@ -1959,11 +1951,11 @@ __global__ void dvc_ScaLBL_D3Q19_AAodd_Color(int *neighborList, int *Map, double
//...........Normalize the Color Gradient.................................
C = sqrt(nx*nx+ny*ny+nz*nz);
double ColorMag = C;
if (C==0.0) ColorMag=1.0;
nx = nx/ColorMag;
ny = ny/ColorMag;
nz = nz/ColorMag;
double iColorMag = 1.0/C;
if (C==0.0) iColorMag=1.0;
nx = nx*iColorMag;
ny = ny*iColorMag;
nz = nz*iColorMag;
// q=0
fq = dist[n];
@@ -2290,18 +2282,19 @@ __global__ void dvc_ScaLBL_D3Q19_AAodd_Color(int *neighborList, int *Map, double
//..............carry out relaxation process..............................
//..........Toelke, Freudiger et al. 2006.................................
if (C == 0.0) nx = ny = nz = 0.0;
m1 = m1 + rlx_setA*((19*(jx*jx+jy*jy+jz*jz)/rho0 - 11*rho) -19*alpha*C - m1);
m2 = m2 + rlx_setA*((3*rho - 5.5*(jx*jx+jy*jy+jz*jz)/rho0)- m2);
double irho0=1.0/rho0;
m1 = m1 + rlx_setA*((19*(jx*jx+jy*jy+jz*jz)*irho0 - 11*rho) -19*alpha*C - m1);
m2 = m2 + rlx_setA*((3*rho - 5.5*(jx*jx+jy*jy+jz*jz)*irho0)- m2);
m4 = m4 + rlx_setB*((-0.6666666666666666*jx)- m4);
m6 = m6 + rlx_setB*((-0.6666666666666666*jy)- m6);
m8 = m8 + rlx_setB*((-0.6666666666666666*jz)- m8);
m9 = m9 + rlx_setA*(((2*jx*jx-jy*jy-jz*jz)/rho0) + 0.5*alpha*C*(2*nx*nx-ny*ny-nz*nz) - m9);
m9 = m9 + rlx_setA*(((2*jx*jx-jy*jy-jz*jz)*irho0) + 0.5*alpha*C*(2*nx*nx-ny*ny-nz*nz) - m9);
m10 = m10 + rlx_setA*( - m10);
m11 = m11 + rlx_setA*(((jy*jy-jz*jz)/rho0) + 0.5*alpha*C*(ny*ny-nz*nz)- m11);
m11 = m11 + rlx_setA*(((jy*jy-jz*jz)*irho0) + 0.5*alpha*C*(ny*ny-nz*nz)- m11);
m12 = m12 + rlx_setA*( - m12);
m13 = m13 + rlx_setA*( (jx*jy/rho0) + 0.5*alpha*C*nx*ny - m13);
m14 = m14 + rlx_setA*( (jy*jz/rho0) + 0.5*alpha*C*ny*nz - m14);
m15 = m15 + rlx_setA*( (jx*jz/rho0) + 0.5*alpha*C*nx*nz - m15);
m13 = m13 + rlx_setA*( (jx*jy*irho0) + 0.5*alpha*C*nx*ny - m13);
m14 = m14 + rlx_setA*( (jy*jz*irho0) + 0.5*alpha*C*ny*nz - m14);
m15 = m15 + rlx_setA*( (jx*jz*irho0) + 0.5*alpha*C*nx*nz - m15);
m16 = m16 + rlx_setB*( - m16);
m17 = m17 + rlx_setB*( - m17);
m18 = m18 + rlx_setB*( - m18);
@@ -2426,16 +2419,15 @@ __global__ void dvc_ScaLBL_D3Q19_AAodd_Color(int *neighborList, int *Map, double
dist[nread] = fq;
// write the velocity
ux = jx / rho0;
uy = jy / rho0;
uz = jz / rho0;
ux = jx*irho0;
uy = jy*irho0;
uz = jz*irho0;
Velocity[n] = ux;
Velocity[Np+n] = uy;
Velocity[2*Np+n] = uz;
// Instantiate mass transport distributions
// Stationary value - distribution 0
nAB = 1.0/(nA+nB);
Aq[n] = 0.3333333333333333*nA;
Bq[n] = 0.3333333333333333*nB;
@@ -3677,7 +3669,8 @@ __global__ void dvc_ScaLBL_D3Q19_AAodd_ColorMass(int *neighborList, double *Aq,
}
}
__global__ void dvc_ScaLBL_D3Q7_AAodd_PhaseField(int *neighborList, int *Map, double *Aq, double *Bq,
__global__ void
__launch_bounds__(256,1) dvc_ScaLBL_D3Q7_AAodd_PhaseField(int *neighborList, int *Map, double *Aq, double *Bq,
double *Den, double *Phi, int start, int finish, int Np){
int idx,n,nread;
double fq,nA,nB;
@@ -3747,7 +3740,8 @@ __global__ void dvc_ScaLBL_D3Q7_AAodd_PhaseField(int *neighborList, int *Map, d
}
}
__global__ void dvc_ScaLBL_D3Q7_AAeven_PhaseField(int *Map, double *Aq, double *Bq, double *Den, double *Phi,
__global__ void
__launch_bounds__(256,1) dvc_ScaLBL_D3Q7_AAeven_PhaseField(int *Map, double *Aq, double *Bq, double *Den, double *Phi,
int start, int finish, int Np){
int idx,n;
double fq,nA,nB;
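Several hunks in this file replace repeated divisions (`/ColorMag`, `/rho0`, `/(nA+nB)`) with one precomputed reciprocal followed by multiplications. The sketch below restates the color-gradient normalization in Python to show that the two forms agree up to floating-point rounding. The function names are hypothetical. Python raises on `1.0/0.0`, so the zero check comes first here, whereas the kernel computes `1.0/C` and then overwrites the result when `C==0.0`.

```python
import math

def normalize_div(nx, ny, nz):
    """Original form: divide each component by the gradient magnitude."""
    C = math.sqrt(nx * nx + ny * ny + nz * nz)
    mag = C if C != 0.0 else 1.0
    return nx / mag, ny / mag, nz / mag

def normalize_recip(nx, ny, nz):
    """Revised form: one division, then three multiplications."""
    C = math.sqrt(nx * nx + ny * ny + nz * nz)
    inv = 1.0 / C if C != 0.0 else 1.0  # guard first: Python cannot form 1.0/0.0
    return nx * inv, ny * inv, nz * inv

a = normalize_div(3.0, 4.0, 12.0)     # magnitude is 13
b = normalize_recip(3.0, 4.0, 12.0)
zero = normalize_recip(0.0, 0.0, 0.0)
print(a, b, zero)
```

The two results are not guaranteed to be bitwise identical, since `x * (1/C)` rounds twice where `x / C` rounds once, so comparisons against earlier output may need a small tolerance.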

View File

@@ -19,7 +19,7 @@
#include "hip/hip_cooperative_groups.h"
#define NBLOCKS 1024
#define NTHREADS 256
#define NTHREADS 512
/*
1. constants that are known at compile time should be defined using preprocessor macros (e.g. #define) or via C/C++ const variables at global/file scope.
@@ -321,10 +321,10 @@ __global__ void dvc_ScaLBL_D3Q19_Swap_Compact(int *neighborList, double *distev
}
}
//__launch_bounds__(512,4)
//__launch_bounds__(512,1)
__global__ void
dvc_ScaLBL_AAodd_Compact(char * ID, int *d_neighborList, double *dist, int Np) {
dvc_ScaLBL_AAodd_Compact( int *d_neighborList, double *dist, int Np) {
int n;
double f0,f1,f2,f3,f4,f5,f6,f7,f8,f9;
@@ -463,7 +463,8 @@ dvc_ScaLBL_AAodd_Compact(char * ID, int *d_neighborList, double *dist, int Np) {
}
__global__ void
__global__ void
__launch_bounds__(512,1)
dvc_ScaLBL_AAodd_MRT(int *neighborList, double *dist, int start, int finish, int Np, double rlx_setA, double rlx_setB, double Fx, double Fy, double Fz) {
int n;
@@ -932,7 +933,8 @@ dvc_ScaLBL_AAodd_MRT(int *neighborList, double *dist, int start, int finish, int
//__launch_bounds__(512,1)
__global__ void
__global__ void
__launch_bounds__(512,1)
dvc_ScaLBL_AAeven_MRT(double *dist, int start, int finish, int Np, double rlx_setA, double rlx_setB, double Fx, double Fy, double Fz) {
int n;
@@ -1353,9 +1355,9 @@ dvc_ScaLBL_AAeven_MRT(double *dist, int start, int finish, int Np, double rlx_se
}
}
//__launch_bounds__(512,4)
//__launch_bounds__(512,1)
__global__ void dvc_ScaLBL_AAeven_Compact(char * ID, double *dist, int Np) {
__global__ void dvc_ScaLBL_AAeven_Compact( double *dist, int Np) {
int n;
double f0,f1,f2,f3,f4,f5,f6,f7,f8,f9;
@@ -2374,12 +2376,12 @@ __global__ void dvc_ScaLBL_D3Q19_Init_Simple(char *ID, double *f_even, double *f
extern "C" void ScaLBL_D3Q19_Pack(int q, int *list, int start, int count, double *sendbuf, double *dist, int N){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q19_Pack <<<GRID,512 >>>(q, list, start, count, sendbuf, dist, N);
dvc_ScaLBL_D3Q19_Pack <<<NBLOCKS,NTHREADS >>>(q, list, start, count, sendbuf, dist, N);
}
extern "C" void ScaLBL_D3Q19_Unpack(int q, int *list, int start, int count, double *recvbuf, double *dist, int N){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q19_Unpack <<<GRID,512 >>>(q, list, start, count, recvbuf, dist, N);
dvc_ScaLBL_D3Q19_Unpack <<<NBLOCKS,NTHREADS >>>(q, list, start, count, recvbuf, dist, N);
}
//*************************************************************************
@@ -2423,19 +2425,17 @@ extern "C" void ScaLBL_D3Q19_Swap_Compact(int *neighborList, double *disteven, d
printf("CUDA error in ScaLBL_D3Q19_Swap: %s \n",hipGetErrorString(err));
}
}
extern "C" void ScaLBL_D3Q19_AAeven_Compact(char * ID, double *d_dist, int Np) {
extern "C" void ScaLBL_D3Q19_AAeven_Compact( double *d_dist, int Np) {
hipFuncSetCacheConfig( (void*) dvc_ScaLBL_AAeven_Compact, hipFuncCachePreferL1);
dvc_ScaLBL_AAeven_Compact<<<NBLOCKS,NTHREADS>>>(ID, d_dist, Np);
dvc_ScaLBL_AAeven_Compact<<<NBLOCKS,NTHREADS>>>( d_dist, Np);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("CUDA error in ScaLBL_D3Q19_Init: %s \n",hipGetErrorString(err));
}
}
extern "C" void ScaLBL_D3Q19_AAodd_Compact(char * ID, int *d_neighborList, double *d_dist, int Np) {
extern "C" void ScaLBL_D3Q19_AAodd_Compact( int *d_neighborList, double *d_dist, int Np) {
hipFuncSetCacheConfig( (void*) dvc_ScaLBL_AAodd_Compact, hipFuncCachePreferL1);
dvc_ScaLBL_AAodd_Compact<<<NBLOCKS,NTHREADS>>>(ID,d_neighborList, d_dist,Np);
dvc_ScaLBL_AAodd_Compact<<<NBLOCKS,NTHREADS>>>(d_neighborList, d_dist,Np);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("CUDA error in ScaLBL_D3Q19_Init: %s \n",hipGetErrorString(err));
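The Pack and Unpack wrappers in this file move from a grid sized by `count` (`count / 512 + 1` blocks of 512 threads) to the fixed `NBLOCKS` by `NTHREADS` launch. The sketch below only does the arithmetic for both choices. The function names and the sample sizes are hypothetical, and whether the fixed launch covers every element depends on how the kernel indexes its work, which this excerpt does not show. The last helper mirrors the `S = Np/NBLOCKS/NTHREADS + 1` stride count used by the Color kernels.

```python
NBLOCKS = 1024

def old_pack_threads(count):
    """Threads launched by the old wrapper: GRID = count / 512 + 1 blocks of 512."""
    grid = count // 512 + 1
    return grid * 512

def new_pack_threads(nthreads=512):
    """Threads launched by the new wrapper: a fixed NBLOCKS x NTHREADS grid."""
    return NBLOCKS * nthreads

def strides_per_thread(Np, nthreads):
    """Mirror of `int S = Np/NBLOCKS/NTHREADS + 1;` using integer division."""
    return Np // NBLOCKS // nthreads + 1

Np = 222 ** 3  # hypothetical site count: a full 222^3 subdomain with no solid sites
print(old_pack_threads(1000), new_pack_threads())
print(strides_per_thread(Np, 256), strides_per_thread(Np, 512))
```

For either thread count, `S * NBLOCKS * NTHREADS` is at least `Np`, so a strided loop of `S` passes reaches every site. A single-pass kernel launched on the fixed grid handles at most `NBLOCKS * NTHREADS` entries.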

View File

@@ -1,536 +0,0 @@
#include <math.h>
#include <stdio.h>
#include "hip/hip_runtime.h"
#define NBLOCKS 560
#define NTHREADS 128
__global__ void dvc_ScaLBL_Solid_Dirichlet_D3Q7(double *dist, double *BoundaryValue, int *BounceBackDist_list, int *BounceBackSolid_list, int count)
{
int idx;
int iq,ib;
double value_b,value_q;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
iq = BounceBackDist_list[idx];
ib = BounceBackSolid_list[idx];
value_b = BoundaryValue[ib];//get boundary value from a solid site
value_q = dist[iq];
dist[iq] = -1.0*value_q + value_b*0.25;//NOTE 0.25 is the speed of sound for D3Q7 lattice
}
}
__global__ void dvc_ScaLBL_Solid_Neumann_D3Q7(double *dist, double *BoundaryValue, int *BounceBackDist_list, int *BounceBackSolid_list, int count)
{
int idx;
int iq,ib;
double value_b,value_q;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
iq = BounceBackDist_list[idx];
ib = BounceBackSolid_list[idx];
value_b = BoundaryValue[ib];//get boundary value from a solid site
value_q = dist[iq];
dist[iq] = value_q + value_b;
}
}
__global__ void dvc_ScaLBL_D3Q7_AAeven_Poisson_Potential_BC_z(int *list, double *dist, double Vin, int count, int Np)
{
int idx,n;
double f0,f1,f2,f3,f4,f5,f6;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
n = list[idx];
f0 = dist[n];
f1 = dist[2*Np+n];
f2 = dist[1*Np+n];
f3 = dist[4*Np+n];
f4 = dist[3*Np+n];
f6 = dist[5*Np+n];
//...................................................
f5 = Vin - (f0+f1+f2+f3+f4+f6);
dist[6*Np+n] = f5;
}
}
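The potential boundary kernel above fills in the one distribution that is unknown at the inlet so that all seven D3Q7 distributions sum to the prescribed value `Vin`. A minimal Python restatement follows. The function name and the sample distribution values are hypothetical and chosen only for illustration.

```python
import math

def unknown_inlet_distribution(known, V_in):
    """Return the missing distribution so that all seven sum to V_in."""
    return V_in - sum(known)

# Hypothetical values for f0, f1, f2, f3, f4, f6 at one boundary site
known = [0.25, 0.125, 0.125, 0.125, 0.125, 0.1]
V_in = 1.0

f5 = unknown_inlet_distribution(known, V_in)
total = sum(known) + f5
print(f5, total)
```

The outlet kernel follows the same pattern with `f6` as the unknown and `Vout` as the target sum.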
__global__ void dvc_ScaLBL_D3Q7_AAeven_Poisson_Potential_BC_Z(int *list, double *dist, double Vout, int count, int Np)
{
int idx,n;
double f0,f1,f2,f3,f4,f5,f6;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
n = list[idx];
f0 = dist[n];
f1 = dist[2*Np+n];
f2 = dist[1*Np+n];
f3 = dist[4*Np+n];
f4 = dist[3*Np+n];
f5 = dist[6*Np+n];
//...................................................
f6 = Vout - (f0+f1+f2+f3+f4+f5);
dist[5*Np+n] = f6;
}
}
__global__ void dvc_ScaLBL_D3Q7_AAodd_Poisson_Potential_BC_z(int *d_neighborList, int *list, double *dist, double Vin, int count, int Np)
{
int idx, n;
int nread,nr5;
double f0,f1,f2,f3,f4,f5,f6;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
n = list[idx];
f0 = dist[n];
nread = d_neighborList[n];
f1 = dist[nread];
nread = d_neighborList[n+2*Np];
f3 = dist[nread];
nread = d_neighborList[n+Np];
f2 = dist[nread];
nread = d_neighborList[n+3*Np];
f4 = dist[nread];
nread = d_neighborList[n+5*Np];
f6 = dist[nread];
// Unknown distributions
nr5 = d_neighborList[n+4*Np];
f5 = Vin - (f0+f1+f2+f3+f4+f6);
dist[nr5] = f5;
}
}
__global__ void dvc_ScaLBL_D3Q7_AAodd_Poisson_Potential_BC_Z(int *d_neighborList, int *list, double *dist, double Vout, int count, int Np)
{
int idx, n;
int nread,nr6;
double f0,f1,f2,f3,f4,f5,f6;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
n = list[idx];
f0 = dist[n];
nread = d_neighborList[n];
f1 = dist[nread];
nread = d_neighborList[n+2*Np];
f3 = dist[nread];
nread = d_neighborList[n+4*Np];
f5 = dist[nread];
nread = d_neighborList[n+Np];
f2 = dist[nread];
nread = d_neighborList[n+3*Np];
f4 = dist[nread];
// unknown distributions
nr6 = d_neighborList[n+5*Np];
f6 = Vout - (f0+f1+f2+f3+f4+f5);
dist[nr6] = f6;
}
}
__global__ void dvc_ScaLBL_Poisson_D3Q7_BC_z(int *list, int *Map, double *Psi, double Vin, int count)
{
int idx,n,nm;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
n = list[idx];
nm = Map[n];
Psi[nm] = Vin;
}
}
__global__ void dvc_ScaLBL_Poisson_D3Q7_BC_Z(int *list, int *Map, double *Psi, double Vout, int count)
{
int idx,n,nm;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
n = list[idx];
nm = Map[n];
Psi[nm] = Vout;
}
}
__global__ void dvc_ScaLBL_D3Q7_AAeven_Ion_Concentration_BC_z(int *list, double *dist, double Cin, int count, int Np)
{
int idx,n;
double f0,f1,f2,f3,f4,f5,f6;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
n = list[idx];
f0 = dist[n];
f1 = dist[2*Np+n];
f2 = dist[1*Np+n];
f3 = dist[4*Np+n];
f4 = dist[3*Np+n];
f6 = dist[5*Np+n];
//...................................................
f5 = Cin - (f0+f1+f2+f3+f4+f6);
dist[6*Np+n] = f5;
}
}
__global__ void dvc_ScaLBL_D3Q7_AAeven_Ion_Concentration_BC_Z(int *list, double *dist, double Cout, int count, int Np)
{
int idx,n;
double f0,f1,f2,f3,f4,f5,f6;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
n = list[idx];
f0 = dist[n];
f1 = dist[2*Np+n];
f2 = dist[1*Np+n];
f3 = dist[4*Np+n];
f4 = dist[3*Np+n];
f5 = dist[6*Np+n];
//...................................................
f6 = Cout - (f0+f1+f2+f3+f4+f5);
dist[5*Np+n] = f6;
}
}
__global__ void dvc_ScaLBL_D3Q7_AAodd_Ion_Concentration_BC_z(int *d_neighborList, int *list, double *dist, double Cin, int count, int Np)
{
int idx, n;
int nread,nr5;
double f0,f1,f2,f3,f4,f5,f6;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
n = list[idx];
f0 = dist[n];
nread = d_neighborList[n];
f1 = dist[nread];
nread = d_neighborList[n+2*Np];
f3 = dist[nread];
nread = d_neighborList[n+Np];
f2 = dist[nread];
nread = d_neighborList[n+3*Np];
f4 = dist[nread];
nread = d_neighborList[n+5*Np];
f6 = dist[nread];
// Unknown distributions
nr5 = d_neighborList[n+4*Np];
f5 = Cin - (f0+f1+f2+f3+f4+f6);
dist[nr5] = f5;
}
}
__global__ void dvc_ScaLBL_D3Q7_AAodd_Ion_Concentration_BC_Z(int *d_neighborList, int *list, double *dist, double Cout, int count, int Np)
{
int idx, n;
int nread,nr6;
double f0,f1,f2,f3,f4,f5,f6;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
n = list[idx];
f0 = dist[n];
nread = d_neighborList[n];
f1 = dist[nread];
nread = d_neighborList[n+2*Np];
f3 = dist[nread];
nread = d_neighborList[n+4*Np];
f5 = dist[nread];
nread = d_neighborList[n+Np];
f2 = dist[nread];
nread = d_neighborList[n+3*Np];
f4 = dist[nread];
// unknown distributions
nr6 = d_neighborList[n+5*Np];
f6 = Cout - (f0+f1+f2+f3+f4+f5);
dist[nr6] = f6;
}
}
__global__ void dvc_ScaLBL_D3Q7_AAeven_Ion_Flux_BC_z(int *list, double *dist, double FluxIn, double tau, double *VelocityZ, int count, int Np)
{
//NOTE: FluxIn is the inward flux
int idx,n;
double f0,f1,f2,f3,f4,f5,f6;
double fsum_partial;
double uz;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
n = list[idx];
f0 = dist[n];
f1 = dist[2*Np+n];
f2 = dist[1*Np+n];
f3 = dist[4*Np+n];
f4 = dist[3*Np+n];
f6 = dist[5*Np+n];
fsum_partial = f0+f1+f2+f3+f4+f6;
uz = VelocityZ[n];
//...................................................
f5 =(FluxIn+(1.0-0.5/tau)*f6-0.5*uz*fsum_partial/tau)/(1.0-0.5/tau+0.5*uz/tau);
dist[6*Np+n] = f5;
}
}
__global__ void dvc_ScaLBL_D3Q7_AAeven_Ion_Flux_BC_Z(int *list, double *dist, double FluxIn, double tau, double *VelocityZ, int count, int Np)
{
//NOTE: FluxIn is the inward flux
int idx,n;
double f0,f1,f2,f3,f4,f5,f6;
double fsum_partial;
double uz;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
n = list[idx];
f0 = dist[n];
f1 = dist[2*Np+n];
f2 = dist[1*Np+n];
f3 = dist[4*Np+n];
f4 = dist[3*Np+n];
f5 = dist[6*Np+n];
fsum_partial = f0+f1+f2+f3+f4+f5;
uz = VelocityZ[n];
//...................................................
f6 =(FluxIn+(1.0-0.5/tau)*f5+0.5*uz*fsum_partial/tau)/(1.0-0.5/tau-0.5*uz/tau);
dist[5*Np+n] = f6;
}
}
__global__ void dvc_ScaLBL_D3Q7_AAodd_Ion_Flux_BC_z(int *d_neighborList, int *list, double *dist, double FluxIn, double tau, double *VelocityZ, int count, int Np)
{
//NOTE: FluxIn is the inward flux
int idx, n;
int nread,nr5;
double f0,f1,f2,f3,f4,f5,f6;
double fsum_partial;
double uz;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
n = list[idx];
f0 = dist[n];
nread = d_neighborList[n];
f1 = dist[nread];
nread = d_neighborList[n+2*Np];
f3 = dist[nread];
nread = d_neighborList[n+Np];
f2 = dist[nread];
nread = d_neighborList[n+3*Np];
f4 = dist[nread];
nread = d_neighborList[n+5*Np];
f6 = dist[nread];
fsum_partial = f0+f1+f2+f3+f4+f6;
uz = VelocityZ[n];
//...................................................
f5 =(FluxIn+(1.0-0.5/tau)*f6-0.5*uz*fsum_partial/tau)/(1.0-0.5/tau+0.5*uz/tau);
// Unknown distributions
nr5 = d_neighborList[n+4*Np];
dist[nr5] = f5;
}
}
__global__ void dvc_ScaLBL_D3Q7_AAodd_Ion_Flux_BC_Z(int *d_neighborList, int *list, double *dist, double FluxIn, double tau, double *VelocityZ, int count, int Np)
{
//NOTE: FluxIn is the inward flux
int idx, n;
int nread,nr6;
double f0,f1,f2,f3,f4,f5,f6;
double fsum_partial;
double uz;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
n = list[idx];
f0 = dist[n];
nread = d_neighborList[n];
f1 = dist[nread];
nread = d_neighborList[n+2*Np];
f3 = dist[nread];
nread = d_neighborList[n+4*Np];
f5 = dist[nread];
nread = d_neighborList[n+Np];
f2 = dist[nread];
nread = d_neighborList[n+3*Np];
f4 = dist[nread];
fsum_partial = f0+f1+f2+f3+f4+f5;
uz = VelocityZ[n];
//...................................................
f6 =(FluxIn+(1.0-0.5/tau)*f5+0.5*uz*fsum_partial/tau)/(1.0-0.5/tau-0.5*uz/tau);
// unknown distributions
nr6 = d_neighborList[n+5*Np];
dist[nr6] = f6;
}
}
//*************************************************************************
extern "C" void ScaLBL_Solid_Dirichlet_D3Q7(double *dist, double *BoundaryValue, int *BounceBackDist_list, int *BounceBackSolid_list, int count){
int GRID = count / 512 + 1;
dvc_ScaLBL_Solid_Dirichlet_D3Q7<<<GRID,512>>>(dist, BoundaryValue, BounceBackDist_list, BounceBackSolid_list, count);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("hip error in ScaLBL_Solid_Dirichlet_D3Q7 (kernel): %s \n",hipGetErrorString(err));
}
}
extern "C" void ScaLBL_Solid_Neumann_D3Q7(double *dist, double *BoundaryValue, int *BounceBackDist_list, int *BounceBackSolid_list, int count){
int GRID = count / 512 + 1;
dvc_ScaLBL_Solid_Neumann_D3Q7<<<GRID,512>>>(dist, BoundaryValue, BounceBackDist_list, BounceBackSolid_list, count);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("hip error in ScaLBL_Solid_Neumann_D3Q7 (kernel): %s \n",hipGetErrorString(err));
}
}
extern "C" void ScaLBL_D3Q7_AAeven_Poisson_Potential_BC_z(int *list, double *dist, double Vin, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAeven_Poisson_Potential_BC_z<<<GRID,512>>>(list, dist, Vin, count, Np);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("hip error in ScaLBL_D3Q7_AAeven_Poisson_Potential_BC_z (kernel): %s \n",hipGetErrorString(err));
}
}
extern "C" void ScaLBL_D3Q7_AAeven_Poisson_Potential_BC_Z(int *list, double *dist, double Vout, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAeven_Poisson_Potential_BC_Z<<<GRID,512>>>(list, dist, Vout, count, Np);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("hip error in ScaLBL_D3Q7_AAeven_Poisson_Potential_BC_Z (kernel): %s \n",hipGetErrorString(err));
}
}
extern "C" void ScaLBL_D3Q7_AAodd_Poisson_Potential_BC_z(int *d_neighborList, int *list, double *dist, double Vin, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAodd_Poisson_Potential_BC_z<<<GRID,512>>>(d_neighborList, list, dist, Vin, count, Np);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("hip error in ScaLBL_D3Q7_AAodd_Poisson_Potential_BC_z (kernel): %s \n",hipGetErrorString(err));
}
}
extern "C" void ScaLBL_D3Q7_AAodd_Poisson_Potential_BC_Z(int *d_neighborList, int *list, double *dist, double Vout, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAodd_Poisson_Potential_BC_Z<<<GRID,512>>>(d_neighborList, list, dist, Vout, count, Np);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("hip error in ScaLBL_D3Q7_AAodd_Poisson_Potential_BC_Z (kernel): %s \n",hipGetErrorString(err));
}
}
extern "C" void ScaLBL_Poisson_D3Q7_BC_z(int *list, int *Map, double *Psi, double Vin, int count){
int GRID = count / 512 + 1;
dvc_ScaLBL_Poisson_D3Q7_BC_z<<<GRID,512>>>(list, Map, Psi, Vin, count);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("hip error in ScaLBL_Poisson_D3Q7_BC_z (kernel): %s \n",hipGetErrorString(err));
}
}
extern "C" void ScaLBL_Poisson_D3Q7_BC_Z(int *list, int *Map, double *Psi, double Vout, int count){
int GRID = count / 512 + 1;
dvc_ScaLBL_Poisson_D3Q7_BC_Z<<<GRID,512>>>(list, Map, Psi, Vout, count);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("hip error in ScaLBL_Poisson_D3Q7_BC_Z (kernel): %s \n",hipGetErrorString(err));
}
}
extern "C" void ScaLBL_D3Q7_AAeven_Ion_Concentration_BC_z(int *list, double *dist, double Cin, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAeven_Ion_Concentration_BC_z<<<GRID,512>>>(list, dist, Cin, count, Np);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("hip error in ScaLBL_D3Q7_AAeven_Ion_Concentration_BC_z (kernel): %s \n",hipGetErrorString(err));
}
}
extern "C" void ScaLBL_D3Q7_AAeven_Ion_Concentration_BC_Z(int *list, double *dist, double Cout, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAeven_Ion_Concentration_BC_Z<<<GRID,512>>>(list, dist, Cout, count, Np);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("hip error in ScaLBL_D3Q7_AAeven_Ion_Concentration_BC_Z (kernel): %s \n",hipGetErrorString(err));
}
}
extern "C" void ScaLBL_D3Q7_AAodd_Ion_Concentration_BC_z(int *d_neighborList, int *list, double *dist, double Cin, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAodd_Ion_Concentration_BC_z<<<GRID,512>>>(d_neighborList, list, dist, Cin, count, Np);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("hip error in ScaLBL_D3Q7_AAodd_Ion_Concentration_BC_z (kernel): %s \n",hipGetErrorString(err));
}
}
extern "C" void ScaLBL_D3Q7_AAodd_Ion_Concentration_BC_Z(int *d_neighborList, int *list, double *dist, double Cout, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAodd_Ion_Concentration_BC_Z<<<GRID,512>>>(d_neighborList, list, dist, Cout, count, Np);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("hip error in ScaLBL_D3Q7_AAodd_Ion_Concentration_BC_Z (kernel): %s \n",hipGetErrorString(err));
}
}
extern "C" void ScaLBL_D3Q7_AAeven_Ion_Flux_BC_z(int *list, double *dist, double FluxIn, double tau, double *VelocityZ, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAeven_Ion_Flux_BC_z<<<GRID,512>>>(list, dist, FluxIn, tau, VelocityZ, count, Np);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("hip error in ScaLBL_D3Q7_AAeven_Ion_Flux_BC_z (kernel): %s \n",hipGetErrorString(err));
}
}
extern "C" void ScaLBL_D3Q7_AAeven_Ion_Flux_BC_Z(int *list, double *dist, double FluxIn, double tau, double *VelocityZ, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAeven_Ion_Flux_BC_Z<<<GRID,512>>>(list, dist, FluxIn, tau, VelocityZ, count, Np);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("hip error in ScaLBL_D3Q7_AAeven_Ion_Flux_BC_Z (kernel): %s \n",hipGetErrorString(err));
}
}
extern "C" void ScaLBL_D3Q7_AAodd_Ion_Flux_BC_z(int *d_neighborList, int *list, double *dist, double FluxIn, double tau, double *VelocityZ, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAodd_Ion_Flux_BC_z<<<GRID,512>>>(d_neighborList, list, dist, FluxIn, tau, VelocityZ, count, Np);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("hip error in ScaLBL_D3Q7_AAodd_Ion_Flux_BC_z (kernel): %s \n",hipGetErrorString(err));
}
}
extern "C" void ScaLBL_D3Q7_AAodd_Ion_Flux_BC_Z(int *d_neighborList, int *list, double *dist, double FluxIn, double tau, double *VelocityZ, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAodd_Ion_Flux_BC_Z<<<GRID,512>>>(d_neighborList, list, dist, FluxIn, tau, VelocityZ, count, Np);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("hip error in ScaLBL_D3Q7_AAodd_Ion_Flux_BC_Z (kernel): %s \n",hipGetErrorString(err));
}
}

917
hip/D3Q7BC.hip Normal file

@@ -0,0 +1,917 @@
#include <math.h>
#include <stdio.h>
#include "hip/hip_runtime.h"
#define NBLOCKS 1024
#define NTHREADS 256
// Print (without aborting) any error reported by hipGetLastError() after a kernel launch.
#define CHECK_ERROR(KERNEL) \
do { \
auto err = hipGetLastError(); \
if ( hipSuccess != err ){ \
auto errString = hipGetErrorString(err); \
printf("error in %s (kernel): %s \n",KERNEL,errString); \
} \
} while(0)
__global__ void dvc_ScaLBL_Solid_Dirichlet_D3Q7(double *dist, double *BoundaryValue, int *BounceBackDist_list, int *BounceBackSolid_list, int count)
{
int idx;
int iq,ib;
double value_b,value_q;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
iq = BounceBackDist_list[idx];
ib = BounceBackSolid_list[idx];
value_b = BoundaryValue[ib];//get boundary value from a solid site
value_q = dist[iq];
dist[iq] = -1.0*value_q + value_b*0.25;//NOTE 0.25 is the squared speed of sound (cs^2) for the D3Q7 lattice
}
}
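// The Dirichlet kernel above overwrites each listed distribution with its
// negation plus 0.25 times the boundary value read from the paired solid site.
// A minimal host-side sketch of the same arithmetic follows, intended only for
// checking device output on small inputs. The name ref_solid_dirichlet_d3q7 and
// the use of std::vector are assumptions made for this illustration; they are
// not part of the source.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for dvc_ScaLBL_Solid_Dirichlet_D3Q7 (illustration only).
void ref_solid_dirichlet_d3q7(std::vector<double> &dist,
                              const std::vector<double> &BoundaryValue,
                              const std::vector<int> &BounceBackDist_list,
                              const std::vector<int> &BounceBackSolid_list) {
    // One boundary link per entry: the two index lists must have equal length.
    assert(BounceBackDist_list.size() == BounceBackSolid_list.size());
    for (std::size_t idx = 0; idx < BounceBackDist_list.size(); idx++) {
        int iq = BounceBackDist_list[idx];   // index of the distribution to update
        int ib = BounceBackSolid_list[idx];  // index of the solid site holding the value
        dist[iq] = -1.0 * dist[iq] + BoundaryValue[ib] * 0.25;
    }
}
```

// Entries of dist that are not named in BounceBackDist_list are left unchanged.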
__global__ void dvc_ScaLBL_Solid_Neumann_D3Q7(double *dist, double *BoundaryValue, int *BounceBackDist_list, int *BounceBackSolid_list, int count)
{
int idx;
int iq,ib;
double value_b,value_q;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
iq = BounceBackDist_list[idx];
ib = BounceBackSolid_list[idx];
value_b = BoundaryValue[ib];//get boundary value from a solid site
value_q = dist[iq];
dist[iq] = value_q + value_b;
}
}
__global__ void dvc_ScaLBL_Solid_DirichletAndNeumann_D3Q7(double *dist, double *BoundaryValue,int *BoundaryLabel, int *BounceBackDist_list, int *BounceBackSolid_list, int count)
{
int idx;
int iq,ib;
double value_b,value_b_label,value_q;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
iq = BounceBackDist_list[idx];
ib = BounceBackSolid_list[idx];
value_b = BoundaryValue[ib];//get boundary value from a solid site
value_b_label = BoundaryLabel[ib];//get boundary label (i.e. type of BC) from a solid site
value_q = dist[iq];
if (value_b_label==1){//Dirichlet BC
dist[iq] = -1.0*value_q + value_b*0.25;//NOTE 0.25 is the squared speed of sound (cs^2) for the D3Q7 lattice
}
if (value_b_label==2){//Neumann BC
dist[iq] = value_q + value_b;
}
}
}
__global__ void dvc_ScaLBL_Solid_SlippingVelocityBC_D3Q19(double *dist, double *zeta_potential, double *ElectricField, double *SolidGrad,
double epsilon_LB, double tau, double rho0,double den_scale, double h, double time_conv,
int *BounceBackDist_list, int *BounceBackSolid_list, int *FluidBoundary_list,
double *lattice_weight, float *lattice_cx, float *lattice_cy, float *lattice_cz,
int count, int Np)
{
int idx;
int iq,ib,ifluidBC;
double value_b,value_q;
double Ex,Ey,Ez;
double Etx,Ety,Etz;//tangential part of electric field
double E_mag_normal;
double nsx,nsy,nsz;//unit normal solid gradient
double ubx,uby,ubz;//slipping velocity at fluid boundary nodes
float cx,cy,cz;//lattice velocity (D3Q19)
double LB_weight;//lattice weighting coefficient (D3Q19)
double cs2_inv = 3.0;//inverse of cs^2 for D3Q19
double nu_LB = (tau-0.5)/cs2_inv;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
iq = BounceBackDist_list[idx];
ib = BounceBackSolid_list[idx];
ifluidBC = FluidBoundary_list[idx];
value_b = zeta_potential[ib];//get zeta potential from a solid site
value_q = dist[iq];
//Load electric field and compute its tangential component
Ex = ElectricField[ifluidBC+0*Np];
Ey = ElectricField[ifluidBC+1*Np];
Ez = ElectricField[ifluidBC+2*Np];
nsx = SolidGrad[ifluidBC+0*Np];
nsy = SolidGrad[ifluidBC+1*Np];
nsz = SolidGrad[ifluidBC+2*Np];
E_mag_normal = Ex*nsx+Ey*nsy+Ez*nsz;//magnitude of electric field in the direction normal to solid nodes
//compute tangential electric field
Etx = Ex - E_mag_normal*nsx;
Ety = Ey - E_mag_normal*nsy;
Etz = Ez - E_mag_normal*nsz;
ubx = -epsilon_LB*value_b*Etx/(nu_LB*rho0)*time_conv*time_conv/(h*h*1.0e-12)/den_scale;
uby = -epsilon_LB*value_b*Ety/(nu_LB*rho0)*time_conv*time_conv/(h*h*1.0e-12)/den_scale;
ubz = -epsilon_LB*value_b*Etz/(nu_LB*rho0)*time_conv*time_conv/(h*h*1.0e-12)/den_scale;
//compute bounce-back distribution
LB_weight = lattice_weight[idx];
cx = lattice_cx[idx];
cy = lattice_cy[idx];
cz = lattice_cz[idx];
dist[iq] = value_q - 2.0*LB_weight*rho0*cs2_inv*(cx*ubx+cy*uby+cz*ubz);
}
}
__global__ void dvc_ScaLBL_D3Q7_AAeven_Poisson_Potential_BC_z(int *list, double *dist, double Vin, int count, int Np)
{
int idx,n;
double f0,f1,f2,f3,f4,f5,f6;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
n = list[idx];
f0 = dist[n];
f1 = dist[2*Np+n];
f2 = dist[1*Np+n];
f3 = dist[4*Np+n];
f4 = dist[3*Np+n];
f6 = dist[5*Np+n];
//...................................................
f5 = Vin - (f0+f1+f2+f3+f4+f6);
dist[6*Np+n] = f5;
}
}
__global__ void dvc_ScaLBL_D3Q7_AAeven_Poisson_Potential_BC_Z(int *list, double *dist, double Vout, int count, int Np)
{
int idx,n;
double f0,f1,f2,f3,f4,f5,f6;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
n = list[idx];
f0 = dist[n];
f1 = dist[2*Np+n];
f2 = dist[1*Np+n];
f3 = dist[4*Np+n];
f4 = dist[3*Np+n];
f5 = dist[6*Np+n];
//...................................................
f6 = Vout - (f0+f1+f2+f3+f4+f5);
dist[5*Np+n] = f6;
}
}
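// The two AAeven potential kernels above fix one unknown distribution so that
// the seven distributions at a boundary site sum to the prescribed value.
// The sketch below repeats the z-inlet arithmetic on the host and adds a helper
// that sums a site, so the constraint can be checked directly. Both function
// names are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for dvc_ScaLBL_D3Q7_AAeven_Poisson_Potential_BC_z.
void ref_aaeven_potential_bc_z(const std::vector<int> &list,
                               std::vector<double> &dist, double Vin, int Np) {
    assert(Np > 0 && dist.size() >= static_cast<std::size_t>(7 * Np));
    for (int n : list) {
        double f0 = dist[n];
        double f1 = dist[2 * Np + n];
        double f2 = dist[1 * Np + n];
        double f3 = dist[4 * Np + n];
        double f4 = dist[3 * Np + n];
        double f6 = dist[5 * Np + n];
        // Unknown distribution closes the sum to Vin.
        dist[6 * Np + n] = Vin - (f0 + f1 + f2 + f3 + f4 + f6);
    }
}

// Sum of the seven distributions stored for site n.
double ref_site_sum(const std::vector<double> &dist, int n, int Np) {
    double s = 0.0;
    for (int q = 0; q < 7; q++) s += dist[q * Np + n];
    return s;
}
```

// After the update, ref_site_sum at a listed site returns Vin up to rounding.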
__global__ void dvc_ScaLBL_D3Q7_AAodd_Poisson_Potential_BC_z(int *d_neighborList, int *list, double *dist, double Vin, int count, int Np)
{
int idx, n;
int nread,nr5;
double f0,f1,f2,f3,f4,f5,f6;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
n = list[idx];
f0 = dist[n];
nread = d_neighborList[n];
f1 = dist[nread];
nread = d_neighborList[n+2*Np];
f3 = dist[nread];
nread = d_neighborList[n+Np];
f2 = dist[nread];
nread = d_neighborList[n+3*Np];
f4 = dist[nread];
nread = d_neighborList[n+5*Np];
f6 = dist[nread];
// Unknown distributions
nr5 = d_neighborList[n+4*Np];
f5 = Vin - (f0+f1+f2+f3+f4+f6);
dist[nr5] = f5;
}
}
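// The AAodd variant above differs from the AAeven one only in addressing: each
// distribution is read or written through d_neighborList rather than at a fixed
// offset q*Np+n. A host-side sketch of that indirection, assuming the same
// layout of d_neighborList as the kernel uses (entry n+k*Np for k = 0..5).
// The function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for dvc_ScaLBL_D3Q7_AAodd_Poisson_Potential_BC_z.
void ref_aaodd_potential_bc_z(const std::vector<int> &neighborList,
                              const std::vector<int> &list,
                              std::vector<double> &dist, double Vin, int Np) {
    assert(Np > 0 && neighborList.size() >= static_cast<std::size_t>(6 * Np));
    for (int n : list) {
        double f0 = dist[n];
        double f1 = dist[neighborList[n]];
        double f2 = dist[neighborList[n + Np]];
        double f3 = dist[neighborList[n + 2 * Np]];
        double f4 = dist[neighborList[n + 3 * Np]];
        double f6 = dist[neighborList[n + 5 * Np]];
        // The unknown distribution is written through neighbor entry 4.
        int nr5 = neighborList[n + 4 * Np];
        dist[nr5] = Vin - (f0 + f1 + f2 + f3 + f4 + f6);
    }
}
```
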
__global__ void dvc_ScaLBL_D3Q7_AAodd_Poisson_Potential_BC_Z(int *d_neighborList, int *list, double *dist, double Vout, int count, int Np)
{
int idx, n;
int nread,nr6;
double f0,f1,f2,f3,f4,f5,f6;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
n = list[idx];
f0 = dist[n];
nread = d_neighborList[n];
f1 = dist[nread];
nread = d_neighborList[n+2*Np];
f3 = dist[nread];
nread = d_neighborList[n+4*Np];
f5 = dist[nread];
nread = d_neighborList[n+Np];
f2 = dist[nread];
nread = d_neighborList[n+3*Np];
f4 = dist[nread];
// unknown distributions
nr6 = d_neighborList[n+5*Np];
f6 = Vout - (f0+f1+f2+f3+f4+f5);
dist[nr6] = f6;
}
}
__global__ void dvc_ScaLBL_Poisson_D3Q7_BC_z(int *list, int *Map, double *Psi, double Vin, int count)
{
int idx,n,nm;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
n = list[idx];
nm = Map[n];
Psi[nm] = Vin;
}
}
__global__ void dvc_ScaLBL_Poisson_D3Q7_BC_Z(int *list, int *Map, double *Psi, double Vout, int count)
{
int idx,n,nm;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
n = list[idx];
nm = Map[n];
Psi[nm] = Vout;
}
}
__global__ void dvc_ScaLBL_D3Q7_AAeven_Ion_Concentration_BC_z(int *list, double *dist, double Cin, int count, int Np)
{
int idx,n;
double f0,f1,f2,f3,f4,f5,f6;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
n = list[idx];
f0 = dist[n];
f1 = dist[2*Np+n];
f2 = dist[1*Np+n];
f3 = dist[4*Np+n];
f4 = dist[3*Np+n];
f6 = dist[5*Np+n];
//...................................................
f5 = Cin - (f0+f1+f2+f3+f4+f6);
dist[6*Np+n] = f5;
}
}
__global__ void dvc_ScaLBL_D3Q7_AAeven_Ion_Concentration_BC_Z(int *list, double *dist, double Cout, int count, int Np)
{
int idx,n;
double f0,f1,f2,f3,f4,f5,f6;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
n = list[idx];
f0 = dist[n];
f1 = dist[2*Np+n];
f2 = dist[1*Np+n];
f3 = dist[4*Np+n];
f4 = dist[3*Np+n];
f5 = dist[6*Np+n];
//...................................................
f6 = Cout - (f0+f1+f2+f3+f4+f5);
dist[5*Np+n] = f6;
}
}
__global__ void dvc_ScaLBL_D3Q7_AAodd_Ion_Concentration_BC_z(int *d_neighborList, int *list, double *dist, double Cin, int count, int Np)
{
int idx, n;
int nread,nr5;
double f0,f1,f2,f3,f4,f5,f6;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
n = list[idx];
f0 = dist[n];
nread = d_neighborList[n];
f1 = dist[nread];
nread = d_neighborList[n+2*Np];
f3 = dist[nread];
nread = d_neighborList[n+Np];
f2 = dist[nread];
nread = d_neighborList[n+3*Np];
f4 = dist[nread];
nread = d_neighborList[n+5*Np];
f6 = dist[nread];
// Unknown distributions
nr5 = d_neighborList[n+4*Np];
f5 = Cin - (f0+f1+f2+f3+f4+f6);
dist[nr5] = f5;
}
}
__global__ void dvc_ScaLBL_D3Q7_AAodd_Ion_Concentration_BC_Z(int *d_neighborList, int *list, double *dist, double Cout, int count, int Np)
{
int idx, n;
int nread,nr6;
double f0,f1,f2,f3,f4,f5,f6;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
n = list[idx];
f0 = dist[n];
nread = d_neighborList[n];
f1 = dist[nread];
nread = d_neighborList[n+2*Np];
f3 = dist[nread];
nread = d_neighborList[n+4*Np];
f5 = dist[nread];
nread = d_neighborList[n+Np];
f2 = dist[nread];
nread = d_neighborList[n+3*Np];
f4 = dist[nread];
// unknown distributions
nr6 = d_neighborList[n+5*Np];
f6 = Cout - (f0+f1+f2+f3+f4+f5);
dist[nr6] = f6;
}
}
__global__ void dvc_ScaLBL_D3Q7_AAeven_Ion_Flux_Diff_BC_z(int *list, double *dist, double FluxIn, double tau, double *VelocityZ, int count, int Np)
{
//NOTE: FluxIn is the inward flux
int idx,n;
double f0,f1,f2,f3,f4,f5,f6;
double fsum_partial;
double uz;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
n = list[idx];
f0 = dist[n];
f1 = dist[2*Np+n];
f2 = dist[1*Np+n];
f3 = dist[4*Np+n];
f4 = dist[3*Np+n];
f6 = dist[5*Np+n];
fsum_partial = f0+f1+f2+f3+f4+f6;
uz = VelocityZ[n];
//...................................................
f5 =(FluxIn+(1.0-0.5/tau)*(f6+uz*fsum_partial))/(1.0-0.5/tau)/(1.0-uz);
dist[6*Np+n] = f5;
}
}
__global__ void dvc_ScaLBL_D3Q7_AAeven_Ion_Flux_Diff_BC_Z(int *list, double *dist, double FluxIn, double tau, double *VelocityZ, int count, int Np)
{
//NOTE: FluxIn is the inward flux
int idx,n;
double f0,f1,f2,f3,f4,f5,f6;
double fsum_partial;
double uz;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
n = list[idx];
f0 = dist[n];
f1 = dist[2*Np+n];
f2 = dist[1*Np+n];
f3 = dist[4*Np+n];
f4 = dist[3*Np+n];
f5 = dist[6*Np+n];
fsum_partial = f0+f1+f2+f3+f4+f5;
uz = VelocityZ[n];
//...................................................
f6 =(FluxIn+(1.0-0.5/tau)*(f5-uz*fsum_partial))/(1.0-0.5/tau)/(1.0+uz);
dist[5*Np+n] = f6;
}
}
__global__ void dvc_ScaLBL_D3Q7_AAodd_Ion_Flux_Diff_BC_z(int *d_neighborList, int *list, double *dist, double FluxIn, double tau, double *VelocityZ, int count, int Np)
{
//NOTE: FluxIn is the inward flux
int idx, n;
int nread,nr5;
double f0,f1,f2,f3,f4,f5,f6;
double fsum_partial;
double uz;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
n = list[idx];
f0 = dist[n];
nread = d_neighborList[n];
f1 = dist[nread];
nread = d_neighborList[n+2*Np];
f3 = dist[nread];
nread = d_neighborList[n+Np];
f2 = dist[nread];
nread = d_neighborList[n+3*Np];
f4 = dist[nread];
nread = d_neighborList[n+5*Np];
f6 = dist[nread];
fsum_partial = f0+f1+f2+f3+f4+f6;
uz = VelocityZ[n];
//...................................................
f5 =(FluxIn+(1.0-0.5/tau)*(f6+uz*fsum_partial))/(1.0-0.5/tau)/(1.0-uz);
// Unknown distributions
nr5 = d_neighborList[n+4*Np];
dist[nr5] = f5;
}
}
__global__ void dvc_ScaLBL_D3Q7_AAodd_Ion_Flux_Diff_BC_Z(int *d_neighborList, int *list, double *dist, double FluxIn, double tau, double *VelocityZ, int count, int Np)
{
//NOTE: FluxIn is the inward flux
int idx, n;
int nread,nr6;
double f0,f1,f2,f3,f4,f5,f6;
double fsum_partial;
double uz;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
n = list[idx];
f0 = dist[n];
nread = d_neighborList[n];
f1 = dist[nread];
nread = d_neighborList[n+2*Np];
f3 = dist[nread];
nread = d_neighborList[n+4*Np];
f5 = dist[nread];
nread = d_neighborList[n+Np];
f2 = dist[nread];
nread = d_neighborList[n+3*Np];
f4 = dist[nread];
fsum_partial = f0+f1+f2+f3+f4+f5;
uz = VelocityZ[n];
//...................................................
f6 =(FluxIn+(1.0-0.5/tau)*(f5-uz*fsum_partial))/(1.0-0.5/tau)/(1.0+uz);
// unknown distributions
nr6 = d_neighborList[n+5*Np];
dist[nr6] = f6;
}
}
__global__ void dvc_ScaLBL_D3Q7_AAeven_Ion_Flux_DiffAdvc_BC_z(int *list, double *dist, double FluxIn, double tau, double *VelocityZ, int count, int Np)
{
//NOTE: FluxIn is the inward flux
int idx,n;
double f0,f1,f2,f3,f4,f5,f6;
double fsum_partial;
double uz;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
n = list[idx];
f0 = dist[n];
f1 = dist[2*Np+n];
f2 = dist[1*Np+n];
f3 = dist[4*Np+n];
f4 = dist[3*Np+n];
f6 = dist[5*Np+n];
fsum_partial = f0+f1+f2+f3+f4+f6;
uz = VelocityZ[n];
//...................................................
f5 =(FluxIn+(1.0-0.5/tau)*f6-0.5*uz*fsum_partial/tau)/(1.0-0.5/tau+0.5*uz/tau);
dist[6*Np+n] = f5;
}
}
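// Rearranging the f5 expression in the kernel above gives
//   (1 - 0.5/tau)*(f5 - f6) + (0.5*uz/tau)*(fsum_partial + f5) = FluxIn,
// where fsum_partial + f5 is the sum of all seven distributions at the site.
// This is plain algebra on the kernel's formula, not a statement about how the
// flux is defined elsewhere in the code. The sketch below evaluates both sides
// on the host; the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper mirroring the f5 expression of the DiffAdvc z-inlet kernels.
double ref_flux_diffadvc_f5(double FluxIn, double tau, double uz,
                            double f6, double fsum_partial) {
    double denom = 1.0 - 0.5 / tau + 0.5 * uz / tau;
    assert(denom != 0.0);  // the kernel divides by this quantity without a check
    return (FluxIn + (1.0 - 0.5 / tau) * f6 - 0.5 * uz * fsum_partial / tau) / denom;
}

// Left-hand side of the rearranged relation; equals FluxIn up to rounding
// when f5 comes from ref_flux_diffadvc_f5 with the same arguments.
double ref_flux_diffadvc_lhs(double tau, double uz, double f5,
                             double f6, double fsum_partial) {
    return (1.0 - 0.5 / tau) * (f5 - f6) + 0.5 * uz / tau * (fsum_partial + f5);
}
```

// The denominator vanishes when uz = 1 - 2*tau, which the kernel does not guard against.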
__global__ void dvc_ScaLBL_D3Q7_AAeven_Ion_Flux_DiffAdvc_BC_Z(int *list, double *dist, double FluxIn, double tau, double *VelocityZ, int count, int Np)
{
//NOTE: FluxIn is the inward flux
int idx,n;
double f0,f1,f2,f3,f4,f5,f6;
double fsum_partial;
double uz;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
n = list[idx];
f0 = dist[n];
f1 = dist[2*Np+n];
f2 = dist[1*Np+n];
f3 = dist[4*Np+n];
f4 = dist[3*Np+n];
f5 = dist[6*Np+n];
fsum_partial = f0+f1+f2+f3+f4+f5;
uz = VelocityZ[n];
//...................................................
f6 =(FluxIn+(1.0-0.5/tau)*f5+0.5*uz*fsum_partial/tau)/(1.0-0.5/tau-0.5*uz/tau);
dist[5*Np+n] = f6;
}
}
__global__ void dvc_ScaLBL_D3Q7_AAodd_Ion_Flux_DiffAdvc_BC_z(int *d_neighborList, int *list, double *dist, double FluxIn, double tau, double *VelocityZ, int count, int Np)
{
//NOTE: FluxIn is the inward flux
int idx, n;
int nread,nr5;
double f0,f1,f2,f3,f4,f5,f6;
double fsum_partial;
double uz;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
n = list[idx];
f0 = dist[n];
nread = d_neighborList[n];
f1 = dist[nread];
nread = d_neighborList[n+2*Np];
f3 = dist[nread];
nread = d_neighborList[n+Np];
f2 = dist[nread];
nread = d_neighborList[n+3*Np];
f4 = dist[nread];
nread = d_neighborList[n+5*Np];
f6 = dist[nread];
fsum_partial = f0+f1+f2+f3+f4+f6;
uz = VelocityZ[n];
//...................................................
f5 =(FluxIn+(1.0-0.5/tau)*f6-0.5*uz*fsum_partial/tau)/(1.0-0.5/tau+0.5*uz/tau);
// Unknown distributions
nr5 = d_neighborList[n+4*Np];
dist[nr5] = f5;
}
}
__global__ void dvc_ScaLBL_D3Q7_AAodd_Ion_Flux_DiffAdvc_BC_Z(int *d_neighborList, int *list, double *dist, double FluxIn, double tau, double *VelocityZ, int count, int Np)
{
//NOTE: FluxIn is the inward flux
int idx, n;
int nread,nr6;
double f0,f1,f2,f3,f4,f5,f6;
double fsum_partial;
double uz;
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
n = list[idx];
f0 = dist[n];
nread = d_neighborList[n];
f1 = dist[nread];
nread = d_neighborList[n+2*Np];
f3 = dist[nread];
nread = d_neighborList[n+4*Np];
f5 = dist[nread];
nread = d_neighborList[n+Np];
f2 = dist[nread];
nread = d_neighborList[n+3*Np];
f4 = dist[nread];
fsum_partial = f0+f1+f2+f3+f4+f5;
uz = VelocityZ[n];
//...................................................
f6 =(FluxIn+(1.0-0.5/tau)*f5+0.5*uz*fsum_partial/tau)/(1.0-0.5/tau-0.5*uz/tau);
// unknown distributions
nr6 = d_neighborList[n+5*Np];
dist[nr6] = f6;
}
}
__global__ void dvc_ScaLBL_D3Q7_AAeven_Ion_Flux_DiffAdvcElec_BC_z(int *list, double *dist, double FluxIn, double tau, double *VelocityZ, double *ElectricField_Z,
double Di, double zi, double Vt, int count, int Np)
{
//NOTE: FluxIn is the inward flux
int idx,n;
double f0,f1,f2,f3,f4,f5,f6;
double fsum_partial;
double uz;
double uEPz;//electrochemically induced velocity
double Ez;//electric field
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
n = list[idx];
f0 = dist[n];
f1 = dist[2*Np+n];
f2 = dist[1*Np+n];
f3 = dist[4*Np+n];
f4 = dist[3*Np+n];
f6 = dist[5*Np+n];
fsum_partial = f0+f1+f2+f3+f4+f6;
uz = VelocityZ[n];
Ez = ElectricField_Z[n];
uEPz=zi*Di/Vt*Ez;
//...................................................
f5 =(FluxIn+(1.0-0.5/tau)*f6-(0.5*uz/tau+uEPz)*fsum_partial)/(1.0-0.5/tau+0.5*uz/tau+uEPz);
dist[6*Np+n] = f5;
}
}
__global__ void dvc_ScaLBL_D3Q7_AAeven_Ion_Flux_DiffAdvcElec_BC_Z(int *list, double *dist, double FluxIn, double tau, double *VelocityZ, double *ElectricField_Z,
double Di, double zi, double Vt, int count, int Np)
{
//NOTE: FluxIn is the inward flux
int idx,n;
double f0,f1,f2,f3,f4,f5,f6;
double fsum_partial;
double uz;
double uEPz;//electrochemically induced velocity
double Ez;//electric field
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
n = list[idx];
f0 = dist[n];
f1 = dist[2*Np+n];
f2 = dist[1*Np+n];
f3 = dist[4*Np+n];
f4 = dist[3*Np+n];
f5 = dist[6*Np+n];
fsum_partial = f0+f1+f2+f3+f4+f5;
uz = VelocityZ[n];
Ez = ElectricField_Z[n];
uEPz=zi*Di/Vt*Ez;
//...................................................
f6 =(FluxIn+(1.0-0.5/tau)*f5+(0.5*uz/tau+uEPz)*fsum_partial)/(1.0-0.5/tau-0.5*uz/tau-uEPz);
dist[5*Np+n] = f6;
}
}
__global__ void dvc_ScaLBL_D3Q7_AAodd_Ion_Flux_DiffAdvcElec_BC_z(int *d_neighborList, int *list, double *dist, double FluxIn, double tau, double *VelocityZ, double *ElectricField_Z,
double Di, double zi, double Vt, int count, int Np)
{
//NOTE: FluxIn is the inward flux
int idx, n;
int nread,nr5;
double f0,f1,f2,f3,f4,f5,f6;
double fsum_partial;
double uz;
double uEPz;//electrochemically induced velocity
double Ez;//electric field
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
n = list[idx];
f0 = dist[n];
nread = d_neighborList[n];
f1 = dist[nread];
nread = d_neighborList[n+2*Np];
f3 = dist[nread];
nread = d_neighborList[n+Np];
f2 = dist[nread];
nread = d_neighborList[n+3*Np];
f4 = dist[nread];
nread = d_neighborList[n+5*Np];
f6 = dist[nread];
fsum_partial = f0+f1+f2+f3+f4+f6;
uz = VelocityZ[n];
Ez = ElectricField_Z[n];
uEPz=zi*Di/Vt*Ez;
//...................................................
f5 =(FluxIn+(1.0-0.5/tau)*f6-(0.5*uz/tau+uEPz)*fsum_partial)/(1.0-0.5/tau+0.5*uz/tau+uEPz);
// Unknown distributions
nr5 = d_neighborList[n+4*Np];
dist[nr5] = f5;
}
}
__global__ void dvc_ScaLBL_D3Q7_AAodd_Ion_Flux_DiffAdvcElec_BC_Z(int *d_neighborList, int *list, double *dist, double FluxIn, double tau, double *VelocityZ, double *ElectricField_Z,
double Di, double zi, double Vt, int count, int Np)
{
//NOTE: FluxIn is the inward flux
int idx, n;
int nread,nr6;
double f0,f1,f2,f3,f4,f5,f6;
double fsum_partial;
double uz;
double uEPz;//electrochemically induced velocity
double Ez;//electric field
idx = blockIdx.x*blockDim.x + threadIdx.x;
if (idx < count){
n = list[idx];
f0 = dist[n];
nread = d_neighborList[n];
f1 = dist[nread];
nread = d_neighborList[n+2*Np];
f3 = dist[nread];
nread = d_neighborList[n+4*Np];
f5 = dist[nread];
nread = d_neighborList[n+Np];
f2 = dist[nread];
nread = d_neighborList[n+3*Np];
f4 = dist[nread];
fsum_partial = f0+f1+f2+f3+f4+f5;
uz = VelocityZ[n];
Ez = ElectricField_Z[n];
uEPz=zi*Di/Vt*Ez;
//...................................................
f6 =(FluxIn+(1.0-0.5/tau)*f5+(0.5*uz/tau+uEPz)*fsum_partial)/(1.0-0.5/tau-0.5*uz/tau-uEPz);
// unknown distributions
nr6 = d_neighborList[n+5*Np];
dist[nr6] = f6;
}
}
//*************************************************************************
extern "C" void ScaLBL_Solid_Dirichlet_D3Q7(double *dist, double *BoundaryValue, int *BounceBackDist_list, int *BounceBackSolid_list, int count){
int GRID = count / 512 + 1;
dvc_ScaLBL_Solid_Dirichlet_D3Q7<<<GRID,512>>>(dist, BoundaryValue, BounceBackDist_list, BounceBackSolid_list, count);
CHECK_ERROR("ScaLBL_Solid_Dirichlet_D3Q7");
}
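// Every wrapper in this file sizes its launch as count / 512 + 1 blocks of 512
// threads. That always yields at least one block, and one surplus block when
// count is an exact multiple of 512; the idx < count guard in each kernel keeps
// the surplus threads idle. The sketch below compares this with a ceiling
// division, which is an alternative shown for illustration and is not what the
// source uses. Both helper names are hypothetical.

```cpp
#include <cassert>

// Grid size as computed by the wrappers in this file.
int ref_grid_size(int count, int threads) {
    assert(threads > 0 && count >= 0);
    return count / threads + 1;
}

// Ceiling-division alternative: no surplus block, but zero blocks when count is 0,
// so a caller using it would have to skip the launch for empty lists.
int ref_grid_size_ceil(int count, int threads) {
    assert(threads > 0 && count >= 0);
    return (count + threads - 1) / threads;
}
```
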
extern "C" void ScaLBL_Solid_Neumann_D3Q7(double *dist, double *BoundaryValue, int *BounceBackDist_list, int *BounceBackSolid_list, int count){
int GRID = count / 512 + 1;
dvc_ScaLBL_Solid_Neumann_D3Q7<<<GRID,512>>>(dist, BoundaryValue, BounceBackDist_list, BounceBackSolid_list, count);
CHECK_ERROR("ScaLBL_Solid_Neumann_D3Q7");
}
extern "C" void ScaLBL_Solid_DirichletAndNeumann_D3Q7(double *dist, double *BoundaryValue,int *BoundaryLabel, int *BounceBackDist_list, int *BounceBackSolid_list, int count){
int GRID = count / 512 + 1;
dvc_ScaLBL_Solid_DirichletAndNeumann_D3Q7<<<GRID,512>>>(dist, BoundaryValue, BoundaryLabel, BounceBackDist_list, BounceBackSolid_list, count);
CHECK_ERROR("ScaLBL_Solid_DirichletAndNeumann_D3Q7");
}
extern "C" void ScaLBL_Solid_SlippingVelocityBC_D3Q19(double *dist, double *zeta_potential, double *ElectricField, double *SolidGrad,
double epsilon_LB, double tau, double rho0,double den_scale, double h, double time_conv,
int *BounceBackDist_list, int *BounceBackSolid_list, int *FluidBoundary_list,
double *lattice_weight, float *lattice_cx, float *lattice_cy, float *lattice_cz,
int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_Solid_SlippingVelocityBC_D3Q19<<<GRID,512>>>(dist, zeta_potential, ElectricField, SolidGrad,
epsilon_LB, tau, rho0, den_scale, h, time_conv,
BounceBackDist_list, BounceBackSolid_list, FluidBoundary_list,
lattice_weight, lattice_cx, lattice_cy, lattice_cz,
count, Np);
CHECK_ERROR("ScaLBL_Solid_SlippingVelocityBC_D3Q19");
}
extern "C" void ScaLBL_D3Q7_AAeven_Poisson_Potential_BC_z(int *list, double *dist, double Vin, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAeven_Poisson_Potential_BC_z<<<GRID,512>>>(list, dist, Vin, count, Np);
CHECK_ERROR("ScaLBL_D3Q7_AAeven_Poisson_Potential_BC_z");
}
extern "C" void ScaLBL_D3Q7_AAeven_Poisson_Potential_BC_Z(int *list, double *dist, double Vout, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAeven_Poisson_Potential_BC_Z<<<GRID,512>>>(list, dist, Vout, count, Np);
CHECK_ERROR("ScaLBL_D3Q7_AAeven_Poisson_Potential_BC_Z");
}
extern "C" void ScaLBL_D3Q7_AAodd_Poisson_Potential_BC_z(int *d_neighborList, int *list, double *dist, double Vin, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAodd_Poisson_Potential_BC_z<<<GRID,512>>>(d_neighborList, list, dist, Vin, count, Np);
CHECK_ERROR("ScaLBL_D3Q7_AAodd_Poisson_Potential_BC_z");
}
extern "C" void ScaLBL_D3Q7_AAodd_Poisson_Potential_BC_Z(int *d_neighborList, int *list, double *dist, double Vout, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAodd_Poisson_Potential_BC_Z<<<GRID,512>>>(d_neighborList, list, dist, Vout, count, Np);
CHECK_ERROR("ScaLBL_D3Q7_AAodd_Poisson_Potential_BC_Z");
}
extern "C" void ScaLBL_Poisson_D3Q7_BC_z(int *list, int *Map, double *Psi, double Vin, int count){
int GRID = count / 512 + 1;
dvc_ScaLBL_Poisson_D3Q7_BC_z<<<GRID,512>>>(list, Map, Psi, Vin, count);
CHECK_ERROR("ScaLBL_Poisson_D3Q7_BC_z");
}
extern "C" void ScaLBL_Poisson_D3Q7_BC_Z(int *list, int *Map, double *Psi, double Vout, int count){
int GRID = count / 512 + 1;
dvc_ScaLBL_Poisson_D3Q7_BC_Z<<<GRID,512>>>(list, Map, Psi, Vout, count);
CHECK_ERROR("ScaLBL_Poisson_D3Q7_BC_Z");
}
extern "C" void ScaLBL_D3Q7_AAeven_Ion_Concentration_BC_z(int *list, double *dist, double Cin, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAeven_Ion_Concentration_BC_z<<<GRID,512>>>(list, dist, Cin, count, Np);
CHECK_ERROR("ScaLBL_D3Q7_AAeven_Ion_Concentration_BC_z");
}
extern "C" void ScaLBL_D3Q7_AAeven_Ion_Concentration_BC_Z(int *list, double *dist, double Cout, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAeven_Ion_Concentration_BC_Z<<<GRID,512>>>(list, dist, Cout, count, Np);
CHECK_ERROR("ScaLBL_D3Q7_AAeven_Ion_Concentration_BC_Z");
}
extern "C" void ScaLBL_D3Q7_AAodd_Ion_Concentration_BC_z(int *d_neighborList, int *list, double *dist, double Cin, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAodd_Ion_Concentration_BC_z<<<GRID,512>>>(d_neighborList, list, dist, Cin, count, Np);
CHECK_ERROR("ScaLBL_D3Q7_AAodd_Ion_Concentration_BC_z");
}
extern "C" void ScaLBL_D3Q7_AAodd_Ion_Concentration_BC_Z(int *d_neighborList, int *list, double *dist, double Cout, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAodd_Ion_Concentration_BC_Z<<<GRID,512>>>(d_neighborList, list, dist, Cout, count, Np);
CHECK_ERROR("ScaLBL_D3Q7_AAodd_Ion_Concentration_BC_Z");
}
//------------Diff-----------------
extern "C" void ScaLBL_D3Q7_AAeven_Ion_Flux_Diff_BC_z(int *list, double *dist, double FluxIn, double tau, double *VelocityZ, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAeven_Ion_Flux_Diff_BC_z<<<GRID,512>>>(list, dist, FluxIn, tau, VelocityZ, count, Np);
CHECK_ERROR("ScaLBL_D3Q7_AAeven_Ion_Flux_Diff_BC_z");
}
extern "C" void ScaLBL_D3Q7_AAeven_Ion_Flux_Diff_BC_Z(int *list, double *dist, double FluxIn, double tau, double *VelocityZ, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAeven_Ion_Flux_Diff_BC_Z<<<GRID,512>>>(list, dist, FluxIn, tau, VelocityZ, count, Np);
CHECK_ERROR("ScaLBL_D3Q7_AAeven_Ion_Flux_Diff_BC_Z");
}
extern "C" void ScaLBL_D3Q7_AAodd_Ion_Flux_Diff_BC_z(int *d_neighborList, int *list, double *dist, double FluxIn, double tau, double *VelocityZ, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAodd_Ion_Flux_Diff_BC_z<<<GRID,512>>>(d_neighborList, list, dist, FluxIn, tau, VelocityZ, count, Np);
CHECK_ERROR("ScaLBL_D3Q7_AAodd_Ion_Flux_Diff_BC_z");
}
extern "C" void ScaLBL_D3Q7_AAodd_Ion_Flux_Diff_BC_Z(int *d_neighborList, int *list, double *dist, double FluxIn, double tau, double *VelocityZ, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAodd_Ion_Flux_Diff_BC_Z<<<GRID,512>>>(d_neighborList, list, dist, FluxIn, tau, VelocityZ, count, Np);
CHECK_ERROR("ScaLBL_D3Q7_AAodd_Ion_Flux_Diff_BC_Z");
}
//----------DiffAdvc-------------
extern "C" void ScaLBL_D3Q7_AAeven_Ion_Flux_DiffAdvc_BC_z(int *list, double *dist, double FluxIn, double tau, double *VelocityZ, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAeven_Ion_Flux_DiffAdvc_BC_z<<<GRID,512>>>(list, dist, FluxIn, tau, VelocityZ, count, Np);
CHECK_ERROR("ScaLBL_D3Q7_AAeven_Ion_Flux_DiffAdvc_BC_z");
}
extern "C" void ScaLBL_D3Q7_AAeven_Ion_Flux_DiffAdvc_BC_Z(int *list, double *dist, double FluxIn, double tau, double *VelocityZ, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAeven_Ion_Flux_DiffAdvc_BC_Z<<<GRID,512>>>(list, dist, FluxIn, tau, VelocityZ, count, Np);
CHECK_ERROR("ScaLBL_D3Q7_AAeven_Ion_Flux_DiffAdvc_BC_Z");
}
extern "C" void ScaLBL_D3Q7_AAodd_Ion_Flux_DiffAdvc_BC_z(int *d_neighborList, int *list, double *dist, double FluxIn, double tau, double *VelocityZ, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAodd_Ion_Flux_DiffAdvc_BC_z<<<GRID,512>>>(d_neighborList, list, dist, FluxIn, tau, VelocityZ, count, Np);
CHECK_ERROR("ScaLBL_D3Q7_AAodd_Ion_Flux_DiffAdvc_BC_z");
}
extern "C" void ScaLBL_D3Q7_AAodd_Ion_Flux_DiffAdvc_BC_Z(int *d_neighborList, int *list, double *dist, double FluxIn, double tau, double *VelocityZ, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAodd_Ion_Flux_DiffAdvc_BC_Z<<<GRID,512>>>(d_neighborList, list, dist, FluxIn, tau, VelocityZ, count, Np);
CHECK_ERROR("ScaLBL_D3Q7_AAodd_Ion_Flux_DiffAdvc_BC_Z");
}
//----------DiffAdvcElec-------------
extern "C" void ScaLBL_D3Q7_AAeven_Ion_Flux_DiffAdvcElec_BC_z(int *list, double *dist, double FluxIn, double tau, double *VelocityZ, double *ElectricField_Z,
double Di, double zi, double Vt, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAeven_Ion_Flux_DiffAdvcElec_BC_z<<<GRID,512>>>(list, dist, FluxIn, tau, VelocityZ, ElectricField_Z, Di, zi, Vt, count, Np);
CHECK_ERROR("ScaLBL_D3Q7_AAeven_Ion_Flux_DiffAdvcElec_BC_z");
}
extern "C" void ScaLBL_D3Q7_AAeven_Ion_Flux_DiffAdvcElec_BC_Z(int *list, double *dist, double FluxIn, double tau, double *VelocityZ, double *ElectricField_Z,
double Di, double zi, double Vt, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAeven_Ion_Flux_DiffAdvcElec_BC_Z<<<GRID,512>>>(list, dist, FluxIn, tau, VelocityZ, ElectricField_Z, Di, zi, Vt, count, Np);
CHECK_ERROR("ScaLBL_D3Q7_AAeven_Ion_Flux_DiffAdvcElec_BC_Z");
}
extern "C" void ScaLBL_D3Q7_AAodd_Ion_Flux_DiffAdvcElec_BC_z(int *d_neighborList, int *list, double *dist, double FluxIn, double tau, double *VelocityZ, double *ElectricField_Z,
double Di, double zi, double Vt, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAodd_Ion_Flux_DiffAdvcElec_BC_z<<<GRID,512>>>(d_neighborList, list, dist, FluxIn, tau, VelocityZ, ElectricField_Z, Di, zi, Vt, count, Np);
CHECK_ERROR("ScaLBL_D3Q7_AAodd_Ion_Flux_DiffAdvcElec_BC_z");
}
extern "C" void ScaLBL_D3Q7_AAodd_Ion_Flux_DiffAdvcElec_BC_Z(int *d_neighborList, int *list, double *dist, double FluxIn, double tau, double *VelocityZ, double *ElectricField_Z,
double Di, double zi, double Vt, int count, int Np){
int GRID = count / 512 + 1;
dvc_ScaLBL_D3Q7_AAodd_Ion_Flux_DiffAdvcElec_BC_Z<<<GRID,512>>>(d_neighborList, list, dist, FluxIn, tau, VelocityZ, ElectricField_Z, Di, zi, Vt, count, Np);
CHECK_ERROR("ScaLBL_D3Q7_AAodd_Ion_Flux_DiffAdvcElec_BC_Z");
}
//-------------------------------
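
Review note: every wrapper above sizes its launch as `count / 512 + 1` blocks of 512 threads. The following is a minimal host-side sketch of that arithmetic, not code from the repository. The helper names (`grid_as_written`, `grid_ceil_div`, `covers`) are hypothetical, and the sketch assumes the device kernels, which are not shown in this hunk, skip indices at or beyond `count`.

```cpp
// Hypothetical helpers for illustration only; the wrappers inline this arithmetic.
constexpr int kThreadsPerBlock = 512;

// Grid size as written in the wrappers: count / 512 + 1.
inline int grid_as_written(int count) {
    return count / kThreadsPerBlock + 1;
}

// Exact ceiling division, shown only for comparison.
inline int grid_ceil_div(int count) {
    return (count + kThreadsPerBlock - 1) / kThreadsPerBlock;
}

// True when the launch provides at least one thread per list entry.
inline bool covers(int grid, int count) {
    return static_cast<long long>(grid) * kThreadsPerBlock >= count;
}
```

For example, `grid_as_written(512)` is 2 while `grid_ceil_div(512)` is 1, so the form used here launches one spare block when `count` is a multiple of 512. It also never produces a zero-block grid when `count` is 0.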


@@ -2726,10 +2726,10 @@ __global__ void dvc_ScaLBL_D3Q19_AAeven_FreeLeeModel_Combined(int *Map, double *
}
}
__global__ void dvc_ScaLBL_D3Q7_AAodd_FreeLeeModel_PhaseField(int *neighborList, int *Map, double *hq, double *Den, double *Phi,
__global__ void dvc_ScaLBL_D3Q7_AAodd_FreeLeeModel_PhaseField_alt(int *neighborList, int *Map, double *hq, double *Den, double *Phi,
double rhoA, double rhoB, int start, int finish, int Np){
int idx,nread;
int n,idx,nread;
double fq,phi;
// for (int n=start; n<finish; n++){
@@ -2787,7 +2787,7 @@ __global__ void dvc_ScaLBL_D3Q7_AAodd_FreeLeeModel_PhaseField(int *neighborList,
__global__ void dvc_ScaLBL_D3Q7_AAeven_FreeLeeModel_PhaseField(int *Map, double *hq, double *Den, double *Phi,
double rhoA, double rhoB, int start, int finish, int Np){
int idx;
int n,idx;
double fq,phi;
// for (int n=start; n<finish; n++){
int S = Np/NBLOCKS/NTHREADS + 1;
@@ -2833,7 +2833,6 @@ __global__ void dvc_ScaLBL_D3Q7_AAeven_FreeLeeModel_PhaseField(int *Map, double
idx = Map[n];
Phi[idx] = phi;
}
}
}
__global__ void dvc_ScaLBL_D3Q19_AAodd_FreeLeeModel_SingleFluid_BGK(int *neighborList, double *dist, double *Vel, double *Pressure,
@@ -3396,7 +3395,7 @@ extern "C" void ScaLBL_FreeLeeModel_PhaseField_Init(int *Map, double *Phi, doubl
extern "C" void ScaLBL_D3Q7_AAodd_FreeLee_PhaseField(int *neighborList, int *Map, double *hq, double *Den, double *Phi, double *ColorGrad, double *Vel,
double rhoA, double rhoB, double tauM, double W, int start, int finish, int Np)
{
hipFuncSetCacheConfig((void*)dvc_ScaLBL_D3Q7_AAodd_FreeLee_PhaseField, hipFuncCachePreferL1);
//hipFuncSetCacheConfig((void*)dvc_ScaLBL_D3Q7_AAodd_FreeLee_PhaseField, hipFuncCachePreferL1);
dvc_ScaLBL_D3Q7_AAodd_FreeLee_PhaseField<<<NBLOCKS,NTHREADS >>>(neighborList, Map, hq, Den, Phi, ColorGrad, Vel,
rhoA, rhoB, tauM, W, start, finish, Np);
hipError_t err = hipGetLastError();
@@ -3406,9 +3405,9 @@ extern "C" void ScaLBL_D3Q7_AAodd_FreeLee_PhaseField(int *neighborList, int *Map
}
extern "C" void ScaLBL_D3Q7_AAeven_FreeLee_PhaseField( int *Map, double *hq, double *Den, double *Phi, double *ColorGrad, double *Vel,
double rhoA, double rhoB, double tauM, double W, int start, int finish, int Np){
hipFuncSetCacheConfig((void*)dvc_ScaLBL_D3Q7_AAeven_FreeLee_PhaseField, hipFuncCachePreferL1);
double rhoA, double rhoB, double tauM, double W, int start, int finish, int Np)
{
//hipFuncSetCacheConfig((void*)dvc_ScaLBL_D3Q7_AAeven_FreeLee_PhaseField, hipFuncCachePreferL1);
dvc_ScaLBL_D3Q7_AAeven_FreeLee_PhaseField<<<NBLOCKS,NTHREADS >>>( Map, hq, Den, Phi, ColorGrad, Vel, rhoA, rhoB, tauM, W, start, finish, Np);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
@@ -3419,7 +3418,7 @@ extern "C" void ScaLBL_D3Q7_AAeven_FreeLee_PhaseField( int *Map, double *hq, dou
extern "C" void ScaLBL_D3Q7_ComputePhaseField(int *Map, double *hq, double *Den, double *Phi, double rhoA, double rhoB, int start, int finish, int Np){
hipFuncSetCacheConfig((void*)dvc_ScaLBL_D3Q7_ComputePhaseField, hipFuncCachePreferL1);
//hipFuncSetCacheConfig((void*)dvc_ScaLBL_D3Q7_ComputePhaseField, hipFuncCachePreferL1);
dvc_ScaLBL_D3Q7_ComputePhaseField<<<NBLOCKS,NTHREADS >>>( Map, hq, Den, Phi, rhoA, rhoB, start, finish, Np);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
@@ -3432,7 +3431,7 @@ extern "C" void ScaLBL_D3Q19_AAodd_FreeLeeModel(int *neighborList, int *Map, dou
double rhoA, double rhoB, double tauA, double tauB, double kappa, double beta, double W, double Fx, double Fy, double Fz,
int strideY, int strideZ, int start, int finish, int Np){
hipFuncSetCacheConfig((void*)dvc_ScaLBL_D3Q19_AAodd_FreeLeeModel, hipFuncCachePreferL1);
//hipFuncSetCacheConfig((void*)dvc_ScaLBL_D3Q19_AAodd_FreeLeeModel, hipFuncCachePreferL1);
dvc_ScaLBL_D3Q19_AAodd_FreeLeeModel<<<NBLOCKS,NTHREADS >>>(neighborList, Map, dist, Den, Phi, mu_phi, Vel, Pressure, ColorGrad,
rhoA, rhoB, tauA, tauB, kappa, beta, W, Fx, Fy, Fz, strideY, strideZ, start, finish, Np);
hipError_t err = hipGetLastError();
@@ -3445,7 +3444,7 @@ extern "C" void ScaLBL_D3Q19_AAeven_FreeLeeModel(int *Map, double *dist, double
double rhoA, double rhoB, double tauA, double tauB, double kappa, double beta, double W, double Fx, double Fy, double Fz,
int strideY, int strideZ, int start, int finish, int Np){
hipFuncSetCacheConfig((void*)dvc_ScaLBL_D3Q19_AAeven_FreeLeeModel, hipFuncCachePreferL1);
//hipFuncSetCacheConfig((void*)dvc_ScaLBL_D3Q19_AAeven_FreeLeeModel, hipFuncCachePreferL1);
dvc_ScaLBL_D3Q19_AAeven_FreeLeeModel<<<NBLOCKS,NTHREADS >>>(Map, dist, Den, Phi, mu_phi, Vel, Pressure, ColorGrad,
rhoA, rhoB, tauA, tauB, kappa, beta, W, Fx, Fy, Fz, strideY, strideZ, start, finish, Np);
hipError_t err = hipGetLastError();
@@ -3458,7 +3457,7 @@ extern "C" void ScaLBL_D3Q19_AAeven_FreeLeeModel(int *Map, double *dist, double
extern "C" void ScaLBL_D3Q19_AAeven_FreeLeeModel_Combined(int *Map, double *dist, double *hq, double *Den, double *Phi, double *mu_phi, double *Vel, double *Pressure, double *ColorGrad,
double rhoA, double rhoB, double tauA, double tauB, double tauM, double kappa, double beta, double W, double Fx, double Fy, double Fz,
int strideY, int strideZ, int start, int finish, int Np){
cudaFuncSetCacheConfig(dvc_ScaLBL_D3Q19_AAeven_FreeLeeModel_Combined, cudaFuncCachePreferL1);
//hipFuncSetCacheConfig(dvc_ScaLBL_D3Q19_AAeven_FreeLeeModel_Combined, hipFuncCachePreferL1);
dvc_ScaLBL_D3Q19_AAeven_FreeLeeModel_Combined<<<NBLOCKS,NTHREADS >>>(Map, dist, Den, hq, Phi, mu_phi, Vel, Pressure, ColorGrad,
rhoA, rhoB, tauA, tauB, tauM, kappa, beta, W, Fx, Fy, Fz, strideY, strideZ, start, finish, Np);
hipError_t err = hipGetLastError();
@@ -3471,7 +3470,7 @@ extern "C" void ScaLBL_D3Q19_AAodd_FreeLeeModel_Combined(int *neighborList, int
double rhoA, double rhoB, double tauA, double tauB, double tauM, double kappa, double beta, double W, double Fx, double Fy, double Fz,
int strideY, int strideZ, int start, int finish, int Np){
cudaFuncSetCacheConfig(dvc_ScaLBL_D3Q19_AAodd_FreeLeeModel_Combined, cudaFuncCachePreferL1);
//hipFuncSetCacheConfig(dvc_ScaLBL_D3Q19_AAodd_FreeLeeModel_Combined, hipFuncCachePreferL1);
dvc_ScaLBL_D3Q19_AAodd_FreeLeeModel_Combined<<<NBLOCKS,NTHREADS >>>(neighborList, Map, dist, hq, Den, Phi, mu_phi, Vel, Pressure, ColorGrad,
rhoA, rhoB, tauA, tauB, tauM, kappa, beta, W, Fx, Fy, Fz, strideY, strideZ, start, finish, Np);
hipError_t err = hipGetLastError();
@@ -3482,7 +3481,7 @@ extern "C" void ScaLBL_D3Q19_AAodd_FreeLeeModel_Combined(int *neighborList, int
extern "C" void ScaLBL_D3Q7_AAodd_FreeLeeModel_PhaseField(int *neighborList, int *Map, double *hq, double *Den, double *Phi,
double rhoA, double rhoB, int start, int finish, int Np){
cudaFuncSetCacheConfig(dvc_ScaLBL_D3Q7_AAodd_FreeLeeModel_PhaseField, cudaFuncCachePreferL1);
//hipFuncSetCacheConfig(dvc_ScaLBL_D3Q7_AAodd_FreeLeeModel_PhaseField, hipFuncCachePreferL1);
dvc_ScaLBL_D3Q7_AAodd_FreeLeeModel_PhaseField<<<NBLOCKS,NTHREADS >>>( neighborList, Map, hq, Den, Phi, rhoA, rhoB, start, finish, Np);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
@@ -3492,7 +3491,7 @@ extern "C" void ScaLBL_D3Q7_AAodd_FreeLeeModel_PhaseField(int *neighborList, int
extern "C" void ScaLBL_D3Q7_AAeven_FreeLeeModel_PhaseField(int *Map, double *hq, double *Den, double *Phi, double rhoA, double rhoB, int start, int finish, int Np){
cudaFuncSetCacheConfig(dvc_ScaLBL_D3Q7_AAeven_FreeLeeModel_PhaseField, cudaFuncCachePreferL1);
//hipFuncSetCacheConfig(dvc_ScaLBL_D3Q7_AAeven_FreeLeeModel_PhaseField, hipFuncCachePreferL1);
dvc_ScaLBL_D3Q7_AAeven_FreeLeeModel_PhaseField<<<NBLOCKS,NTHREADS >>>( Map, hq, Den, Phi, rhoA, rhoB, start, finish, Np);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
@@ -3503,7 +3502,7 @@ extern "C" void ScaLBL_D3Q7_AAeven_FreeLeeModel_PhaseField(int *Map, double *hq,
extern "C" void ScaLBL_D3Q19_AAodd_FreeLeeModel_SingleFluid_BGK(int *neighborList, double *dist, double *Vel, double *Pressure,
double tau, double rho0, double Fx, double Fy, double Fz, int start, int finish, int Np){
hipFuncSetCacheConfig((void*)dvc_ScaLBL_D3Q19_AAodd_FreeLeeModel_SingleFluid_BGK, hipFuncCachePreferL1);
//hipFuncSetCacheConfig((void*)dvc_ScaLBL_D3Q19_AAodd_FreeLeeModel_SingleFluid_BGK, hipFuncCachePreferL1);
dvc_ScaLBL_D3Q19_AAodd_FreeLeeModel_SingleFluid_BGK<<<NBLOCKS,NTHREADS >>>(neighborList, dist, Vel, Pressure,
tau, rho0, Fx, Fy, Fz, start, finish, Np);
hipError_t err = hipGetLastError();
@@ -3513,9 +3512,9 @@ extern "C" void ScaLBL_D3Q19_AAodd_FreeLeeModel_SingleFluid_BGK(int *neighborLis
}
extern "C" void ScaLBL_D3Q19_AAeven_FreeLeeModel_SingleFluid_BGK(double *dist, double *Vel, double *Pressure,
double tau, double rho0, double Fx, double Fy, double Fz, int start, int finish, int Np){
hipFuncSetCacheConfig((void*)dvc_ScaLBL_D3Q19_AAeven_FreeLeeModel_SingleFluid_BGK, hipFuncCachePreferL1);
double tau, double rho0, double Fx, double Fy, double Fz, int start, int finish, int Np)
{
//hipFuncSetCacheConfig((void*)dvc_ScaLBL_D3Q19_AAeven_FreeLeeModel_SingleFluid_BGK, hipFuncCachePreferL1);
dvc_ScaLBL_D3Q19_AAeven_FreeLeeModel_SingleFluid_BGK<<<NBLOCKS,NTHREADS >>>(dist, Vel, Pressure,
tau, rho0, Fx, Fy, Fz, start, finish, Np);
hipError_t err = hipGetLastError();


@@ -1,5 +1,6 @@
#include <stdio.h>
#include <math.h>
#include "hip/hip_runtime.h"
#define NBLOCKS 1024
#define NTHREADS 256
@@ -1609,7 +1610,9 @@ __global__ void dvc_ScaLBL_D3Q19_AAodd_GreyscaleColor_CP(int *neighborList, int
Fcpy = ny;
Fcpz = nz;
double Fcp_mag=sqrt(Fcpx*Fcpx+Fcpy*Fcpy+Fcpz*Fcpz);
if (Fcp_mag==0.0); Fcpx=Fcpy=Fcpz=0.0;
if (Fcp_mag==0.0) {
Fcpx=Fcpy=Fcpz=0.0;
}
//NOTE for open node (porosity=1.0),Fcp=0.0
Fcpx *= alpha*W*(1.0-porosity)/sqrt(perm);
Fcpy *= alpha*W*(1.0-porosity)/sqrt(perm);
@@ -2396,7 +2399,9 @@ __global__ void dvc_ScaLBL_D3Q19_AAeven_GreyscaleColor_CP(int *Map, double *dis
Fcpy = ny;
Fcpz = nz;
double Fcp_mag=sqrt(Fcpx*Fcpx+Fcpy*Fcpy+Fcpz*Fcpz);
if (Fcp_mag==0.0); Fcpx=Fcpy=Fcpz=0.0;
if (Fcp_mag==0.0) {
Fcpx=Fcpy=Fcpz=0.0;
}
//NOTE for open node (porosity=1.0),Fcp=0.0
Fcpx *= alpha*W*(1.0-porosity)/sqrt(perm);
Fcpy *= alpha*W*(1.0-porosity)/sqrt(perm);
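
Review note: the two hunks above replace `if (Fcp_mag==0.0); Fcpx=Fcpy=Fcpz=0.0;` with a braced block. A minimal sketch of why this matters, using a hypothetical `Vec3` type and function names that are not in the source: the stray `;` ends the `if` with an empty statement, so the reset ran unconditionally. From the lines shown, that would make the subsequent `Fcpx *= ...` scaling act on zeros for every node.

```cpp
#include <cmath>

// Hypothetical type standing in for the (Fcpx, Fcpy, Fcpz) triple.
struct Vec3 { double x, y, z; };

// Pre-fix behavior: "if (mag == 0.0);" is an if with an empty body,
// so the assignment after it is not guarded and always executes.
inline Vec3 reset_before_fix(Vec3 f) {
    double mag = std::sqrt(f.x * f.x + f.y * f.y + f.z * f.z);
    if (mag == 0.0) { /* empty: what the stray ';' amounts to */ }
    f.x = f.y = f.z = 0.0;
    return f;
}

// Post-fix behavior: the reset only applies to a zero-magnitude vector.
inline Vec3 reset_after_fix(Vec3 f) {
    double mag = std::sqrt(f.x * f.x + f.y * f.y + f.z * f.z);
    if (mag == 0.0) {
        f.x = f.y = f.z = 0.0;
    }
    return f;
}
```

With the input `Vec3{1.0, 2.0, 2.0}`, `reset_before_fix` returns all zeros while `reset_after_fix` returns the vector unchanged.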


@@ -1,422 +0,0 @@
#include <stdio.h>
#include <math.h>
#include "hip/hip_runtime.h"
#define NBLOCKS 1024
#define NTHREADS 256
__global__ void dvc_ScaLBL_D3Q7_AAodd_IonConcentration(int *neighborList, double *dist, double *Den, int start, int finish, int Np){
int n,nread;
double fq,Ci;
int S = Np/NBLOCKS/NTHREADS + 1;
for (int s=0; s<S; s++){
//........Get 1-D index for this thread....................
n = S*blockIdx.x*blockDim.x + s*blockDim.x + threadIdx.x + start;
if (n<finish) {
// q=0
fq = dist[n];
Ci = fq;
// q=1
nread = neighborList[n];
fq = dist[nread];
Ci += fq;
// q=2
nread = neighborList[n+Np];
fq = dist[nread];
Ci += fq;
// q=3
nread = neighborList[n+2*Np];
fq = dist[nread];
Ci += fq;
// q=4
nread = neighborList[n+3*Np];
fq = dist[nread];
Ci += fq;
// q=5
nread = neighborList[n+4*Np];
fq = dist[nread];
Ci += fq;
// q=6
nread = neighborList[n+5*Np];
fq = dist[nread];
Ci += fq;
Den[n]=Ci;
}
}
}
__global__ void dvc_ScaLBL_D3Q7_AAeven_IonConcentration(double *dist, double *Den, int start, int finish, int Np){
int n;
double fq,Ci;
int S = Np/NBLOCKS/NTHREADS + 1;
for (int s=0; s<S; s++){
//........Get 1-D index for this thread....................
n = S*blockIdx.x*blockDim.x + s*blockDim.x + threadIdx.x + start;
if (n<finish) {
// q=0
fq = dist[n];
Ci = fq;
// q=1
fq = dist[2*Np+n];
Ci += fq;
// q=2
fq = dist[1*Np+n];
Ci += fq;
// q=3
fq = dist[4*Np+n];
Ci += fq;
// q=4
fq = dist[3*Np+n];
Ci += fq;
// q=5
fq = dist[6*Np+n];
Ci += fq;
// q=6
fq = dist[5*Np+n];
Ci += fq;
Den[n]=Ci;
}
}
}
__global__ void dvc_ScaLBL_D3Q7_AAodd_Ion(int *neighborList, double *dist, double *Den, double *FluxDiffusive, double *FluxAdvective, double *FluxElectrical, double *Velocity, double *ElectricField,
double Di, int zi, double rlx, double Vt, int start, int finish, int Np){
int n;
double Ci;
double ux,uy,uz;
double uEPx,uEPy,uEPz;//electrochemical induced velocity
double Ex,Ey,Ez;//electrical field
double flux_diffusive_x,flux_diffusive_y,flux_diffusive_z;
double f0,f1,f2,f3,f4,f5,f6;
int nr1,nr2,nr3,nr4,nr5,nr6;
int S = Np/NBLOCKS/NTHREADS + 1;
for (int s=0; s<S; s++){
//........Get 1-D index for this thread....................
n = S*blockIdx.x*blockDim.x + s*blockDim.x + threadIdx.x + start;
if (n<finish) {
//Load data
Ci=Den[n];
Ex=ElectricField[n+0*Np];
Ey=ElectricField[n+1*Np];
Ez=ElectricField[n+2*Np];
ux=Velocity[n+0*Np];
uy=Velocity[n+1*Np];
uz=Velocity[n+2*Np];
uEPx=zi*Di/Vt*Ex;
uEPy=zi*Di/Vt*Ey;
uEPz=zi*Di/Vt*Ez;
// q=0
f0 = dist[n];
// q=1
nr1 = neighborList[n]; // neighbor 2 ( > 10Np => odd part of dist)
f1 = dist[nr1]; // reading the f1 data into register fq
// q=2
nr2 = neighborList[n+Np]; // neighbor 1 ( < 10Np => even part of dist)
f2 = dist[nr2]; // reading the f2 data into register fq
// q=3
nr3 = neighborList[n+2*Np]; // neighbor 4
f3 = dist[nr3];
// q=4
nr4 = neighborList[n+3*Np]; // neighbor 3
f4 = dist[nr4];
// q=5
nr5 = neighborList[n+4*Np];
f5 = dist[nr5];
// q=6
nr6 = neighborList[n+5*Np];
f6 = dist[nr6];
// compute diffusive flux
flux_diffusive_x = (1.0-0.5*rlx)*((f1-f2)-ux*Ci);
flux_diffusive_y = (1.0-0.5*rlx)*((f3-f4)-uy*Ci);
flux_diffusive_z = (1.0-0.5*rlx)*((f5-f6)-uz*Ci);
FluxDiffusive[n+0*Np] = flux_diffusive_x;
FluxDiffusive[n+1*Np] = flux_diffusive_y;
FluxDiffusive[n+2*Np] = flux_diffusive_z;
FluxAdvective[n+0*Np] = ux*Ci;
FluxAdvective[n+1*Np] = uy*Ci;
FluxAdvective[n+2*Np] = uz*Ci;
FluxElectrical[n+0*Np] = uEPx*Ci;
FluxElectrical[n+1*Np] = uEPy*Ci;
FluxElectrical[n+2*Np] = uEPz*Ci;
// q=0
dist[n] = f0*(1.0-rlx)+rlx*0.25*Ci;
//dist[n] = f0*(1.0-rlx)+rlx*0.25*Ci*(1.0 - 2.0*((ux+uEPx)*(ux+uEPx) + (uy+uEPy)*(uy+uEPy) + (uz+uEPz)*(uz+uEPz)));
// q = 1
dist[nr2] = f1*(1.0-rlx) + rlx*0.125*Ci*(1.0+4.0*(ux+uEPx));
//dist[nr2] = f1*(1.0-rlx) + rlx*0.125*Ci*(1.0+4.0*(ux+uEPx)+8.0*(ux+uEPx)*(ux+uEPx)- 2.0*((ux+uEPx)*(ux+uEPx) + (uy+uEPy)*(uy+uEPy) + (uz+uEPz)*(uz+uEPz)));
// q=2
dist[nr1] = f2*(1.0-rlx) + rlx*0.125*Ci*(1.0-4.0*(ux+uEPx));
//dist[nr1] = f2*(1.0-rlx) + rlx*0.125*Ci*(1.0-4.0*(ux+uEPx)+8.0*(ux+uEPx)*(ux+uEPx)- 2.0*((ux+uEPx)*(ux+uEPx) + (uy+uEPy)*(uy+uEPy) + (uz+uEPz)*(uz+uEPz)));
// q = 3
dist[nr4] = f3*(1.0-rlx) + rlx*0.125*Ci*(1.0+4.0*(uy+uEPy));
//dist[nr4] = f3*(1.0-rlx) + rlx*0.125*Ci*(1.0+4.0*(uy+uEPy)+8.0*(uy+uEPy)*(uy+uEPy)- 2.0*((ux+uEPx)*(ux+uEPx) + (uy+uEPy)*(uy+uEPy) + (uz+uEPz)*(uz+uEPz)));
// q = 4
dist[nr3] = f4*(1.0-rlx) + rlx*0.125*Ci*(1.0-4.0*(uy+uEPy));
//dist[nr3] = f4*(1.0-rlx) + rlx*0.125*Ci*(1.0-4.0*(uy+uEPy)+8.0*(uy+uEPy)*(uy+uEPy)- 2.0*((ux+uEPx)*(ux+uEPx) + (uy+uEPy)*(uy+uEPy) + (uz+uEPz)*(uz+uEPz)));
// q = 5
dist[nr6] = f5*(1.0-rlx) + rlx*0.125*Ci*(1.0+4.0*(uz+uEPz));
//dist[nr6] = f5*(1.0-rlx) + rlx*0.125*Ci*(1.0+4.0*(uz+uEPz)+8.0*(uz+uEPz)*(uz+uEPz)- 2.0*((ux+uEPx)*(ux+uEPx) + (uy+uEPy)*(uy+uEPy) + (uz+uEPz)*(uz+uEPz)));
// q = 6
dist[nr5] = f6*(1.0-rlx) + rlx*0.125*Ci*(1.0-4.0*(uz+uEPz));
//dist[nr5] = f6*(1.0-rlx) + rlx*0.125*Ci*(1.0-4.0*(uz+uEPz)+8.0*(uz+uEPz)*(uz+uEPz)- 2.0*((ux+uEPx)*(ux+uEPx) + (uy+uEPy)*(uy+uEPy) + (uz+uEPz)*(uz+uEPz)));
}
}
}
__global__ void dvc_ScaLBL_D3Q7_AAeven_Ion(double *dist, double *Den, double *FluxDiffusive, double *FluxAdvective, double *FluxElectrical, double *Velocity, double *ElectricField,
double Di, int zi, double rlx, double Vt, int start, int finish, int Np){
int n;
double Ci;
double ux,uy,uz;
double uEPx,uEPy,uEPz;//electrochemical induced velocity
double Ex,Ey,Ez;//electrical field
double flux_diffusive_x,flux_diffusive_y,flux_diffusive_z;
double f0,f1,f2,f3,f4,f5,f6;
int S = Np/NBLOCKS/NTHREADS + 1;
for (int s=0; s<S; s++){
//........Get 1-D index for this thread....................
n = S*blockIdx.x*blockDim.x + s*blockDim.x + threadIdx.x + start;
if (n<finish) {
//Load data
Ci=Den[n];
Ex=ElectricField[n+0*Np];
Ey=ElectricField[n+1*Np];
Ez=ElectricField[n+2*Np];
ux=Velocity[n+0*Np];
uy=Velocity[n+1*Np];
uz=Velocity[n+2*Np];
uEPx=zi*Di/Vt*Ex;
uEPy=zi*Di/Vt*Ey;
uEPz=zi*Di/Vt*Ez;
f0 = dist[n];
f1 = dist[2*Np+n];
f2 = dist[1*Np+n];
f3 = dist[4*Np+n];
f4 = dist[3*Np+n];
f5 = dist[6*Np+n];
f6 = dist[5*Np+n];
// compute diffusive flux
flux_diffusive_x = (1.0-0.5*rlx)*((f1-f2)-ux*Ci);
flux_diffusive_y = (1.0-0.5*rlx)*((f3-f4)-uy*Ci);
flux_diffusive_z = (1.0-0.5*rlx)*((f5-f6)-uz*Ci);
FluxDiffusive[n+0*Np] = flux_diffusive_x;
FluxDiffusive[n+1*Np] = flux_diffusive_y;
FluxDiffusive[n+2*Np] = flux_diffusive_z;
FluxAdvective[n+0*Np] = ux*Ci;
FluxAdvective[n+1*Np] = uy*Ci;
FluxAdvective[n+2*Np] = uz*Ci;
FluxElectrical[n+0*Np] = uEPx*Ci;
FluxElectrical[n+1*Np] = uEPy*Ci;
FluxElectrical[n+2*Np] = uEPz*Ci;
// q=0
dist[n] = f0*(1.0-rlx)+rlx*0.25*Ci;
//dist[n] = f0*(1.0-rlx)+rlx*0.25*Ci*(1.0 - 2.0*((ux+uEPx)*(ux+uEPx) + (uy+uEPy)*(uy+uEPy) + (uz+uEPz)*(uz+uEPz)));
// q = 1
dist[1*Np+n] = f1*(1.0-rlx) + rlx*0.125*Ci*(1.0+4.0*(ux+uEPx));
//dist[1*Np+n] = f1*(1.0-rlx) + rlx*0.125*Ci*(1.0+4.0*(ux+uEPx)+8.0*(ux+uEPx)*(ux+uEPx)- 2.0*((ux+uEPx)*(ux+uEPx) + (uy+uEPy)*(uy+uEPy) + (uz+uEPz)*(uz+uEPz)));
// q=2
dist[2*Np+n] = f2*(1.0-rlx) + rlx*0.125*Ci*(1.0-4.0*(ux+uEPx));
//dist[2*Np+n] = f2*(1.0-rlx) + rlx*0.125*Ci*(1.0-4.0*(ux+uEPx)+8.0*(ux+uEPx)*(ux+uEPx)- 2.0*((ux+uEPx)*(ux+uEPx) + (uy+uEPy)*(uy+uEPy) + (uz+uEPz)*(uz+uEPz)));
// q = 3
dist[3*Np+n] = f3*(1.0-rlx) + rlx*0.125*Ci*(1.0+4.0*(uy+uEPy));
//dist[3*Np+n] = f3*(1.0-rlx) + rlx*0.125*Ci*(1.0+4.0*(uy+uEPy)+8.0*(uy+uEPy)*(uy+uEPy)- 2.0*((ux+uEPx)*(ux+uEPx) + (uy+uEPy)*(uy+uEPy) + (uz+uEPz)*(uz+uEPz)));
// q = 4
dist[4*Np+n] = f4*(1.0-rlx) + rlx*0.125*Ci*(1.0-4.0*(uy+uEPy));
//dist[4*Np+n] = f4*(1.0-rlx) + rlx*0.125*Ci*(1.0-4.0*(uy+uEPy)+8.0*(uy+uEPy)*(uy+uEPy)- 2.0*((ux+uEPx)*(ux+uEPx) + (uy+uEPy)*(uy+uEPy) + (uz+uEPz)*(uz+uEPz)));
// q = 5
dist[5*Np+n] = f5*(1.0-rlx) + rlx*0.125*Ci*(1.0+4.0*(uz+uEPz));
//dist[5*Np+n] = f5*(1.0-rlx) + rlx*0.125*Ci*(1.0+4.0*(uz+uEPz)+8.0*(uz+uEPz)*(uz+uEPz)- 2.0*((ux+uEPx)*(ux+uEPx) + (uy+uEPy)*(uy+uEPy) + (uz+uEPz)*(uz+uEPz)));
// q = 6
dist[6*Np+n] = f6*(1.0-rlx) + rlx*0.125*Ci*(1.0-4.0*(uz+uEPz));
//dist[6*Np+n] = f6*(1.0-rlx) + rlx*0.125*Ci*(1.0-4.0*(uz+uEPz)+8.0*(uz+uEPz)*(uz+uEPz)- 2.0*((ux+uEPx)*(ux+uEPx) + (uy+uEPy)*(uy+uEPy) + (uz+uEPz)*(uz+uEPz)));
}
}
}
__global__ void dvc_ScaLBL_D3Q7_Ion_Init(double *dist, double *Den, double DenInit, int Np){
int n;
int S = Np/NBLOCKS/NTHREADS + 1;
for (int s=0; s<S; s++){
//........Get 1-D index for this thread....................
n = S*blockIdx.x*blockDim.x + s*blockDim.x + threadIdx.x;
if (n<Np) {
dist[0*Np+n] = 0.25*DenInit;
dist[1*Np+n] = 0.125*DenInit;
dist[2*Np+n] = 0.125*DenInit;
dist[3*Np+n] = 0.125*DenInit;
dist[4*Np+n] = 0.125*DenInit;
dist[5*Np+n] = 0.125*DenInit;
dist[6*Np+n] = 0.125*DenInit;
Den[n] = DenInit;
}
}
}
__global__ void dvc_ScaLBL_D3Q7_Ion_Init_FromFile(double *dist, double *Den, int Np){
int n;
double DenInit;
int S = Np/NBLOCKS/NTHREADS + 1;
for (int s=0; s<S; s++){
//........Get 1-D index for this thread....................
n = S*blockIdx.x*blockDim.x + s*blockDim.x + threadIdx.x;
if (n<Np) {
DenInit = Den[n];
dist[0*Np+n] = 0.25*DenInit;
dist[1*Np+n] = 0.125*DenInit;
dist[2*Np+n] = 0.125*DenInit;
dist[3*Np+n] = 0.125*DenInit;
dist[4*Np+n] = 0.125*DenInit;
dist[5*Np+n] = 0.125*DenInit;
dist[6*Np+n] = 0.125*DenInit;
}
}
}
__global__ void dvc_ScaLBL_D3Q7_Ion_ChargeDensity(double *Den, double *ChargeDensity, int IonValence, int ion_component, int start, int finish, int Np){
int n;
double Ci;//ion concentration of species i
double CD;//charge density
double CD_tmp;
double F = 96485.0;//Faraday's constant; unit[C/mol]; F=e*Na, where Na is the Avogadro constant
int S = Np/NBLOCKS/NTHREADS + 1;
for (int s=0; s<S; s++){
//........Get 1-D index for this thread....................
n = S*blockIdx.x*blockDim.x + s*blockDim.x + threadIdx.x + start;
if (n<finish) {
Ci = Den[n+ion_component*Np];
CD = ChargeDensity[n];
CD_tmp = F*IonValence*Ci;
ChargeDensity[n] = CD*(ion_component>0) + CD_tmp;
}
}
}
extern "C" void ScaLBL_D3Q7_AAodd_IonConcentration(int *neighborList, double *dist, double *Den, int start, int finish, int Np){
//cudaProfilerStart();
dvc_ScaLBL_D3Q7_AAodd_IonConcentration<<<NBLOCKS,NTHREADS >>>(neighborList,dist,Den,start,finish,Np);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("hip error in ScaLBL_D3Q7_AAodd_IonConcentration: %s \n",hipGetErrorString(err));
}
//cudaProfilerStop();
}
extern "C" void ScaLBL_D3Q7_AAeven_IonConcentration(double *dist, double *Den, int start, int finish, int Np){
//cudaProfilerStart();
dvc_ScaLBL_D3Q7_AAeven_IonConcentration<<<NBLOCKS,NTHREADS >>>(dist,Den,start,finish,Np);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("hip error in ScaLBL_D3Q7_AAeven_IonConcentration: %s \n",hipGetErrorString(err));
}
//cudaProfilerStop();
}
extern "C" void ScaLBL_D3Q7_AAodd_Ion(int *neighborList, double *dist, double *Den, double *FluxDiffusive, double *FluxAdvective, double *FluxElectrical, double *Velocity, double *ElectricField,
double Di, int zi, double rlx, double Vt, int start, int finish, int Np){
//cudaProfilerStart();
dvc_ScaLBL_D3Q7_AAodd_Ion<<<NBLOCKS,NTHREADS >>>(neighborList,dist,Den,FluxDiffusive,FluxAdvective,FluxElectrical,Velocity,ElectricField,Di,zi,rlx,Vt,start,finish,Np);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("hip error in ScaLBL_D3Q7_AAodd_Ion: %s \n",hipGetErrorString(err));
}
//cudaProfilerStop();
}
extern "C" void ScaLBL_D3Q7_AAeven_Ion(double *dist, double *Den, double *FluxDiffusive, double *FluxAdvective, double *FluxElectrical, double *Velocity, double *ElectricField,
double Di, int zi, double rlx, double Vt, int start, int finish, int Np){
//cudaProfilerStart();
dvc_ScaLBL_D3Q7_AAeven_Ion<<<NBLOCKS,NTHREADS >>>(dist,Den,FluxDiffusive,FluxAdvective,FluxElectrical,Velocity,ElectricField,Di,zi,rlx,Vt,start,finish,Np);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("hip error in ScaLBL_D3Q7_AAeven_Ion: %s \n",hipGetErrorString(err));
}
//cudaProfilerStop();
}
extern "C" void ScaLBL_D3Q7_Ion_Init(double *dist, double *Den, double DenInit, int Np){
//cudaProfilerStart();
dvc_ScaLBL_D3Q7_Ion_Init<<<NBLOCKS,NTHREADS >>>(dist,Den,DenInit,Np);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("hip error in ScaLBL_D3Q7_Ion_Init: %s \n",hipGetErrorString(err));
}
//cudaProfilerStop();
}
extern "C" void ScaLBL_D3Q7_Ion_Init_FromFile(double *dist, double *Den, int Np){
//cudaProfilerStart();
dvc_ScaLBL_D3Q7_Ion_Init_FromFile<<<NBLOCKS,NTHREADS >>>(dist,Den,Np);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("hip error in ScaLBL_D3Q7_Ion_Init_FromFile: %s \n",hipGetErrorString(err));
}
//cudaProfilerStop();
}
extern "C" void ScaLBL_D3Q7_Ion_ChargeDensity(double *Den, double *ChargeDensity, int IonValence, int ion_component, int start, int finish, int Np){
//cudaProfilerStart();
dvc_ScaLBL_D3Q7_Ion_ChargeDensity<<<NBLOCKS,NTHREADS >>>(Den,ChargeDensity,IonValence,ion_component,start,finish,Np);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("hip error in ScaLBL_D3Q7_Ion_ChargeDensity: %s \n",hipGetErrorString(err));
}
//cudaProfilerStop();
}

hip/Ion.hip Normal file (969 lines added)

@@ -0,0 +1,969 @@
#include <stdio.h>
#include <math.h>
#include "hip/hip_runtime.h"
#define NBLOCKS 1024
#define NTHREADS 512
extern "C" void Membrane_D3Q19_Unpack(int q, int *list, int *links, int start, int linkCount,
double *recvbuf, double *dist, int N) {
//....................................................................................
// Unpack distribution from the recv buffer
// Distribution q matches Cqx, Cqy, Cqz
// swap rule means that the distributions in recvbuf are OPPOSITE of q
// dist may be even or odd distributions stored by stream layout
//....................................................................................
int n, idx, link;
for (link=0; link<linkCount; link++){
idx = links[start+link];
// Get the value from the list -- note that n is the index from the send (non-local) process
n = list[start + idx];
// unpack the distribution to the proper location
if (!(n < 0))
dist[q * N + n] = recvbuf[start + idx];
}
}
extern "C" void Membrane_D3Q19_Transport(int q, int *list, int *links, double *coef, int start, int offset,
int linkCount, double *recvbuf, double *dist, int N){
//....................................................................................
// Unpack distribution from the recv buffer
// Distribution q matches Cqx, Cqy, Cqz
// swap rule means that the distributions in recvbuf are OPPOSITE of q
// dist may be even or odd distributions stored by stream layout
//....................................................................................
int n, idx, link;
double alpha;
for (link=offset; link<linkCount; link++){
idx = links[start+link]; // link index comes from links, as in Membrane_D3Q19_Unpack
// Get the value from the list -- note that n is the index from the send (non-local) process
n = list[start + idx];
alpha = coef[start + idx];
// unpack the distribution to the proper location
if (!(n < 0))
dist[q * N + n] = alpha*recvbuf[start + idx];
}
}
__global__ void dvc_ScaLBL_D3Q7_Membrane_AssignLinkCoef(int *membrane, int *Map, double *Distance, double *Psi, double *coef,
double Threshold, double MassFractionIn, double MassFractionOut, double ThresholdMassFractionIn, double ThresholdMassFractionOut,
int memLinks, int Nx, int Ny, int Nz, int Np){
int link,iq,ip,nq,np,nqm,npm;
double aq, ap, membranePotential;
/* Interior Links */
int S = memLinks/NBLOCKS/NTHREADS + 1;
for (int s=0; s<S; s++){
//........Get 1-D index for this thread....................
link = S*blockIdx.x*blockDim.x + s*blockDim.x + threadIdx.x;
if (link < memLinks) {
// inside //outside
aq = MassFractionIn; ap = MassFractionOut;
iq = membrane[2*link]; ip = membrane[2*link+1];
nq = iq%Np; np = ip%Np;
nqm = Map[nq]; npm = Map[np]; // strided layout
/* membrane potential for this link */
membranePotential = Psi[nqm] - Psi[npm];
if (membranePotential > Threshold){
aq = ThresholdMassFractionIn; ap = ThresholdMassFractionOut;
}
/* Save the mass transfer coefficients */
coef[2*link] = aq; coef[2*link+1] = ap;
}
}
}
__global__ void dvc_ScaLBL_D3Q7_Membrane_AssignLinkCoef_halo(
const int Cqx, const int Cqy, int const Cqz,
int *Map, double *Distance, double *Psi, double Threshold,
double MassFractionIn, double MassFractionOut, double ThresholdMassFractionIn, double ThresholdMassFractionOut,
int *d3q7_recvlist, int *d3q7_linkList, double *coef, int start, int nlinks, int count,
const int N, const int Nx, const int Ny, const int Nz) {
//....................................................................................
// Unpack distribution from the recv buffer
// Distribution q matches Cqx, Cqy, Cqz
// swap rule means that the distributions in recvbuf are OPPOSITE of q
// dist may be even or odd distributions stored by stream layout
//....................................................................................
int n, idx, nqm, npm, label, i, j, k;
double distanceLocal, distanceNonlocal;
double psiLocal, psiNonlocal, membranePotential;
double ap,aq; // coefficient
/* second enforce custom rule for membrane links */
int S = count/NBLOCKS/NTHREADS + 1;
for (int s=0; s<S; s++){
//........Get 1-D index for this thread....................
idx = S*blockIdx.x*blockDim.x + s*blockDim.x + threadIdx.x;
if (idx < count) {
n = d3q7_recvlist[idx];
label = d3q7_linkList[idx];
ap = 1.0; // regular streaming rule
aq = 1.0;
if (label > 0 && !(n < 0)){
nqm = Map[n];
distanceLocal = Distance[nqm];
psiLocal = Psi[nqm];
// Get the 3-D indices from the send process
k = nqm/(Nx*Ny); j = (nqm-Nx*Ny*k)/Nx; i = nqm-Nx*Ny*k-Nx*j;
// Streaming link the non-local distribution
i -= Cqx; j -= Cqy; k -= Cqz;
npm = k*Nx*Ny + j*Nx + i;
distanceNonlocal = Distance[npm];
psiNonlocal = Psi[npm];
membranePotential = psiLocal - psiNonlocal;
aq = MassFractionIn;
ap = MassFractionOut;
/* link is inside membrane */
if (distanceLocal > 0.0){
if (membranePotential < Threshold*(-1.0)){
ap = MassFractionIn;
aq = MassFractionOut;
}
else {
ap = ThresholdMassFractionIn;
aq = ThresholdMassFractionOut;
}
}
else if (membranePotential > Threshold){
aq = ThresholdMassFractionIn;
ap = ThresholdMassFractionOut;
}
}
coef[2*idx]=aq;
coef[2*idx+1]=ap;
}
}
}
__global__ void dvc_ScaLBL_D3Q7_Membrane_Unpack(int q,
int *d3q7_recvlist, double *recvbuf, int count,
double *dist, int N, double *coef) {
//....................................................................................
// Unpack distribution from the recv buffer
// Distribution q matches Cqx, Cqy, Cqz
// swap rule means that the distributions in recvbuf are OPPOSITE of q
// dist may be even or odd distributions stored by stream layout
//....................................................................................
int n, idx, link;
double fq,fp,fqq,ap,aq; // coefficient
/* second enforce custom rule for membrane links */
int S = count/NBLOCKS/NTHREADS + 1;
for (int s=0; s<S; s++){
//........Get 1-D index for this thread....................
idx = S*blockIdx.x*blockDim.x + s*blockDim.x + threadIdx.x;
if (idx < count){
n = d3q7_recvlist[idx];
// update link based on mass transfer coefficients
if (!(n < 0)){
aq = coef[2*idx];
ap = coef[2*idx+1];
fq = dist[q * N + n];
fp = recvbuf[idx];
fqq = (1-aq)*fq+ap*fp;
dist[q * N + n] = fqq;
}
}
}
}
__global__ void dvc_ScaLBL_D3Q7_Membrane_IonTransport(int *membrane, double *coef,
double *dist, double *Den, int memLinks, int Np){
int link,iq,ip,nq,np;
double aq, ap, fq, fp, fqq, fpp, Cq, Cp;
int S = memLinks/NBLOCKS/NTHREADS + 1;
for (int s=0; s<S; s++){
//........Get 1-D index for this thread....................
link = S*blockIdx.x*blockDim.x + s*blockDim.x + threadIdx.x;
if (link < memLinks){
// inside //outside
aq = coef[2*link]; ap = coef[2*link+1];
iq = membrane[2*link]; ip = membrane[2*link+1];
nq = iq%Np; np = ip%Np;
fq = dist[iq]; fp = dist[ip];
fqq = (1-aq)*fq+ap*fp; fpp = (1-ap)*fp+aq*fq;
Cq = Den[nq]; Cp = Den[np];
Cq += fqq - fq; Cp += fpp - fp;
Den[nq] = Cq; Den[np] = Cp;
dist[iq] = fqq; dist[ip] = fpp;
}
}
}
__global__ void dvc_ScaLBL_D3Q7_AAodd_IonConcentration(int *neighborList, double *dist, double *Den, int start, int finish, int Np){
int n,nread;
double fq,Ci;
int S = Np/NBLOCKS/NTHREADS + 1;
for (int s=0; s<S; s++){
//........Get 1-D index for this thread....................
n = S*blockIdx.x*blockDim.x + s*blockDim.x + threadIdx.x + start;
if (n<finish) {
// q=0
fq = dist[n];
Ci = fq;
// q=1
nread = neighborList[n];
fq = dist[nread];
Ci += fq;
// q=2
nread = neighborList[n+Np];
fq = dist[nread];
Ci += fq;
// q=3
nread = neighborList[n+2*Np];
fq = dist[nread];
Ci += fq;
// q=4
nread = neighborList[n+3*Np];
fq = dist[nread];
Ci += fq;
// q=5
nread = neighborList[n+4*Np];
fq = dist[nread];
Ci += fq;
// q=6
nread = neighborList[n+5*Np];
fq = dist[nread];
Ci += fq;
Den[n]=Ci;
}
}
}
__global__ void dvc_ScaLBL_D3Q7_AAeven_IonConcentration(double *dist, double *Den, int start, int finish, int Np){
int n;
double fq,Ci;
int S = Np/NBLOCKS/NTHREADS + 1;
for (int s=0; s<S; s++){
//........Get 1-D index for this thread....................
n = S*blockIdx.x*blockDim.x + s*blockDim.x + threadIdx.x + start;
if (n<finish) {
// q=0
fq = dist[n];
Ci = fq;
// q=1
fq = dist[2*Np+n];
Ci += fq;
// q=2
fq = dist[1*Np+n];
Ci += fq;
// q=3
fq = dist[4*Np+n];
Ci += fq;
// q=4
fq = dist[3*Np+n];
Ci += fq;
// q=5
fq = dist[6*Np+n];
Ci += fq;
// q=6
fq = dist[5*Np+n];
Ci += fq;
Den[n]=Ci;
}
}
}
__global__ void dvc_ScaLBL_D3Q7_AAodd_Ion(int *neighborList, double *dist, double *Den, double *FluxDiffusive, double *FluxAdvective, double *FluxElectrical, double *Velocity, double *ElectricField,
double Di, int zi, double rlx, double Vt, int start, int finish, int Np){
int n;
double Ci;
double ux,uy,uz;
double uEPx,uEPy,uEPz;//electrochemically induced velocity
double Ex,Ey,Ez;//electric field
double flux_diffusive_x,flux_diffusive_y,flux_diffusive_z;
double f0,f1,f2,f3,f4,f5,f6;
double X,Y,Z,factor_x,factor_y,factor_z;
int nr1,nr2,nr3,nr4,nr5,nr6;
int S = Np/NBLOCKS/NTHREADS + 1;
for (int s=0; s<S; s++){
//........Get 1-D index for this thread....................
n = S*blockIdx.x*blockDim.x + s*blockDim.x + threadIdx.x + start;
if (n<finish) {
//Load data
Ex = ElectricField[n + 0 * Np];
Ey = ElectricField[n + 1 * Np];
Ez = ElectricField[n + 2 * Np];
ux = Velocity[n + 0 * Np];
uy = Velocity[n + 1 * Np];
uz = Velocity[n + 2 * Np];
uEPx = zi * Di / Vt * Ex;
uEPy = zi * Di / Vt * Ey;
uEPz = zi * Di / Vt * Ez;
// q=0
f0 = dist[n];
// q=1
nr1 = neighborList[n]; // neighbor 2 ( > 10Np => odd part of dist)
f1 = dist[nr1]; // reading the f1 data into register fq
// q=2
nr2 = neighborList[n + Np]; // neighbor 1 ( < 10Np => even part of dist)
f2 = dist[nr2]; // reading the f2 data into register fq
// q=3
nr3 = neighborList[n + 2 * Np]; // neighbor 4
f3 = dist[nr3];
// q=4
nr4 = neighborList[n + 3 * Np]; // neighbor 3
f4 = dist[nr4];
// q=5
nr5 = neighborList[n + 4 * Np];
f5 = dist[nr5];
// q=6
nr6 = neighborList[n + 5 * Np];
f6 = dist[nr6];
// compute diffusive flux
Ci = f0 + f1 + f2 + f3 + f4 + f5 + f6;
flux_diffusive_x = (1.0 - 0.5 * rlx) * ((f1 - f2) - ux * Ci);
flux_diffusive_y = (1.0 - 0.5 * rlx) * ((f3 - f4) - uy * Ci);
flux_diffusive_z = (1.0 - 0.5 * rlx) * ((f5 - f6) - uz * Ci);
FluxDiffusive[n + 0 * Np] = flux_diffusive_x;
FluxDiffusive[n + 1 * Np] = flux_diffusive_y;
FluxDiffusive[n + 2 * Np] = flux_diffusive_z;
FluxAdvective[n + 0 * Np] = ux * Ci;
FluxAdvective[n + 1 * Np] = uy * Ci;
FluxAdvective[n + 2 * Np] = uz * Ci;
FluxElectrical[n + 0 * Np] = uEPx * Ci;
FluxElectrical[n + 1 * Np] = uEPy * Ci;
FluxElectrical[n + 2 * Np] = uEPz * Ci;
Den[n] = Ci;
/* use a bounded sigmoid, X/sqrt(1+X*X), to prevent negative distributions */
X = 4.0 * (ux + uEPx);
Y = 4.0 * (uy + uEPy);
Z = 4.0 * (uz + uEPz);
factor_x = X / sqrt(1 + X*X);
factor_y = Y / sqrt(1 + Y*Y);
factor_z = Z / sqrt(1 + Z*Z);
// q=0
dist[n] = f0 * (1.0 - rlx) + rlx * 0.25 * Ci;
// q = 1
dist[nr2] =
f1 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + factor_x);
//f1 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (ux + uEPx));
// q=2
dist[nr1] =
f2 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - factor_x);
//f2 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (ux + uEPx));
// q = 3
dist[nr4] =
f3 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + factor_y );
//f3 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (uy + uEPy));
// q = 4
dist[nr3] =
f4 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - factor_y);
//f4 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (uy + uEPy));
// q = 5
dist[nr6] =
f5 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + factor_z);
//f5 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (uz + uEPz));
// q = 6
dist[nr5] =
f6 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - factor_z);
}
}
}
__global__ void dvc_ScaLBL_D3Q7_AAeven_Ion(double *dist, double *Den, double *FluxDiffusive, double *FluxAdvective, double *FluxElectrical, double *Velocity, double *ElectricField,
double Di, int zi, double rlx, double Vt, int start, int finish, int Np){
int n;
double Ci;
double ux,uy,uz;
double uEPx,uEPy,uEPz;//electrochemically induced velocity
double Ex,Ey,Ez;//electric field
double flux_diffusive_x,flux_diffusive_y,flux_diffusive_z;
double f0,f1,f2,f3,f4,f5,f6;
double X,Y,Z,factor_x,factor_y,factor_z;
int S = Np/NBLOCKS/NTHREADS + 1;
for (int s=0; s<S; s++){
//........Get 1-D index for this thread....................
n = S*blockIdx.x*blockDim.x + s*blockDim.x + threadIdx.x + start;
if (n<finish) {
//Load data
//Ci = Den[n];
Ex = ElectricField[n + 0 * Np];
Ey = ElectricField[n + 1 * Np];
Ez = ElectricField[n + 2 * Np];
ux = Velocity[n + 0 * Np];
uy = Velocity[n + 1 * Np];
uz = Velocity[n + 2 * Np];
uEPx = zi * Di / Vt * Ex;
uEPy = zi * Di / Vt * Ey;
uEPz = zi * Di / Vt * Ez;
f0 = dist[n];
f1 = dist[2 * Np + n];
f2 = dist[1 * Np + n];
f3 = dist[4 * Np + n];
f4 = dist[3 * Np + n];
f5 = dist[6 * Np + n];
f6 = dist[5 * Np + n];
// compute diffusive flux
Ci = f0 + f1 + f2 + f3 + f4 + f5 + f6;
flux_diffusive_x = (1.0 - 0.5 * rlx) * ((f1 - f2) - ux * Ci);
flux_diffusive_y = (1.0 - 0.5 * rlx) * ((f3 - f4) - uy * Ci);
flux_diffusive_z = (1.0 - 0.5 * rlx) * ((f5 - f6) - uz * Ci);
FluxDiffusive[n + 0 * Np] = flux_diffusive_x;
FluxDiffusive[n + 1 * Np] = flux_diffusive_y;
FluxDiffusive[n + 2 * Np] = flux_diffusive_z;
FluxAdvective[n + 0 * Np] = ux * Ci;
FluxAdvective[n + 1 * Np] = uy * Ci;
FluxAdvective[n + 2 * Np] = uz * Ci;
FluxElectrical[n + 0 * Np] = uEPx * Ci;
FluxElectrical[n + 1 * Np] = uEPy * Ci;
FluxElectrical[n + 2 * Np] = uEPz * Ci;
Den[n] = Ci;
/* use a bounded sigmoid, X/sqrt(1+X*X), to prevent negative distributions */
X = 4.0 * (ux + uEPx);
Y = 4.0 * (uy + uEPy);
Z = 4.0 * (uz + uEPz);
factor_x = X / sqrt(1 + X*X);
factor_y = Y / sqrt(1 + Y*Y);
factor_z = Z / sqrt(1 + Z*Z);
// q=0
dist[n] = f0 * (1.0 - rlx) + rlx * 0.25 * Ci;
// q = 1
dist[1 * Np + n] =
f1 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + factor_x);
//f1 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (ux + uEPx));
// q=2
dist[2 * Np + n] =
f2 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - factor_x);
//f2 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (ux + uEPx));
// q = 3
dist[3 * Np + n] =
f3 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + factor_y);
//f3 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (uy + uEPy));
// q = 4
dist[4 * Np + n] =
f4 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - factor_y);
//f4 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (uy + uEPy));
// q = 5
dist[5 * Np + n] =
f5 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + factor_z);
//f5 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (uz + uEPz));
// q = 6
dist[6 * Np + n] =
f6 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - factor_z);
//f6 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (uz + uEPz));
}
}
}
__global__ void dvc_ScaLBL_D3Q7_Ion_Init(double *dist, double *Den, double DenInit, int Np){
int n;
int S = Np/NBLOCKS/NTHREADS + 1;
for (int s=0; s<S; s++){
//........Get 1-D index for this thread....................
n = S*blockIdx.x*blockDim.x + s*blockDim.x + threadIdx.x;
if (n<Np) {
dist[0*Np+n] = 0.25*DenInit;
dist[1*Np+n] = 0.125*DenInit;
dist[2*Np+n] = 0.125*DenInit;
dist[3*Np+n] = 0.125*DenInit;
dist[4*Np+n] = 0.125*DenInit;
dist[5*Np+n] = 0.125*DenInit;
dist[6*Np+n] = 0.125*DenInit;
Den[n] = DenInit;
}
}
}
__global__ void dvc_ScaLBL_D3Q7_Ion_Init_FromFile(double *dist, double *Den, int Np){
int n;
double DenInit;
int S = Np/NBLOCKS/NTHREADS + 1;
for (int s=0; s<S; s++){
//........Get 1-D index for this thread....................
n = S*blockIdx.x*blockDim.x + s*blockDim.x + threadIdx.x;
if (n<Np) {
DenInit = Den[n];
dist[0*Np+n] = 0.25*DenInit;
dist[1*Np+n] = 0.125*DenInit;
dist[2*Np+n] = 0.125*DenInit;
dist[3*Np+n] = 0.125*DenInit;
dist[4*Np+n] = 0.125*DenInit;
dist[5*Np+n] = 0.125*DenInit;
dist[6*Np+n] = 0.125*DenInit;
}
}
}
__global__ void dvc_ScaLBL_D3Q7_Ion_ChargeDensity(double *Den, double *ChargeDensity, double IonValence, int ion_component, int start, int finish, int Np){
int n;
double Ci;//ion concentration of species i
double CD;//charge density
double CD_tmp;
double F = 96485.0;//Faraday's constant; unit[C/mol]; F=e*Na, where Na is the Avogadro constant
int S = Np/NBLOCKS/NTHREADS + 1;
for (int s=0; s<S; s++){
//........Get 1-D index for this thread....................
n = S*blockIdx.x*blockDim.x + s*blockDim.x + threadIdx.x + start;
if (n<finish) {
Ci = Den[n+ion_component*Np];
CD = ChargeDensity[n];
if (ion_component == 0) CD=0.0;
CD_tmp = F*IonValence*Ci;
ChargeDensity[n] = CD + CD_tmp;
}
}
}
__global__ void dvc_ScaLBL_D3Q7_AAodd_Ion_v0(int *neighborList, double *dist,
double *Den, double *FluxDiffusive,
double *FluxAdvective,
double *FluxElectrical, double *Velocity,
double *ElectricField, double Di, int zi,
double rlx, double Vt, int start,
int finish, int Np) {
int n;
double Ci;
double ux, uy, uz;
double uEPx, uEPy, uEPz; //electrochemically induced velocity
double Ex, Ey, Ez; //electric field
double flux_diffusive_x, flux_diffusive_y, flux_diffusive_z;
double f0, f1, f2, f3, f4, f5, f6;
//double X,Y,Z,factor_x, factor_y, factor_z;
int nr1, nr2, nr3, nr4, nr5, nr6;
int S = Np/NBLOCKS/NTHREADS + 1;
for (int s=0; s<S; s++){
//........Get 1-D index for this thread....................
n = S*blockIdx.x*blockDim.x + s*blockDim.x + threadIdx.x + start;
if (n<finish) {
//Load data
Ci = Den[n];
Ex = ElectricField[n + 0 * Np];
Ey = ElectricField[n + 1 * Np];
Ez = ElectricField[n + 2 * Np];
ux = Velocity[n + 0 * Np];
uy = Velocity[n + 1 * Np];
uz = Velocity[n + 2 * Np];
uEPx = zi * Di / Vt * Ex;
uEPy = zi * Di / Vt * Ey;
uEPz = zi * Di / Vt * Ez;
// q=0
f0 = dist[n];
// q=1
nr1 = neighborList[n]; // neighbor 2 ( > 10Np => odd part of dist)
f1 = dist[nr1]; // reading the f1 data into register fq
// q=2
nr2 = neighborList[n + Np]; // neighbor 1 ( < 10Np => even part of dist)
f2 = dist[nr2]; // reading the f2 data into register fq
// q=3
nr3 = neighborList[n + 2 * Np]; // neighbor 4
f3 = dist[nr3];
// q=4
nr4 = neighborList[n + 3 * Np]; // neighbor 3
f4 = dist[nr4];
// q=5
nr5 = neighborList[n + 4 * Np];
f5 = dist[nr5];
// q=6
nr6 = neighborList[n + 5 * Np];
f6 = dist[nr6];
// compute diffusive flux
//Ci = f0 + f1 + f2 + f3 + f4 + f5 + f6;
flux_diffusive_x = (1.0 - 0.5 * rlx) * ((f1 - f2) - ux * Ci);
flux_diffusive_y = (1.0 - 0.5 * rlx) * ((f3 - f4) - uy * Ci);
flux_diffusive_z = (1.0 - 0.5 * rlx) * ((f5 - f6) - uz * Ci);
FluxDiffusive[n + 0 * Np] = flux_diffusive_x;
FluxDiffusive[n + 1 * Np] = flux_diffusive_y;
FluxDiffusive[n + 2 * Np] = flux_diffusive_z;
FluxAdvective[n + 0 * Np] = ux * Ci;
FluxAdvective[n + 1 * Np] = uy * Ci;
FluxAdvective[n + 2 * Np] = uz * Ci;
FluxElectrical[n + 0 * Np] = uEPx * Ci;
FluxElectrical[n + 1 * Np] = uEPy * Ci;
FluxElectrical[n + 2 * Np] = uEPz * Ci;
//Den[n] = Ci;
/* use logistic function to prevent negative distributions*/
//X = 4.0 * (ux + uEPx);
//Y = 4.0 * (uy + uEPy);
//Z = 4.0 * (uz + uEPz);
//factor_x = X / sqrt(1 + X*X);
//factor_y = Y / sqrt(1 + Y*Y);
//factor_z = Z / sqrt(1 + Z*Z);
// q=0
dist[n] = f0 * (1.0 - rlx) + rlx * 0.25 * Ci;
// q = 1
dist[nr2] =
f1 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (ux + uEPx));
// f1 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + factor_x);
// q=2
dist[nr1] =
f2 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (ux + uEPx));
// f2 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - factor_x);
// q = 3
dist[nr4] =
f3 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (uy + uEPy));
// f3 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + factor_y );
// q = 4
dist[nr3] =
f4 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (uy + uEPy));
// f4 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - factor_y);
// q = 5
dist[nr6] =
f5 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (uz + uEPz));
// f5 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + factor_z);
// q = 6
dist[nr5] =
f6 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (uz + uEPz));
// f6 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - factor_z);
}
}
}
__global__ void dvc_ScaLBL_D3Q7_AAeven_Ion_v0(
double *dist, double *Den, double *FluxDiffusive, double *FluxAdvective,
double *FluxElectrical, double *Velocity, double *ElectricField, double Di,
int zi, double rlx, double Vt, int start, int finish, int Np) {
int n;
double Ci;
double ux, uy, uz;
double uEPx, uEPy, uEPz; //electrochemically induced velocity
double Ex, Ey, Ez; //electric field
double flux_diffusive_x, flux_diffusive_y, flux_diffusive_z;
double f0, f1, f2, f3, f4, f5, f6;
//double X,Y,Z, factor_x, factor_y, factor_z;
int S = Np/NBLOCKS/NTHREADS + 1;
for (int s=0; s<S; s++){
//........Get 1-D index for this thread....................
n = S*blockIdx.x*blockDim.x + s*blockDim.x + threadIdx.x + start;
if (n<finish) {
//Load data
Ci = Den[n];
Ex = ElectricField[n + 0 * Np];
Ey = ElectricField[n + 1 * Np];
Ez = ElectricField[n + 2 * Np];
ux = Velocity[n + 0 * Np];
uy = Velocity[n + 1 * Np];
uz = Velocity[n + 2 * Np];
uEPx = zi * Di / Vt * Ex;
uEPy = zi * Di / Vt * Ey;
uEPz = zi * Di / Vt * Ez;
f0 = dist[n];
f1 = dist[2 * Np + n];
f2 = dist[1 * Np + n];
f3 = dist[4 * Np + n];
f4 = dist[3 * Np + n];
f5 = dist[6 * Np + n];
f6 = dist[5 * Np + n];
// compute diffusive flux
//Ci = f0 + f1 + f2 + f3 + f4 + f5 + f6;
flux_diffusive_x = (1.0 - 0.5 * rlx) * ((f1 - f2) - ux * Ci);
flux_diffusive_y = (1.0 - 0.5 * rlx) * ((f3 - f4) - uy * Ci);
flux_diffusive_z = (1.0 - 0.5 * rlx) * ((f5 - f6) - uz * Ci);
FluxDiffusive[n + 0 * Np] = flux_diffusive_x;
FluxDiffusive[n + 1 * Np] = flux_diffusive_y;
FluxDiffusive[n + 2 * Np] = flux_diffusive_z;
FluxAdvective[n + 0 * Np] = ux * Ci;
FluxAdvective[n + 1 * Np] = uy * Ci;
FluxAdvective[n + 2 * Np] = uz * Ci;
FluxElectrical[n + 0 * Np] = uEPx * Ci;
FluxElectrical[n + 1 * Np] = uEPy * Ci;
FluxElectrical[n + 2 * Np] = uEPz * Ci;
//Den[n] = Ci;
/* use logistic function to prevent negative distributions*/
//X = 4.0 * (ux + uEPx);
//Y = 4.0 * (uy + uEPy);
//Z = 4.0 * (uz + uEPz);
//factor_x = X / sqrt(1 + X*X);
//factor_y = Y / sqrt(1 + Y*Y);
//factor_z = Z / sqrt(1 + Z*Z);
// q=0
dist[n] = f0 * (1.0 - rlx) + rlx * 0.25 * Ci;
// q = 1
dist[1 * Np + n] =
f1 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (ux + uEPx));
// f1 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + factor_x);
// q=2
dist[2 * Np + n] =
f2 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (ux + uEPx));
// f2 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - factor_x);
// q = 3
dist[3 * Np + n] =
f3 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (uy + uEPy));
// f3 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + factor_y);
// q = 4
dist[4 * Np + n] =
f4 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (uy + uEPy));
// f4 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - factor_y);
// q = 5
dist[5 * Np + n] =
f5 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + 4.0 * (uz + uEPz));
// f5 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + factor_z);
// q = 6
dist[6 * Np + n] =
f6 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - 4.0 * (uz + uEPz));
// f6 * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - factor_z);
}
}
}
extern "C" void ScaLBL_D3Q7_AAeven_Ion_v0(
double *dist, double *Den, double *FluxDiffusive, double *FluxAdvective,
double *FluxElectrical, double *Velocity, double *ElectricField, double Di,
int zi, double rlx, double Vt, int start, int finish, int Np) {
dvc_ScaLBL_D3Q7_AAeven_Ion_v0<<<NBLOCKS,NTHREADS >>>(dist,
Den, FluxDiffusive, FluxAdvective,
FluxElectrical, Velocity,
ElectricField, Di, zi,
rlx, Vt, start, finish, Np);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("hip error in dvc_ScaLBL_D3Q7_AAeven_Ion_v0: %s \n",hipGetErrorString(err));
}
}
extern "C" void ScaLBL_D3Q7_AAodd_Ion_v0(int *neighborList, double *dist,
double *Den, double *FluxDiffusive,
double *FluxAdvective,
double *FluxElectrical, double *Velocity,
double *ElectricField, double Di, int zi,
double rlx, double Vt, int start,
int finish, int Np) {
dvc_ScaLBL_D3Q7_AAodd_Ion_v0<<<NBLOCKS,NTHREADS >>>(neighborList, dist,
Den, FluxDiffusive, FluxAdvective,
FluxElectrical, Velocity,
ElectricField, Di, zi,
rlx, Vt, start,
finish, Np);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("hip error in dvc_ScaLBL_D3Q7_AAodd_Ion_v0: %s \n",hipGetErrorString(err));
}
}
extern "C" void ScaLBL_D3Q7_AAodd_IonConcentration(int *neighborList, double *dist, double *Den, int start, int finish, int Np){
//cudaProfilerStart();
dvc_ScaLBL_D3Q7_AAodd_IonConcentration<<<NBLOCKS,NTHREADS >>>(neighborList,dist,Den,start,finish,Np);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("hip error in ScaLBL_D3Q7_AAodd_IonConcentration: %s \n",hipGetErrorString(err));
}
//cudaProfilerStop();
}
extern "C" void ScaLBL_D3Q7_AAeven_IonConcentration(double *dist, double *Den, int start, int finish, int Np){
//cudaProfilerStart();
dvc_ScaLBL_D3Q7_AAeven_IonConcentration<<<NBLOCKS,NTHREADS >>>(dist,Den,start,finish,Np);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("hip error in ScaLBL_D3Q7_AAeven_IonConcentration: %s \n",hipGetErrorString(err));
}
//cudaProfilerStop();
}
extern "C" void ScaLBL_D3Q7_AAodd_Ion(int *neighborList, double *dist, double *Den, double *FluxDiffusive, double *FluxAdvective, double *FluxElectrical, double *Velocity, double *ElectricField,
double Di, int zi, double rlx, double Vt, int start, int finish, int Np){
//cudaProfilerStart();
dvc_ScaLBL_D3Q7_AAodd_Ion<<<NBLOCKS,NTHREADS >>>(neighborList,dist,Den,FluxDiffusive,FluxAdvective,FluxElectrical,Velocity,ElectricField,Di,zi,rlx,Vt,start,finish,Np);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("hip error in ScaLBL_D3Q7_AAodd_Ion: %s \n",hipGetErrorString(err));
}
//cudaProfilerStop();
}
extern "C" void ScaLBL_D3Q7_AAeven_Ion(double *dist, double *Den, double *FluxDiffusive, double *FluxAdvective, double *FluxElectrical, double *Velocity, double *ElectricField,
double Di, int zi, double rlx, double Vt, int start, int finish, int Np){
//cudaProfilerStart();
dvc_ScaLBL_D3Q7_AAeven_Ion<<<NBLOCKS,NTHREADS >>>(dist,Den,FluxDiffusive,FluxAdvective,FluxElectrical,Velocity,ElectricField,Di,zi,rlx,Vt,start,finish,Np);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("hip error in ScaLBL_D3Q7_AAeven_Ion: %s \n",hipGetErrorString(err));
}
//cudaProfilerStop();
}
extern "C" void ScaLBL_D3Q7_Ion_Init(double *dist, double *Den, double DenInit, int Np){
//cudaProfilerStart();
dvc_ScaLBL_D3Q7_Ion_Init<<<NBLOCKS,NTHREADS >>>(dist,Den,DenInit,Np);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("hip error in ScaLBL_D3Q7_Ion_Init: %s \n",hipGetErrorString(err));
}
//cudaProfilerStop();
}
extern "C" void ScaLBL_D3Q7_Ion_Init_FromFile(double *dist, double *Den, int Np){
//cudaProfilerStart();
dvc_ScaLBL_D3Q7_Ion_Init_FromFile<<<NBLOCKS,NTHREADS >>>(dist,Den,Np);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("hip error in ScaLBL_D3Q7_Ion_Init_FromFile: %s \n",hipGetErrorString(err));
}
//cudaProfilerStop();
}
extern "C" void ScaLBL_D3Q7_Ion_ChargeDensity(double *Den, double *ChargeDensity, double IonValence, int ion_component, int start, int finish, int Np){
//cudaProfilerStart();
dvc_ScaLBL_D3Q7_Ion_ChargeDensity<<<NBLOCKS,NTHREADS >>>(Den,ChargeDensity,IonValence,ion_component,start,finish,Np);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("hip error in ScaLBL_D3Q7_Ion_ChargeDensity: %s \n",hipGetErrorString(err));
}
//cudaProfilerStop();
}
extern "C" void ScaLBL_D3Q7_Membrane_AssignLinkCoef(int *membrane, int *Map, double *Distance, double *Psi, double *coef,
double Threshold, double MassFractionIn, double MassFractionOut, double ThresholdMassFractionIn, double ThresholdMassFractionOut,
int memLinks, int Nx, int Ny, int Nz, int Np){
dvc_ScaLBL_D3Q7_Membrane_AssignLinkCoef<<<NBLOCKS,NTHREADS >>>(membrane, Map, Distance, Psi, coef,
Threshold, MassFractionIn, MassFractionOut, ThresholdMassFractionIn, ThresholdMassFractionOut,
memLinks, Nx, Ny, Nz, Np);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("hip error in dvc_ScaLBL_D3Q7_Membrane_AssignLinkCoef: %s \n",hipGetErrorString(err));
}
}
extern "C" void ScaLBL_D3Q7_Membrane_AssignLinkCoef_halo(
const int Cqx, const int Cqy, int const Cqz,
int *Map, double *Distance, double *Psi, double Threshold,
double MassFractionIn, double MassFractionOut, double ThresholdMassFractionIn, double ThresholdMassFractionOut,
int *d3q7_recvlist, int *d3q7_linkList, double *coef, int start, int nlinks, int count,
const int N, const int Nx, const int Ny, const int Nz) {
dvc_ScaLBL_D3Q7_Membrane_AssignLinkCoef_halo<<<NBLOCKS,NTHREADS>>>(
Cqx, Cqy, Cqz, Map, Distance, Psi, Threshold,
MassFractionIn, MassFractionOut, ThresholdMassFractionIn, ThresholdMassFractionOut,
d3q7_recvlist, d3q7_linkList, coef, start, nlinks, count, N, Nx, Ny, Nz);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("hip error in dvc_ScaLBL_D3Q7_Membrane_AssignLinkCoef_halo: %s \n",hipGetErrorString(err));
}
}
extern "C" void ScaLBL_D3Q7_Membrane_Unpack(int q,
int *d3q7_recvlist, double *recvbuf, int count,
double *dist, int N, double *coef){
dvc_ScaLBL_D3Q7_Membrane_Unpack<<<NBLOCKS,NTHREADS >>>(q, d3q7_recvlist, recvbuf,count,
dist, N, coef);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("hip error in dvc_ScaLBL_D3Q7_Membrane_Unpack: %s \n",hipGetErrorString(err));
}
}
extern "C" void ScaLBL_D3Q7_Membrane_IonTransport(int *membrane, double *coef,
double *dist, double *Den, int memLinks, int Np){
dvc_ScaLBL_D3Q7_Membrane_IonTransport<<<NBLOCKS,NTHREADS >>>(membrane, coef, dist, Den, memLinks, Np);
hipError_t err = hipGetLastError();
if (hipSuccess != err){
printf("hip error in dvc_ScaLBL_D3Q7_Membrane_IonTransport: %s \n",hipGetErrorString(err));
}
}
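The relaxation step in `dvc_ScaLBL_D3Q7_AAeven_Ion` and `dvc_ScaLBL_D3Q7_AAodd_Ion` replaces the linear term `4.0 * (u + uEP)` with the bounded factor `X / sqrt(1 + X*X)`, which stays strictly inside (-1, 1). Below is a minimal host-side sketch, not part of the commit, that reproduces that arithmetic for a single lattice site. It illustrates two properties: the weights 0.25 and 6 x 0.125 sum to one and the paired factors cancel, so the concentration is conserved, and the equilibrium part cannot go negative however large the drift velocity is. The names `total`, `bounded_factor` and `relax_d3q7` are hypothetical, and the pairing of populations with axes (1,2 for x; 3,4 for y; 5,6 for z) is assumed from the flux expressions in the kernels.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Sum of the seven populations, i.e. the ion concentration Ci at one site.
double total(const std::array<double, 7> &f) {
    double sum = 0.0;
    for (double fq : f) sum += fq;
    return sum;
}

// Bounded replacement for 4*v used by the kernels: X / sqrt(1 + X*X).
double bounded_factor(double v) {
    const double X = 4.0 * v;
    return X / std::sqrt(1.0 + X * X);
}

// Hypothetical single-site reference for the D3Q7 relaxation step.
// vx, vy, vz are the combined drift velocities (u + uEP) along each axis.
std::array<double, 7> relax_d3q7(const std::array<double, 7> &f, double rlx,
                                 double vx, double vy, double vz) {
    const double Ci = total(f);
    const double fx = bounded_factor(vx);
    const double fy = bounded_factor(vy);
    const double fz = bounded_factor(vz);
    std::array<double, 7> out{};
    out[0] = f[0] * (1.0 - rlx) + rlx * 0.25 * Ci;
    out[1] = f[1] * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + fx);
    out[2] = f[2] * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - fx);
    out[3] = f[3] * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + fy);
    out[4] = f[4] * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - fy);
    out[5] = f[5] * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 + fz);
    out[6] = f[6] * (1.0 - rlx) + rlx * 0.125 * Ci * (1.0 - fz);
    return out;
}
```

The sketch omits the streaming step and the flux outputs. With the linear form used by the `_v0` kernels, a drift magnitude above 0.25 would make `1.0 - 4.0 * v` negative, which is the case the bounded factor avoids.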

Some files were not shown because too many files have changed in this diff