This commit adds support for loading a three-column NLDD
partitioning scheme of the form
MPI_Rank Cartesian_Index NLDD_Domain_ID
from a text file. In an MPI run it is assumed that the first column
holds integers in the range 0..MPI_Size()-1, and typically that each
such integer is listed at least once. In a sequential run, the MPI
rank column is ignored.
With this scheme we can load the same partition files that we write,
for increased repeatability and determinism, and we can also
experiment with externally generated NLDD partitions.
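A minimal sketch of a reader for this format, assuming the file is whitespace-separated as shown above (names and structure here are illustrative, not the actual OPM loader):
```
// Illustrative sketch only: read the three-column partition format.
#include <fstream>
#include <string>
#include <tuple>
#include <vector>

std::vector<std::tuple<int, int, int>>
readPartitionFile(const std::string& filename)
{
    std::vector<std::tuple<int, int, int>> rows;
    std::ifstream in(filename);

    int rank = 0, cartIdx = 0, domain = 0;
    while (in >> rank >> cartIdx >> domain) {
        // In an MPI run 'rank' is expected to lie in 0..MPI_Size()-1;
        // in a sequential run this column is simply ignored.
        rows.emplace_back(rank, cartIdx, domain);
    }

    return rows;
}
```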
This commit adds a new (hidden) debugging option,
DebugEmitCellPartition (--debug-emit-cell-partition)
which, when set, will cause each rank to write a three-column text
file of the form
MPI_Rank Cartesian_Index NLDD_Domain_ID
into the directory
partition/CaseName
of the run's output directory. That file will be named according to
the process' MPI rank, so the first column will be the same as the
file name.
The option is primarily intended for debugging the NLDD partitioning
scheme, so it is mostly reserved for runs with a small number of MPI
processes (e.g., fewer than 20).
While here, also make the MPIPartitionFromFile helper class aware of
this format so that we can use concatenated output files as an input
to the MPI partitioning algorithm for repeatability.
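For illustration only, the per-rank emit step could look roughly like the sketch below; the directory layout and column order follow the description above, while the function name and data layout are assumptions:
```
// Illustrative sketch: write this rank's cells as "rank cartIdx domainId" lines.
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <string>
#include <vector>

void emitCellPartition(const std::filesystem::path& outputDir,
                       const std::string&           caseName,
                       const int                    myRank,
                       const std::vector<int>&      cartesianIndex, // one per local cell
                       const std::vector<int>&      domainId)       // one per local cell
{
    const auto dir = outputDir / "partition" / caseName;
    std::filesystem::create_directories(dir);

    // The file is named after the MPI rank, so the first column of every
    // line repeats the file name.
    std::ofstream out(dir / std::to_string(myRank));
    for (std::size_t cell = 0; cell < domainId.size(); ++cell) {
        out << myRank << ' ' << cartesianIndex[cell] << ' ' << domainId[cell] << '\n';
    }
}
```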
This commit introduces new, experimental support for loading a
partitioning of the cells from a text file. The name of the file is
passed into the simulator using the new, hidden, command line option
--external-partition=filename
and we perform some basic checking that the number of elements in the
partition matches the number of cells in the CpGrid object.
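The basic checking amounts to a size comparison of the loaded partition vector against the grid; a sketch with assumed names:
```
// Illustrative sketch of the size check for an externally supplied partition.
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

void checkExternalPartition(const std::vector<int>& partition,
                            const std::size_t       numGridCells,
                            const std::string&      filename)
{
    if (partition.size() != numGridCells) {
        throw std::invalid_argument {
            "External partition in '" + filename + "' has "
            + std::to_string(partition.size())
            + " entries, but the grid has "
            + std::to_string(numGridCells) + " cells."
        };
    }
}
```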
This commit switches the current implementation of
'partitionCellsZoltan()', i.e., 'partitionCells("zoltan", ...)', over
to using the MPI-aware ParallelNLDDPartitioningZoltan utility. In
doing so we make 'partitionCellsZoltan()' private since its
availability is not guaranteed. We also slightly reorder the
parameters, switch from passing a "Grid" to passing a "GridView" as
an argument to partitionCells(), and specialise this function for the
known grid views in OPM Flow.
We extract the Zoltan-related parameters into an Entity-dependent
helper structure and move the complexity of forming this type to a
new helper function, BlackoilModelEbosNldd::partitionCells().
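As a rough, purely illustrative sketch of the resulting shape (all type and function names below are stand-ins, not the actual OPM Flow interfaces): the Zoltan settings travel in a small entity-dependent helper struct, and the partitioning entry point takes a grid view.
```
// Purely illustrative stand-ins -- not the real OPM Flow API.
#include <functional>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

struct DummyEntity {};    // stands in for a grid element type
struct DummyGridView      // stands in for a Dune grid view
{
    int numCells = 8;
};

// Entity-dependent bundle of Zoltan-related settings.
template <class Entity>
struct PartitionControl
{
    double imbalanceTol = 1.1;
    std::function<int(const Entity&)> globalIndex;
};

// Partitioning entry point taking a GridView rather than a Grid; returns
// the cell-to-domain mapping and the number of domains.
template <class GridView, class Entity>
std::pair<std::vector<int>, int>
partitionCells(const std::string& /* method, e.g. "zoltan" */,
               const int                       numDomains,
               const GridView&                 gridView,
               const PartitionControl<Entity>& /* ctrl */)
{
    // Trivial round-robin assignment standing in for the real partitioner.
    std::vector<int> cellToDomain(gridView.numCells);
    for (int c = 0; c < gridView.numCells; ++c) {
        cellToDomain[c] = c % numDomains;
    }
    return { cellToDomain, numDomains };
}

int main()
{
    const DummyGridView gv{};
    const PartitionControl<DummyEntity> ctrl{};
    const auto [part, n] = partitionCells("zoltan", 3, gv, ctrl);
    std::cout << "domains: " << n << ", cells: " << part.size() << '\n';
}
```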
This commit adds a new flag data member,
wellStructureChangedDynamically_
to the generic black-oil well model. This flag captures the
well_structure_changed
value from the 'SimulatorUpdate' structure in the updateEclWells()
member function. Then, in BlackoilWellModel::beginTimeStep(), we
key a well structure update off this flag when set. This, in turn,
enables creating or opening wells as a result of an ACTIONX block
updating the structure in the middle of a report step.
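A minimal sketch of the flag pattern (the member names follow the commit text; the surrounding classes are heavily simplified stand-ins, not the real well model):
```
// Simplified stand-ins illustrating the dynamic well-structure flag.
struct SimulatorUpdateSketch
{
    bool well_structure_changed = false;
};

class WellModelSketch
{
public:
    void updateEclWells(const SimulatorUpdateSketch& update)
    {
        // Remember that an ACTIONX block changed the well structure in the
        // middle of the current report step.
        wellStructureChangedDynamically_ = update.well_structure_changed;
    }

    void beginTimeStep()
    {
        if (wellStructureChangedDynamically_) {
            // Create or open the affected wells here (omitted), then reset
            // the flag so the update happens only once.
            wellStructureChangedDynamically_ = false;
        }
    }

private:
    bool wellStructureChangedDynamically_ = false;
};
```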
The NLDD solver would always take one global non-linear (Newton)
iteration before starting the local non-linear iterations. This
commit introduces a new command-line parameter,
--nldd-num-initial-newton-iter (NlddNumInitialNewtonIter)
which allows the user to configure this value at runtime. The
default value, 1, preserves the current behaviour.
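For example, a run started with `--nldd-num-initial-newton-iter=3` would perform three global Newton iterations before the local NLDD iterations begin, while leaving the option at its default of 1 keeps the behaviour described above.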
When we do the local solve for the well equations, the control/status
will be updated during the iteration process, so that a converged
well gets the correct control/status with respect to the current
reservoir state.
Various changes in other parts of the code were made to make this
function work as intended.
Moved Damaris command-line parameter accessors to Main.cpp and fixed support for Python and ParaView Python scripts; the two cannot both be present in the same simulation (this appears to be an initialization conflict or a double initialization).
Added access to the DUNE mesh geometry and passing the data through to Damaris.
Updated the command line so users can specify Python or ParaView script names and other parameters that control Damaris:
- simulation name
- number of dedicated cores or dedicated nodes
- shared memory region size
- a switch to turn off HDF5 output
- Damaris logging level
Instead of unconditionally issuing MPI_Abort when we encounter a fatal
exception, we try to test whether all processes have experienced this
exception and, if this is the case, just terminate normally with an
exit code that signals an error. We still use MPI_Abort if not all
processes get an exception, as this is the only way to make sure that
the program aborts.
This approach also works around issues in some MPI implementations
that might not correctly return the error.
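A simplified sketch of the idea, assuming every rank reaches the check (for instance when leaving the main solver loop); the real code must also cope with ranks that never get there, which is exactly why MPI_Abort remains the fallback:
```
// Illustrative sketch: agree on whether all ranks failed before deciding
// between a clean exit and MPI_Abort.
#include <mpi.h>
#include <cstdlib>

void handleFatalException(const bool thisRankFailed)
{
    int localFailed = thisRankFailed ? 1 : 0;
    int everyRankFailed = 0;
    MPI_Allreduce(&localFailed, &everyRankFailed, 1, MPI_INT, MPI_MIN,
                  MPI_COMM_WORLD);

    if (everyRankFailed) {
        // All ranks saw the fatal exception: shut MPI down cleanly and
        // signal the failure through the exit code instead of MPI_Abort.
        MPI_Finalize();
        std::exit(EXIT_FAILURE);
    }

    if (thisRankFailed) {
        // Only some ranks failed; aborting is the only way to make sure
        // the whole program stops.
        MPI_Abort(MPI_COMM_WORLD, EXIT_FAILURE);
    }
    // Ranks without an exception simply continue.
}
```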
Multiple messages like this are gone now:
```
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD
with errorcode 1.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
[smaug.dr-blatt.de:129359] 1 more process has sent help message help-mpi-api.txt / mpi-abort
[smaug.dr-blatt.de:129359] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
```
But we still see something like this:
```
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[35057,1],0]
Exit code: 1
--------------------------------------------------------------------------
```
This changes messages like
```
Program threw an exception: [/home/mblatt/src/dune/opm/opm-simulators/opm/simulators/timestepping/AdaptiveTimeSteppingEbos.hpp:586] Solver failed to converge after cutting timestep 11 times.
```
to
```
Simulation aborted: Solver failed to converge after cutting timestep 11 times.
```
which seems more user-friendly.
There is a strange interaction when using MPI and OpenMP on some
hardware/MPI implementations. In a serial run omp_get_num_procs()
would return the number of processors, but when started under mpirun
it would always return 1.
With this change we now allow users to use any number of threads.
While we reported that we used the number of threads that were passed
on the command line, we never really used it for OpenMP but always
stuck to two unless the environment variable OMP_NUM_THREADS was set.
Note that the ThreadManager in opm-models would always use the
command-line option, and hence the linearizer would use that number
of threads.
Please note that the only use of OpenMP in opm-common (volume
calculation in EclipseGrid) is not affected by this as it happens
before we set the number of OpenMP threads.
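A rough sketch of the intended selection logic (illustrative only, not the actual OPM code): the command-line value wins, the environment variable is the fallback, and only then does the small default apply.
```
// Illustrative sketch of choosing the OpenMP thread count.
#include <cstdlib>
#include <string>
#ifdef _OPENMP
#include <omp.h>
#endif

void setupThreads(const int threadsFromCommandLine)
{
#ifdef _OPENMP
    int numThreads = 2; // conservative default
    if (threadsFromCommandLine > 0) {
        numThreads = threadsFromCommandLine;
    } else if (const char* env = std::getenv("OMP_NUM_THREADS")) {
        numThreads = std::stoi(env);
    }
    omp_set_num_threads(numThreads);
#else
    (void)threadsFromCommandLine;
#endif
}
```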
Following commits f4f8c033d and c7016854d (PR #4286), the "vanguard"
no longer maintains an internal Deck data member. Don't pretend
that it does by providing member functions to access that data
member. If someone tries to actually call these member functions
then they will get an unfriendly diagnostic message and a build
failure.