Remove parallel try-catch from prepareTimeStep because of nestedness.

Although not declared as such, prepareTimeStep seems to be an internal
function (despite usage in a test) and hence error control can be done
in code calling it.

There was the following problem with the try-catch approach taken:
The calling site `BlackoilWellModel::assemble` looked like this:

```
OPM_BEGIN_PARALLEL_TRY_CATCH();
{
    if (iterationIdx == 0) {
        calculateExplicitQuantities(local_deferredLogger); // no parallel try-catch
        prepareTimeStep(local_deferredLogger); //includes parallel try-catch
    }
    updateWellControls(local_deferredLogger, /* check group controls */ true);

    // Set the well primary variables based on the value of well solutions
    initPrimaryVariablesEvaluation();

    maybeDoGasLiftOptimize(local_deferredLogger);
    assembleWellEq(dt, local_deferredLogger);
}
OPM_END_PARALLEL_TRY_CATCH_LOG(local_deferredLogger, "assemble() failed: ",
                               terminal_output_);
```

calculateExplicitQuantities had no parallel-try-catch clause inside,
but prepareTimeStep had one.
Unfortunately, calculateExplicitQuantities might throw (on some
processors). In that case non-throwing processors will try to trigger a
collective communication (to check for errors) in
prepareTimeStep. While the one throwing will move to the
OPM_END_PARALLEL_TRY_CATCH_LOG macro at the end and also trigger a different
collective communication. Booom, we have a deadlock.

With this patch there is no (nested parallel)-try-catch clause in the
functions called. (And if an exception is thrown in prepareTimeStep, it
will be logged as being an assemble failure).

The other option would have been to add parallel-try-catch clauses
to all functions called. That would have created a lot more
synchronization points limiting scalability even further.
This commit is contained in:
Markus Blatt 2021-09-23 15:22:48 +02:00
parent 3cda8a2fdb
commit f64230e462

View File

@ -1404,8 +1404,6 @@ namespace Opm {
void
BlackoilWellModel<TypeTag>::
prepareTimeStep(DeferredLogger& deferred_logger)
{
OPM_BEGIN_PARALLEL_TRY_CATCH();
{
for (const auto& well : well_container_) {
const bool old_well_operable = well->isOperable();
@ -1445,9 +1443,6 @@ namespace Opm {
} // end of for (const auto& well : well_container_)
updatePrimaryVariables(deferred_logger);
}
OPM_END_PARALLEL_TRY_CATCH_LOG(deferred_logger, "prepareTimestep() failed: ",
terminal_output_);
}