Remove parallel try-catch from prepareTimeStep because of nestedness.

Although not declared as such, prepareTimeStep seems to be an internal function (despite its use in a test), and hence error control can be done in the code calling it.

There was the following problem with the try-catch approach taken. The call site `BlackoilWellModel::assemble` looked like this:

```
OPM_BEGIN_PARALLEL_TRY_CATCH();
{
    if (iterationIdx == 0) {
        calculateExplicitQuantities(local_deferredLogger); // no parallel try-catch
        prepareTimeStep(local_deferredLogger); // includes parallel try-catch
    }
    updateWellControls(local_deferredLogger, /* check group controls */ true);

    // Set the well primary variables based on the value of well solutions
    initPrimaryVariablesEvaluation();

    maybeDoGasLiftOptimize(local_deferredLogger);
    assembleWellEq(dt, local_deferredLogger);
}
OPM_END_PARALLEL_TRY_CATCH_LOG(local_deferredLogger, "assemble() failed: ", terminal_output_);
```

calculateExplicitQuantities had no parallel try-catch clause inside, but prepareTimeStep had one. Unfortunately, calculateExplicitQuantities might throw (on some processors). In that case the non-throwing processors will try to trigger a collective communication (to check for errors) in prepareTimeStep, while the throwing one will jump to the OPM_END_PARALLEL_TRY_CATCH_LOG macro at the end and trigger a different collective communication. Boom, we have a deadlock.

With this patch there is no nested parallel try-catch clause in the functions called. (If an exception is thrown in prepareTimeStep, it will be logged as an assemble failure.)

The other option would have been to add parallel try-catch clauses to all functions called. That would have created many more synchronization points, limiting scalability even further.
2025-02-25 18:55:30 -06:00 · 2021-09-23 15:22:48 +02:00 · 2021-09-23 15:22:48 +02:00 · f64230e462
commit f64230e462
parent 3cda8a2fdb
1 changed files with 30 additions and 35 deletions
--- a/opm/simulators/wells/BlackoilWellModel_impl.hpp
+++ b/opm/simulators/wells/BlackoilWellModel_impl.hpp
@@ -1405,48 +1405,43 @@ namespace Opm {
    BlackoilWellModel<TypeTag>::
    prepareTimeStep(DeferredLogger& deferred_logger)
    {
-        OPM_BEGIN_PARALLEL_TRY_CATCH();
-        {
-            for (const auto& well : well_container_) {
-                const bool old_well_operable = well->isOperable();
-                well->checkWellOperability(ebosSimulator_, this->wellState(), deferred_logger);
+        for (const auto& well : well_container_) {
+            const bool old_well_operable = well->isOperable();
+            well->checkWellOperability(ebosSimulator_, this->wellState(), deferred_logger);

-                if (!well->isOperable() ) continue;
+            if (!well->isOperable() ) continue;

-                auto& events = this->wellState().well(well->indexOfWell()).events;
-                if (events.hasEvent(WellState::event_mask)) {
-                    well->updateWellStateWithTarget(ebosSimulator_, this->groupState(), this->wellState(), deferred_logger);
-                    // There is no new well control change input within a report step,
-                    // so next time step, the well does not consider to have effective events anymore.
-                    events.clearEvent(WellState::event_mask);
-                }
+            auto& events = this->wellState().well(well->indexOfWell()).events;
+            if (events.hasEvent(WellState::event_mask)) {
+                well->updateWellStateWithTarget(ebosSimulator_, this->groupState(), this->wellState(), deferred_logger);
+                // There is no new well control change input within a report step,
+                // so next time step, the well does not consider to have effective events anymore.
+                events.clearEvent(WellState::event_mask);
+            }

-                // solve the well equation initially to improve the initial solution of the well model
-                if (param_.solve_welleq_initially_) {
-                    well->solveWellEquation(ebosSimulator_, this->wellState(), this->groupState(), deferred_logger);
-                }
+            // solve the well equation initially to improve the initial solution of the well model
+            if (param_.solve_welleq_initially_) {
+                well->solveWellEquation(ebosSimulator_, this->wellState(), this->groupState(), deferred_logger);
+            }

-                const bool well_operable = well->isOperable();
-                if (!well_operable && old_well_operable) {
-                    const Well& well_ecl = getWellEcl(well->name());
-                    if (well_ecl.getAutomaticShutIn()) {
-                        deferred_logger.info(" well " + well->name() + " gets SHUT at the beginning of the time step ");
-                    } else {
-                        if (!well->wellIsStopped()) {
-                            deferred_logger.info(" well " + well->name() + " gets STOPPED at the beginning of the time step ");
-                            well->stopWell();
-                        }
+            const bool well_operable = well->isOperable();
+            if (!well_operable && old_well_operable) {
+                const Well& well_ecl = getWellEcl(well->name());
+                if (well_ecl.getAutomaticShutIn()) {
+                    deferred_logger.info(" well " + well->name() + " gets SHUT at the beginning of the time step ");
+                } else {
+                    if (!well->wellIsStopped()) {
+                        deferred_logger.info(" well " + well->name() + " gets STOPPED at the beginning of the time step ");
+                        well->stopWell();
                    }
-                } else if (well_operable && !old_well_operable) {
-                    deferred_logger.info(" well " + well->name() + " gets REVIVED at the beginning of the time step ");
-                    well->openWell();
                }
+            } else if (well_operable && !old_well_operable) {
+                deferred_logger.info(" well " + well->name() + " gets REVIVED at the beginning of the time step ");
+                well->openWell();
+            }

-             }  // end of for (const auto& well : well_container_)
-            updatePrimaryVariables(deferred_logger);
-        }
-        OPM_END_PARALLEL_TRY_CATCH_LOG(deferred_logger, "prepareTimestep() failed: ",
-                                       terminal_output_);
+        }  // end of for (const auto& well : well_container_)
+        updatePrimaryVariables(deferred_logger);
    }