mirror of
https://github.com/Gnucash/gnucash.git
synced 2025-02-25 18:55:30 -06:00
initial checkin -- error reporting architecture
git-svn-id: svn+ssh://svn.gnucash.org/repo/gnucash/trunk@6417 57a11ea4-9604-0410-9ed3-97b8803252fd
parent da601a847d
commit 0bfd17442e

src/doc/backend-errors.txt (normal file, 189 lines added)
@@ -0,0 +1,189 @@
Handling Backend Communications Errors
--------------------------------------
Architectural Discussion
December 2001
Proposed/Reviewed, Linas Vepstas, Dave Peticolas

Problem:
--------
What should happen if a serious error occurs in a backend while
GnuCash is being used?  For example, what happens if the connection
to the SQL server is lost, because the SQL server has died and/or
because there is a network problem (unplugged ethernet cable, etc.)?


Discussion:
-----------
There is a set of macros in the Postgres backend that check for
a Postgres error, and completely shut down the connection to the
Postgres server whenever even a minor error occurs.  This is
excessively harsh.  How can we do better?


The "Handle it Automatically in the Backend" idea:
--------------------------------------------------
Detect the error in the backend, and do something 'intelligent'
in the backend, trying to recover from it.  What one does depends on
the actual context (that is, on what is going on in the code at that
point).  In other words, implement automatic session-reconnection in
the backend.

To do this, you can't just handle the errors in the macros (SEND_QUERY,
FINISH_QUERY, etc.) since it depends on the context and on how much work
you've sent to the postgres process so far.  One error that would
be nice to be able to recover from is a simple loss of connection (the
postmaster gets killed and restarted).  This might require one to
'replay' the last few queries.
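
A minimal sketch of what 'replaying' the last few queries could look
like, assuming the backend keeps a small journal of everything sent
since the last acknowledged COMMIT.  All of the names here (ReplayLog,
replay_log_record, counting_send, ...) are made up for illustration;
nothing like this exists in the Postgres backend today.

```c
#include <assert.h>
#include <stddef.h>

#define REPLAY_MAX 16   /* arbitrary cap chosen for this sketch */

/* Hypothetical journal of queries sent since the last COMMIT. */
typedef struct {
    const char *queries[REPLAY_MAX];
    size_t count;
} ReplayLog;

/* Callback that actually ships one query; returns 0 on success. */
typedef int (*SendFn)(const char *query, void *user_data);

/* Remember a query so that it can be re-sent after a reconnect.
 * Returns 0 on success, -1 if the journal is full. */
static int replay_log_record(ReplayLog *log, const char *query)
{
    assert(log != NULL && query != NULL);
    if (log->count >= REPLAY_MAX) return -1;
    log->queries[log->count++] = query;
    return 0;
}

/* Forget everything; call this once a COMMIT has been acknowledged. */
static void replay_log_clear(ReplayLog *log)
{
    log->count = 0;
}

/* Re-send every journaled query in order, stopping at the first
 * failure.  Returns the number of queries re-sent successfully. */
static size_t replay_log_replay(const ReplayLog *log, SendFn send,
                                void *user_data)
{
    size_t i;
    for (i = 0; i < log->count; i++) {
        if (send(log->queries[i], user_data) != 0) break;
    }
    return i;
}

/* Demo sender for the sketch: counts calls and always succeeds. */
static int counting_send(const char *query, void *user_data)
{
    (void) query;
    ++*(int *) user_data;
    return 0;
}
```

The journal only answers "what do I re-send"; it says nothing about
whether the server already applied part of the work, which is exactly
the context problem described above.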


The "Generic Handler, Report it to the User" idea:
--------------------------------------------------
There's a simple, direct thing we should get working first:

Go ahead and close the connection, but then return to the engine
in some nice way, let the engine report the error via the GUI, and then
allow the user to initiate a new session (or maybe try to do it
automatically), and do all this without deleting all the accounts
and transactions.

It's a fair amount of work just to untangle the flow of control
for this case, and leave gnucash in a usable state without having
an open session.

I like this for several reasons:
-- It's generic: it can handle any backend error anywhere in the code.
   You don't have to second-guess based on whether some recent query
   may or may not have completed.
-- I believe that reconnect will be quicker, because you won't need
   to reload piles of accounts and transactions.
-- If the user can't reconnect, then they can always save to a file.
   This can be a double bonus if done right: e.g. user works on laptop,
   saves to file, takes laptop to airport, works off-line, and then
   syncs her changes back up when she goes on-line again.


Discussion:
-----------
> Should the backend try reconnecting first, or just go ahead and
> return an error condition immediately?  If the latter, then the
> current backend error-handling can just stay as it is and the GUI
> code needs to add checks in several places, right?

The backend can try reconnecting automatically.  But let's think through
what this implies, and we'll see it's not that good an idea:

It will need to remember the user's password to reconnect (it currently
drops the passwd as a security precaution).  I don't have an opinion
as to whether it should log the reconnect in the gncSession table.
I don't know if it should try to do a streamlined reconnect -- e.g.
skip checking the version numbers ... but maybe the SQL server was
rebooted (or at least, all users were kicked) precisely because the
version numbers changed?

The problem with automatic reconnect from within the backend is that you
don't know quite where to restart... or rather, you have trouble getting
to the right place to restart.  Take for example

   void
   pgendStoreTransaction (PGBackend *be, Transaction *trans)
   {
      char *bufp;

      /* lock it up so that we store atomically */
      bufp = "BEGIN;\n"
             "LOCK TABLE gncTransaction IN EXCLUSIVE MODE;\n"
             "LOCK TABLE gncEntry IN EXCLUSIVE MODE;\n";
      SEND_QUERY (be,bufp, );
      FINISH_QUERY(be->connection);

      pgendStoreTransactionNoLock (be, trans, TRUE);

      bufp = "COMMIT;\n"
             "NOTIFY gncTransaction;";
      SEND_QUERY (be,bufp, );
      FINISH_QUERY(be->connection);   /* << network error occurs here!!! */
   }

Well, you can't just re-login, and reissue the commit.  You really need
to rewind to the beginning of the subroutine.  How can you do this?

Alternative 1) wrap this routine:

   pgendStoreTransaction (PGBackend *be, Transaction *trans)
   {
      do {
         pgendIfNotLoggedInThenReLogin(be);
         pgendStoreTransactionOnceOnly(be, trans);
      } while (NO_ERROR != pgendGetError());
   }

Well, maybe not an infinite loop; maybe three retries or something.
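
The bounded version of that loop might look roughly like the sketch
below.  It is only an illustration: SketchBackend, sketch_store_once
and friends are stand-ins invented here so the example compiles on its
own, and the "failure" is simulated with a counter rather than a real
Postgres connection.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_STORE_ATTEMPTS 3   /* "three retries or something" */

typedef enum { NO_ERROR = 0, ERR_CONNECTION_LOST } SketchError;

/* Stand-in for the backend; the real PGBackend is far richer. */
typedef struct {
    int logged_in;
    int failures_left;       /* simulation: attempts that will still fail */
    SketchError last_error;
    int attempts;            /* how many store attempts were made */
} SketchBackend;

/* Stand-in for pgendIfNotLoggedInThenReLogin(). */
static void sketch_relogin_if_needed(SketchBackend *be)
{
    if (!be->logged_in) be->logged_in = 1;
}

/* Stand-in for pgendStoreTransactionOnceOnly(): one attempt, which
 * either succeeds or drops the connection and records an error. */
static void sketch_store_once(SketchBackend *be)
{
    be->attempts++;
    if (be->failures_left > 0) {
        be->failures_left--;
        be->logged_in = 0;
        be->last_error = ERR_CONNECTION_LOST;
    } else {
        be->last_error = NO_ERROR;
    }
}

/* Try the whole store up to MAX_STORE_ATTEMPTS times, re-logging-in
 * before each attempt.  Returns NO_ERROR on success, otherwise the
 * error from the final attempt, which the caller must still handle. */
static SketchError sketch_store_with_retry(SketchBackend *be)
{
    int attempt;
    assert(be != NULL);
    for (attempt = 0; attempt < MAX_STORE_ATTEMPTS; attempt++) {
        sketch_relogin_if_needed(be);
        sketch_store_once(be);
        if (be->last_error == NO_ERROR) break;
    }
    return be->last_error;
}
```

Note that the function still has an error return: once the attempts
are used up, somebody above it has to deal with the failure.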

Alternative 2) throw an error, let some much higher layer catch it.

Well, approach 1) seems reasonable... until you think about what happens
if three retries don't cut it: then you have to throw an error
anyway, and hope the higher layer deals with it.  So even if you
implement 1), you *still* have to implement 2) anyway.

So my attitude is to skip doing 1) for now (maybe we can add it later)
and just make sure that when we "throw" the error, it really does behave
like a throw should behave, and short-cuts its way up to where it's
caught.  The catcher should probably be a few strategic places in the
GUI, like wherever an xaccQuery() is issued, and wherever an
xaccTransCommitEdit() is issued (which is hopefully not a lot of
places?).


What's the point of doing 2) cleanly?  Because I suspect that most
network errors won't be automatically recoverable.  Most likely,
either someone tripped over an ethernet cable, or the server crashed,
and you gotta call the sysadmin on the phone, etc.  The goal is not
to crash the client when the network is down, but rather to let the user
continue to work off-line (rather than take a forced coffee break).

Alternately, the user might take a forced coffee break, and 10 minutes
later, manually reconnect and resume work ... without having to
stop & restart gnucash, without having to close and reopen a register,
re-run a report window, etc.  It's the re-opening of the
app that is the major pain in the butt.


How to Report Errors to the GUI
-------------------------------
> How would the engine->GUI error reporting happen?  A direct callback?
> Or having the GUI always check for session errors?

We should use the session error mechanism for reporting these errors.
Note that the API allows a simple 'try-throw-catch' style of error
handling in C.  Because we don't/can't unwind the stack as a true
'throw' would, we need to make sure that when we "throw" the error,
it emulates this as best it can: it short-cuts its way up and out of
the engine, to where it's caught in the GUI.  The catcher should probably
be a few strategic places in the GUI, like wherever an xaccQuery() is
issued, and wherever an xaccTransCommitEdit() is issued.
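
To make the 'try-throw-catch' emulation concrete, here is a small
sketch with three layers: a backend that "throws" by recording an
error on the session, an engine routine that returns early as soon as
an error is pending, and a GUI routine that "catches" by popping the
error.  The Sk* and sk_* names are invented for this example and only
loosely mirror the session error calls; they are not the actual
engine signatures.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { SK_ERR_NONE = 0, SK_ERR_SERVER_GONE } SkError;

/* Stand-in session: just a one-slot error holder. */
typedef struct { SkError last_error; } SkSession;

/* "throw": record the error on the session. */
static void sk_push_error(SkSession *s, SkError err)
{
    s->last_error = err;
}

/* "catch": fetch the pending error and clear it. */
static SkError sk_pop_error(SkSession *s)
{
    SkError err = s->last_error;
    s->last_error = SK_ERR_NONE;
    return err;
}

/* Backend layer: the commit fails if the server is unreachable. */
static void sk_backend_commit(SkSession *s, int server_up)
{
    if (!server_up) sk_push_error(s, SK_ERR_SERVER_GONE);
}

/* Engine layer: after the backend call, bail out at once if an error
 * is pending, so no further work is stacked on top of a failed call.
 * Returns the number of follow-up steps that actually ran. */
static int sk_engine_commit_edit(SkSession *s, int server_up)
{
    int steps = 0;
    sk_backend_commit(s, server_up);
    if (s->last_error != SK_ERR_NONE) return steps;   /* short-cut out */
    steps++;   /* e.g. post-commit bookkeeping would go here */
    return steps;
}

/* GUI layer: the catch site.  Returns the error it caught, which a
 * real GUI would turn into a dialog or an off-line mode switch. */
static SkError sk_gui_commit(SkSession *s, int server_up)
{
    assert(s != NULL);
    sk_engine_commit_edit(s, server_up);
    return sk_pop_error(s);
}
```

The discipline this relies on is that every layer between the thrower
and the catcher checks for a pending error and returns promptly.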

Unfortunately, there are a *lot* of places where these calls are
issued, and therefore, it's a lot of work to modify all of these places
to check for an error condition.  It would simplify things if there
were also a callback mechanism.

Proposal:
Maybe gnc-event.h should be extended to generate events for errors
as well ...

How about this idea:

change gnc_session_push_error() so that it calls
   gnc_engine_generate_event (GUID_of_session, GNC_EVENT_ERROR)

The GUI would register a handler; the handler would call
gnc_session_get_error() to find out the details of the error; and
maybe put a popup on the screen, maybe set some flags so that the
GUI starts working differently...
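
A rough sketch of the shape of that proposal: pushing an error also
notifies a registered handler, and the handler asks the session what
went wrong.  Everything here (EvSession, ev_push_error,
ev_register_error_handler, EvGuiState) is a simplified stand-in made
up for this example; the real event plumbing in gnc-event.h keys off
GUIDs and event types and would look different.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { EV_ERR_NONE = 0, EV_ERR_SERVER_GONE } EvError;

struct EvSession;
typedef void (*EvErrorHandler)(struct EvSession *session, void *user_data);

/* Stand-in session: an error slot plus one registered handler. */
typedef struct EvSession {
    EvError last_error;
    EvErrorHandler handler;
    void *handler_data;
} EvSession;

/* The GUI calls this once, at startup. */
static void ev_register_error_handler(EvSession *s, EvErrorHandler h,
                                      void *data)
{
    s->handler = h;
    s->handler_data = data;
}

/* Handlers call this to find out the details of the error. */
static EvError ev_get_error(const EvSession *s)
{
    return s->last_error;
}

/* Record the error, then generate the "error event" by invoking the
 * registered handler, if there is one. */
static void ev_push_error(EvSession *s, EvError err)
{
    assert(s != NULL);
    s->last_error = err;
    if (s->handler != NULL) s->handler(s, s->handler_data);
}

/* Example GUI-side state: a popup counter and an off-line flag. */
typedef struct {
    int popups_shown;
    int offline_mode;
} EvGuiState;

/* Example handler: show a popup and switch the GUI to off-line mode. */
static void ev_gui_on_error(EvSession *s, void *user_data)
{
    EvGuiState *gui = user_data;
    if (ev_get_error(s) == EV_ERR_SERVER_GONE) {
        gui->popups_shown++;
        gui->offline_mode = 1;
    }
}
```

The handler runs in the middle of whatever engine call hit the error,
so it should limit itself to cheap things like setting flags or
queueing a dialog.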

This would save a *lot* of trouble of having to check the error code
in the zillion places where CommitEdit is called.  Of course, if the
error occurs, then all the code that executes following the CommitEdit
is 'suspect', and is potentially buggy/non-robust in the face of that
error.  Alligators lie here ...


============================== END OF DOCUMENT =====================