Wed May 28 08:51:00 PDT 2008
- Previous message: [Slony1-hackers] Correctly checking to see if there is a live slon...
- Next message: [Slony1-hackers] Correctly checking to see if there is a live slon...
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Christopher Browne <cbbrowne at ca.afilias.info> writes:
> Unfortunately, pg_stat_activity pulls the PIDs from the stats
> collector, so that there is a delay in changes being reported.  (And I
> have seen situations where the stats collector got "blown," in which
> case, this wouldn't report anything even *nearly* correct.)

Actually, forget the concern: pg_stat_activity *is* good enough, for
this purpose.

The *OTHER* reference to pg_listener is a little bit later in the same
function, and it takes place in the context of the following sort of
loop:

  for each node
     get the PID of the slon

  [we run failednode() against each of the nodes...]

  while not done
     for each node
        make sure the slon PID has changed from the one found
        the first time

[Subtext to all of this: If the slon was "dead" during any of that,
then having _no_ PID behaves rather like NULL, where NULL <> NULL, and
the loop can terminate successfully.]

It is fine for these queries to be done based on statistical records
in pg_stat_activity, as there are the following possibilities:

1.  If the stats are up to date, then all is well.

2.  If the stats are falling behind, then we may loop extra times in
    the "while not done" portion of the logic, which is OK.

3.  If the stats collector is broken altogether, then this will loop
    perpetually, until the user does something to (say) restart the
    offending database, which would certainly rectify the situation.

In any case, using the stats collector provides *consistent* results
for all of these scenarios, so it is perfectly fine to use the
previously-suggested query joining pg_nodelock with pg_stat_activity.

I think I'm inclined to add logic to the loop (that isn't there now)
to report at least *something* back if it's encountering problems.
I'm thinking that after 10 iterations, it should start reporting which
nodes it is failing to see restarted.
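
For illustration only, here is a rough Python sketch of the loop
described above, including the proposed "start complaining after 10
iterations" behaviour.  It is not the actual slonik code.  Every name
in it (wait_for_restarts, get_pid, report, report_after, and so on) is
hypothetical, and get_pid stands in for whatever query is run against
pg_stat_activity; it is assumed to return None when no slon backend is
visible.  The max_iterations cap is likewise an assumption added so
that the sketch can be exercised without looping forever.

```python
import time
from typing import Callable, Dict, List, Optional, Sequence


def pid_changed(old: Optional[int], new: Optional[int]) -> bool:
    """True if the slon PID counts as changed.

    Following the subtext above, a missing PID (None) on either side is
    treated as "changed", so a dead slon does not block termination.
    """
    if old is None or new is None:
        return True
    return old != new


def wait_for_restarts(
    nodes: Sequence[int],
    get_pid: Callable[[int], Optional[int]],   # hypothetical PID lookup
    report: Callable[[int, List[int]], None],  # hypothetical reporting hook
    report_after: int = 10,
    poll_interval: float = 1.0,
    max_iterations: Optional[int] = None,      # assumption: optional safety cap
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Loop until every node's slon PID differs from the first one seen.

    Returns the number of passes through the "while not done" loop.
    """
    # First pass: remember the PID seen for each node.
    original: Dict[int, Optional[int]] = {n: get_pid(n) for n in nodes}

    # (In the real function, failednode() runs against each node here.)

    iteration = 0
    while True:
        iteration += 1
        # Nodes whose slon still shows the PID recorded at the start.
        pending = [n for n in nodes if not pid_changed(original[n], get_pid(n))]
        if not pending:
            return iteration
        # Once report_after passes have completed, name the laggards.
        if iteration >= report_after:
            report(iteration, pending)
        if max_iterations is not None and iteration >= max_iterations:
            raise TimeoutError(f"nodes not seen restarted: {pending}")
        sleep(poll_interval)
```

With stale statistics the loop merely takes extra passes (case 2
above); with a broken stats collector and no max_iterations it would
spin indefinitely (case 3), but the report hook would at least say
which nodes it is still waiting on.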
--
(reverse (concatenate 'string "moc.enworbbc" "@" "enworbbc"))
http://www3.sympatico.ca/cbbrowne/rdbms.html
Twice five syllables
Plus seven can't say much but
That's haiku for you.
More information about the Slony1-hackers mailing list