Thu Jul 31 11:30:49 PDT 2008
- Previous message: [Slony1-general] how to do maintenance on a slave server?
- Next message: [Slony1-general] Strange thing happens after switchover
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Alan Hodgson <ahodgson at simkin.ca> writes:
> On Wednesday 30 July 2008, "Stefan Murphy" <stefan at vocalocity.com> wrote:
>> After doing this work we found errors in the slave's log. Inserts
>> failing because of duplicate values in primary key (non-Slony tables).
>> Slony would hang on these errors.
>
> This statement is confusing. If they're non-Slony tables then how could
> replication have any issues with them?

Confusing indeed.

Slony-I sets up a connection lock so that only 1 slon can be managing a
particular node at a time, and then processes everything within
transactions against both provider and subscriber, so it should cope
gracefully with any of these failures:

If you shut down the [subscriber DB] while the slon is applying
changes, you'll have an uncommitted transaction that will be rolled
back. No damage done.

Indeed, that should characterize things pretty well for a number of
sorts of failure modes other than [subscriber DB]. For instance, the
statement should continue to be valid if we replace [subscriber DB]
with:

- [slon process]
- [provider DB]

And it shouldn't be invalidated by the failure being anywhere in the
following set:

- Killing the slon process;
- Stopping the subscriber backend process by killing it cleanly;
- Stopping the subscriber backend process by killing it uncleanly,
  thereby requiring crash recovery when the postmaster restarts;
- Stopping the provider backend process by killing it cleanly;
- Stopping the provider backend process by killing it uncleanly,
  thereby requiring crash recovery when the postmaster restarts.

If the provider or subscriber hosts are shut down, there is a
possibility of corruption of the filesystem which might invalidate one
or another of the databases; Slony-I can't really help with that.

We've seen such corruptions emerge from the following sorts of
phenomena:

- IBM HACMP failover captured requests on the SCSI bus and tried to
  re-apply them, trashing the filesystem + database;
- A gradual power outage might leave some fading signals on the SCSI
  or fibrechannel bus as the computer "died," leading to the disk
  array getting phantom writes, trashing the filesystem + database.

Do any of these failure modes seem familiar? (e.g. - indicative of
what happened here?)
-- 
select 'cbbrowne' || '@' || 'linuxfinances.info';
http://cbbrowne.com/info/lsf.html
Rules of the Evil Overlord #145. "My dungeon cell decor will not
feature exposed pipes. While they add to the gloomy atmosphere, they
are good conductors of vibrations and a lot of prisoners know Morse
code." <http://www.eviloverlord.com/>
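The "uncommitted transaction gets rolled back" argument above can be
illustrated with a small sketch. This is not Slony-I code: it uses
Python's standard sqlite3 module as a stand-in for the subscriber
database, and the table name (replica), the helper names (apply_batch,
count_rows, demo) and the crash_after parameter are all hypothetical,
chosen only for illustration. It assumes a batch of changes is applied
inside a single transaction, as the message describes.

```python
import os
import sqlite3
import tempfile


def apply_batch(db_path, rows, crash_after=None):
    """Apply all rows inside one transaction.

    If crash_after is set, stop before inserting the row at that index
    and drop the connection without committing, which stands in for
    the applying process (or the database) going away mid-batch.
    """
    conn = sqlite3.connect(db_path)
    try:
        for i, (key, payload) in enumerate(rows):
            if crash_after is not None and i == crash_after:
                # No commit is issued; the pending work is discarded
                # when the connection is closed in the finally clause.
                raise RuntimeError("simulated failure mid-batch")
            conn.execute(
                "INSERT INTO replica (id, payload) VALUES (?, ?)",
                (key, payload),
            )
        conn.commit()
    finally:
        conn.close()


def count_rows(db_path):
    """Return the number of committed rows visible to a new connection."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute("SELECT COUNT(*) FROM replica").fetchone()[0]
    finally:
        conn.close()


def demo():
    """Fail a batch half-way, then retry it; return both row counts."""
    rows = [(1, "a"), (2, "b"), (3, "c")]
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "subscriber.db")
        conn = sqlite3.connect(db_path)
        conn.execute(
            "CREATE TABLE replica (id INTEGER PRIMARY KEY, payload TEXT)"
        )
        conn.commit()
        conn.close()

        try:
            apply_batch(db_path, rows, crash_after=2)
        except RuntimeError:
            pass  # the batch died after two inserts, before commit
        after_failure = count_rows(db_path)

        # Re-applying the same batch succeeds because nothing from the
        # failed attempt was left behind.
        apply_batch(db_path, rows)
        after_retry = count_rows(db_path)
    return after_failure, after_retry


if __name__ == "__main__":
    print(demo())
```

In this sketch the failed attempt leaves no rows behind, so the retry
does not collide with half-applied data; that is the property the
message relies on when it says "No damage done." It says nothing about
the filesystem-level corruption cases, which sit below the transaction
layer.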