Stefan Murphy stefan at vocalocity.com
Wed Jul 30 12:33:07 PDT 2008
Hi all,

Warning, novice alert: we've only been using Slony for about a month.

Last night we were doing OS-level work on a Slony slave server (Linux).  This involved bouncing the box multiple times, at least once when the box was unresponsive in a load spike from a Postgres issue.  We didn't shut down the slave daemons that were running on this server (in retrospect, probably a bad idea).  The daemons are set to restart automatically when the box restarts.

After doing this work we found errors in the slave's log: inserts were failing because of duplicate values in the primary key (non-Slony tables), and Slony would hang on these errors.  This was odd, in that these were transient tables with sequences as primary keys; no data stays in these tables on a permanent basis.  We tried removing all rows for these problem tables from the Slony logs on the master.  We also tried truncating the Slony logs on the master.  We weren't worried about data integrity during this problem period, just that replication worked going forward.  We ended up removing the primary keys from the problem tables in the slave DB just to keep replication working for other critical tables.

We plan on recreating the slave DB and reinitializing replication to fix the problem.

When doing this type of maintenance, should we be shutting down the slave daemons? (Yes, I've probably answered my own question, but I would like to confirm.)

I'm interested in your feedback about the things we were trying.  Were we on the right track or being horribly stupid?  :)

Any idea what was happening?  From what we observed, it was as if Slony was trying to apply the same transaction multiple times in the slave DB.

Thanks for your help,

Stefan

