Thu Sep 27 12:28:48 PDT 2012
On 09/27/2012 01:26 PM, Jan Wieck wrote:
> On 9/27/2012 2:34 PM, Brian Fehrle wrote:
>> Hi all,
>>
>> PostgreSQL v 9.1.5 - 9.1.6
>> Slony version 2.1.0
>>
>> I'm having an issue that has occurred twice now. I have a 4-node Slony
>> cluster, and one of our operations is to drop a node from replication,
>> do maintenance on it, then add it back to replication.
>>
>> Node 1 = master
>> Node 2 = slave
>> Node 3 = slave -> dropped then re-added
>> Node 4 = slave
>
> First, why is the node actually dropped and re-added so fast, instead
> of just doing the maintenance while it falls behind, then letting it
> catch up?
>

We have several cases where it makes sense, such as re-installing the OS
or, in today's case, replacing the physical machine with a new one.

> You apparently have a full-blown path network from everyone to
> everyone. This is not good under normal circumstances, since the
> automatic listen generation will cause every node to listen on every
> other node for events from non-origins. Way too many useless database
> connections.

From my understanding, without this setup all events must be relayed
through the master node. So with master node = 1 and slaves = 2 and 3,
node 3 must communicate with node 2, and without a direct path it will
relay through the master. Is this understanding wrong?

> What seems to happen here are some race conditions. The node is
> dropped, and when it is added back again, some third node still hasn't
> processed the DROP NODE, and when node 4 looks for events from node 3,
> it finds old ones somewhere else (like on 1 or 2). When node 3 then
> comes around to use those event IDs again, you get the dupkey error.
>
> What you could do, if you really need to drop/re-add it, is use an
> explicit WAIT FOR EVENT for the DROP NODE to make sure all traces of
> that node are gone from the whole cluster.
>

Ok, I'll look into implementing that. Another thought was to issue a
cleanupEvent() on each of the nodes still attached to replication after
I do the dump.

Thanks,
- Brian F

> Jan
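
For reference, a minimal slonik sketch of the drop-then-wait sequence Jan
describes might look like the following. The cluster name, conninfo
strings, and database/host names here are placeholders, not taken from
Brian's setup; node IDs follow the layout above.

    # Hypothetical example -- adjust cluster name and conninfo to your setup.
    cluster name = mycluster;
    node 1 admin conninfo = 'dbname=mydb host=node1 user=slony';
    node 2 admin conninfo = 'dbname=mydb host=node2 user=slony';
    node 3 admin conninfo = 'dbname=mydb host=node3 user=slony';
    node 4 admin conninfo = 'dbname=mydb host=node4 user=slony';

    # Drop node 3, generating the DROP NODE event on node 1.
    drop node (id = 3, event node = 1);

    # Block until all remaining nodes have confirmed the events that
    # originated on node 1, including the DROP NODE above.
    wait for event (origin = 1, confirmed = all, wait on = 1);

The WAIT FOR EVENT holds the script until every surviving node has
confirmed the DROP NODE, so a later STORE NODE / re-subscription of node
id 3 should not race against stale event data for that id.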