Wed Apr 30 08:57:34 PDT 2014
- Previous message: [Slony1-general] Slony 2.2 failover changes
- Next message: [Slony1-general] Slony 2.2 failover changes
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On 04/28/2014 09:56 AM, Glyn Astill wrote:

This is sounding like a bug. What is your path network? Do you have paths
between all nodes in both directions or something else? Does it happen
every time you test or only sometimes?

> Hi All,
>
> I'm testing the changes to failover in 2.2.2 and seem to be running into
> issues passing multiple nodes to failover. In the following scenario
> with 4 nodes, node 2 is the origin of all sets and node 3 is a
> forwarding provider to node 4, i.e.
>
> 1 <---- 2 ----> 3 ----> 4
>
> I'm attempting to fail over in a scenario where both nodes 2 and 3 have
> failed, so postgres is stopped for both of those nodes. I'm running the
> following script:
>
> CLUSTER NAME = test_replication;
> NODE 1 ADMIN CONNINFO = 'dbname=TEST host=localhost port=5432 user=slony';
> NODE 2 ADMIN CONNINFO = 'dbname=TEST host=localhost port=5433 user=slony';
> NODE 3 ADMIN CONNINFO = 'dbname=TEST host=localhost port=5434 user=slony';
> NODE 4 ADMIN CONNINFO = 'dbname=TEST host=localhost port=5435 user=slony';
> FAILOVER (
>     NODE = (ID = 2, BACKUP NODE = 1),
>     NODE = (ID = 3, BACKUP NODE = 1)
> );
>
> However it would appear that slonik will wait indefinitely for node 4 to
> catch up via failed node 3:
>
> $ slonik test.scr
> test.scr:3: could not connect to server: Connection refused
>     Is the server running on host "localhost" (127.0.0.1) and accepting
>     TCP/IP connections on port 5433?
> test.scr:4: could not connect to server: Connection refused
>     Is the server running on host "localhost" (127.0.0.1) and accepting
>     TCP/IP connections on port 5434?
> executing preFailover(2,1) on 1
> NOTICE: executing "_test_replication".failedNode2 on node 1
> test.scr:6: NOTICE: calling restart node 2
> test.scr:6: waiting for event (1,5000000157). node 4 only on event 5000000156
> test.scr:6: waiting for event (1,5000000157). node 4 only on event 5000000156
> test.scr:6: waiting for event (1,5000000157). node 4 only on event 5000000156
> test.scr:6: waiting for event (1,5000000157). node 4 only on event 5000000156
> test.scr:6: waiting for event (1,5000000157). node 4 only on event 5000000156
> test.scr:6: waiting for event (1,5000000157). node 4 only on event 5000000156
> test.scr:6: waiting for event (1,5000000157). node 4 only on event 5000000156
> test.scr:6: waiting for event (1,5000000157). node 4 only on event 5000000156
> test.scr:6: waiting for event (1,5000000157). node 4 only on event 5000000156
> test.scr:6: waiting for event (1,5000000157). node 4 only on event 5000000156
> test.scr:6: waiting for event (1,5000000157). node 4 only on event 5000000156
> test.scr:6: waiting for event (1,5000000157). node 4 only on event 5000000156
> test.scr:6: waiting for event (1,5000000157). node 4 only on event 5000000156
> test.scr:6: waiting for event (1,5000000157). node 4 only on event 5000000156
> test.scr:6: waiting for event (1,5000000157). node 4 only on event 5000000156
>
> It'll only complete if I bring node 3 back up, which of course I
> couldn't do if it was really dead:
>
> NOTICE: executing "_test_replication".failedNode3 on node 1
>
> Have I totally got the wrong end of the stick here?
>
> Thanks
> Glyn
>
>
> _______________________________________________
> Slony1-general mailing list
> Slony1-general at lists.slony.info
> http://lists.slony.info/mailman/listinfo/slony1-general
>