Wed Oct 22 15:31:53 PDT 2014
- Previous message: [Slony1-bugs] [Bug 353] New: Health check - checkpoint configuration
- Next message: [Slony1-bugs] [Bug 354] failover can get stuck waiting for events after sl_listen is empty
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
http://www.slony.info/bugzilla/show_bug.cgi?id=354

Summary: failover can get stuck waiting for events after sl_listen is empty
Product: Slony-I
Version: devel
Platform: PC
OS/Version: Linux
Status: NEW
Severity: enhancement
Priority: low
Component: stored procedures
AssignedTo: slony1-bugs at lists.slony.info
ReportedBy: ssinger at ca.afilias.info
CC: slony1-bugs at lists.slony.info
Estimated Hours: 0.0

This applies to the 2.2 branch and master.

Consider a cluster where node 1 is the origin of all sets:

1-->2
|
3->5
|
4

FAILOVER(id=5,backup node=5);

Here we are failing over a non-origin.

The remote worker for a node like node 4 might receive an event from node 1:
(FAILOVER_NODE, 5, 5000001234)
That says that node 5 has failed and that the last event from node 5 that node 1 has seen is 5000001234.

Now if node 4 gets the failover event before 5,5000001234, then remoteWorker_1 will call the failedNode() stored function on node 4. Then it will wait for 5,5000001234. However, if slon then restarts, it will never process 5,5000001234 (a restart might not even be required, but that is how I've seen this).

The failedNode() call erased all the sl_listen entries with li_origin=5:

select * FROM _disorder_replica.sl_listen where li_origin=5;
 li_origin | li_provider | li_receiver
-----------+-------------+-------------
(0 rows)

None of the remoteListener threads will be listening for events from node 5, so the missing SYNC will never get processed. The SYNC probably contains no data because 5 is not an origin, but the failover event is stuck.

--
Configure bugmail: http://www.slony.info/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
You are the assignee for the bug.
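The stuck state described in this report can be illustrated with a small toy model. This is only a sketch and not Slony's actual code: the class ToyNode, its methods, the listen paths through provider 3, and the starting sequence number are all hypothetical, chosen to mirror the report. It shows that once the sl_listen rows for the failed origin are gone, a pending wait for event 5,5000001234 has no path left by which it could be satisfied.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNode:
    """Toy stand-in for one subscriber node (hypothetical, not Slony's real structures)."""
    node_id: int
    # Rows modeled on sl_listen: (li_origin, li_provider, li_receiver)
    sl_listen: set = field(default_factory=set)
    # Highest event sequence number already processed, keyed by origin node
    last_seen: dict = field(default_factory=dict)

    def failed_node(self, failed_id):
        """Mimic the effect from the report: drop every listen row for the failed origin."""
        self.sl_listen = {row for row in self.sl_listen if row[0] != failed_id}

    def can_receive_from(self, origin):
        """True if some listen row still delivers events from `origin` to this node."""
        return any(row[0] == origin and row[2] == self.node_id
                   for row in self.sl_listen)

    def wait_is_stuck(self, origin, seqno):
        """True if event (origin, seqno) is still needed but nothing listens for it."""
        already_have = self.last_seen.get(origin, 0) >= seqno
        return not already_have and not self.can_receive_from(origin)


# Assumed starting state: node 4 hears about origins 1 and 5 via provider 3,
# and is one event short of the sequence number named in FAILOVER_NODE.
node4 = ToyNode(node_id=4,
                sl_listen={(1, 3, 4), (5, 3, 4)},
                last_seen={5: 5000001233})

before = node4.wait_is_stuck(5, 5000001234)  # a listen path for node 5 still exists
node4.failed_node(5)                          # removes all rows with li_origin=5
after = node4.wait_is_stuck(5, 5000001234)   # nothing can deliver the event now

print(before, after)
```

In this model the wait only becomes unsatisfiable after failed_node() runs, which matches the ordering problem in the report: the FAILOVER_NODE event is handled before the event it then waits for has arrived.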