Thu Jun 28 14:04:35 PDT 2012
Jan/Chris: this is in reference to the adjust_provider_info issue I was talking to you about the other day.

Everyone else: In master/2.2 I've been getting the occasional failure of one of the disorder tests ('merge set', before any set merging takes place).

What happens is that one of the subscriber nodes (often node 3) will try to pull data in sync_event from a node that is neither the provider nor the origin (often node 5). This node is too far behind, so the sync fails.

I attached with gdb and saw that the wd->provider chain contains two nodes:
a) the origin, node 1, and
b) the node that was too far behind.

The SYNC event it is processing comes from event_provider=1.

The adjust_provider_info function seems to process some event that came from listener 5. Just after the subscription to set 1 finishes, it logs the following:

2012-06-28 15:40:34,803 [db3 stdout] DEBUG . - 2012-06-28 15:40:34 EDT CONFIG remoteWorkerThread_1: added active set 1 to provider 1
2012-06-28 15:40:34,803 [db3 stdout] DEBUG . - 2012-06-28 15:40:34 EDT CONFIG remoteWorkerThread_1: added event provider provider 5

The 'added event provider provider 5' line is debugging output I added to the if block in 'step 4' of adjust_provider_info.

What I *suspect* is happening is that:

* Node 3 is not yet subscribed to any sets.
* remoteListener_5 on node 3 queues up the subscription events and a SYNC event from node 1.
* The subscription finishes.
* adjust_provider_info is called. It adds node 1 as a provider since it is the origin of the set. It also adds node 5 as a provider since that is where the event was received from (step 4).
* We process the SYNC event from node 1. In sync_event we expect BOTH of the providers in the provider list (1 and 5) to be far enough ahead.

Is adjust_provider_info wrong to add a node in step 4 if it has already added a node in step 2? Or is the logic in sync_event wrong? Or should we be flushing/rebuilding the provider list at some point (if so, when/where)?
My guess is that we need to make adjust_provider_info smart enough that, if it receives an event with origin=1 from a provider != 1, it doesn't add that provider to the list as part of step 4 when node 1 is already in the provider list. Thoughts?
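
To make the proposal concrete, here is a rough sketch in C of the guard described above. It is only a toy model, not slon's actual code: ProviderList, add_event_provider_if_needed and all_providers_caught_up are hypothetical names invented for illustration, and the real wd->provider chain and sync_event are more involved than this. It assumes step 2 has already added the origin, and that sync_event requires every listed provider to have reached the SYNC's sequence number.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the provider chain (not slon's struct). */
#define MAX_PROVIDERS 8

typedef struct {
    int    no_id[MAX_PROVIDERS];  /* node ids currently listed as providers */
    size_t count;
} ProviderList;

static bool provider_list_contains(const ProviderList *pl, int node_id)
{
    for (size_t i = 0; i < pl->count; i++)
        if (pl->no_id[i] == node_id)
            return true;
    return false;
}

/* Adds node_id if it is not listed yet; returns false only when the list is full. */
static bool provider_list_add(ProviderList *pl, int node_id)
{
    if (provider_list_contains(pl, node_id))
        return true;
    if (pl->count >= MAX_PROVIDERS)
        return false;
    pl->no_id[pl->count++] = node_id;
    return true;
}

/*
 * Sketch of the proposed step-4 rule: if the event arrived via a node other
 * than its origin, and the origin is already in the list (from step 2), do
 * not also add the node that merely relayed the event.
 */
static void add_event_provider_if_needed(ProviderList *pl,
                                         int ev_origin, int event_provider)
{
    if (event_provider != ev_origin && provider_list_contains(pl, ev_origin))
        return;
    (void) provider_list_add(pl, event_provider);
}

/*
 * Toy model of the check described for sync_event: every provider in the
 * list must have seen at least ev_seqno. seen_seqno[] is indexed by node id.
 */
static bool all_providers_caught_up(const ProviderList *pl,
                                    const long seen_seqno[], size_t n_nodes,
                                    long ev_seqno)
{
    for (size_t i = 0; i < pl->count; i++) {
        int id = pl->no_id[i];
        if (id < 0 || (size_t) id >= n_nodes || seen_seqno[id] < ev_seqno)
            return false;
    }
    return true;
}
```

In this model, with node 1 already listed and a SYNC with origin 1 relayed by node 5, add_event_provider_if_needed leaves the list as {1}, so a lagging node 5 no longer blocks the check. Without the guard the list becomes {1, 5} and all_providers_caught_up fails whenever node 5 is behind, which matches the failure described above.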
More information about the Slony1-hackers mailing list