Brian Fehrle brianf at consistentstate.com
Thu Sep 27 15:48:30 PDT 2012
On 09/27/2012 03:40 PM, Jan Wieck wrote:
> On 9/27/2012 5:30 PM, Christopher Browne wrote:
>> On Thu, Sep 27, 2012 at 5:26 PM, Jan Wieck <JanWieck at yahoo.com> wrote:
>>> My guess is that the right solution to this is to clean out everything
>>> again when a STORE NODE comes along. We had been thinking of making the
>>> node ID non-reusable to prevent this sort of race conditions.
>>
>> I'm not sure I'm totally comfortable with cleaning it all out
>> instantly; as a step towards that, I'd think it a good idea for slonik
>> to check all the nodes for existence of a node ID, and refuse if it's
>> found anywhere.
>>
>> Under that circumstance, you might need to wait, to run the STORE
>> NODE, until the cleanup thread has run on all the nodes to expunge the
>> last bits of the node on all nodes' databases.
>>
>> Smells a bit safer to me...
>>
>
> Check cleanupEvent(). I think it will never remove that stale event.
>
Yeah, it looks like it will only remove confirmed ones.

--------------code from cleanupEvent()-----------------
     -- ----
     -- Then remove all events that are confirmed by all nodes in the
     -- whole cluster up to the last SYNC
     -- ----
     for v_min_row in select con_origin, min(con_seqno) as con_seqno
                 from sl_confirm
                 group by con_origin
     loop
         select coalesce(max(ev_seqno), 0) into v_max_sync
                 from sl_event
                 where ev_origin = v_min_row.con_origin
                 and ev_seqno <= v_min_row.con_seqno
                 and ev_type = 'SYNC';
         if v_max_sync > 0 then
             delete from sl_event
                     where ev_origin = v_min_row.con_origin
                     and ev_seqno < v_max_sync;
         end if;
     end loop;

The query that drives the loop over sl_confirm returns the following:
 con_origin | con_seqno
------------+------------
          1 | 5000242178
          2 | 5000661718
          4 | 5000060743

So the loop never visits origin 3, and nothing is ever deleted from
sl_event for node 3. As far as I can tell, this is the only place in
cleanupEvent() that deletes from sl_event.
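
To make the effect concrete, here is a toy model of that loop in Python
against an in-memory SQLite database. It is only a sketch, not Slony
itself: the tables are cut down to the columns the quoted code touches,
the sl_event rows (including the stale origin 3 event and its sequence
number) are made up, and origins_without_confirms() is a hypothetical
helper, not a Slony function.

```python
import sqlite3

# Toy stand-ins for the Slony tables: only the columns used by the quoted loop.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table sl_confirm (con_origin integer, con_seqno integer);
    create table sl_event (ev_origin integer, ev_seqno integer, ev_type text);
""")

# Confirmations exist for origins 1, 2 and 4 only (minimums taken from the post).
conn.executemany("insert into sl_confirm values (?, ?)", [
    (1, 5000242178), (1, 5000242180),
    (2, 5000661718),
    (4, 5000060743),
])

# Made-up events: a few SYNCs from origin 1 and one stale event from origin 3.
conn.executemany("insert into sl_event values (?, ?, ?)", [
    (1, 5000242170, "SYNC"), (1, 5000242175, "SYNC"),
    (1, 5000242178, "SYNC"), (1, 5000242180, "SYNC"),
    (3, 5000000001, "SYNC"),  # hypothetical leftover from the dropped node
])


def cleanup_events(conn):
    """Mirror the quoted loop: one pass per origin that appears in sl_confirm."""
    rows = conn.execute(
        "select con_origin, min(con_seqno) from sl_confirm group by con_origin"
    ).fetchall()
    for origin, min_seqno in rows:
        (max_sync,) = conn.execute(
            "select coalesce(max(ev_seqno), 0) from sl_event "
            "where ev_origin = ? and ev_seqno <= ? and ev_type = 'SYNC'",
            (origin, min_seqno),
        ).fetchone()
        if max_sync > 0:
            conn.execute(
                "delete from sl_event where ev_origin = ? and ev_seqno < ?",
                (origin, max_sync),
            )


def origins_without_confirms(conn):
    """Hypothetical helper: origins with events but no sl_confirm rows at all."""
    return [row[0] for row in conn.execute(
        "select distinct ev_origin from sl_event "
        "where ev_origin not in (select con_origin from sl_confirm) "
        "order by ev_origin"
    )]


cleanup_events(conn)
remaining = conn.execute(
    "select ev_origin, ev_seqno from sl_event order by ev_origin, ev_seqno"
).fetchall()
print(remaining)                       # origin 3's event is still there
print(origins_without_confirms(conn))  # origins the loop can never reach
```

In this model the two older origin 1 events are removed, while the
origin 3 event survives every pass because no sl_confirm row ever puts
origin 3 into the loop.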

Should I try to delete this row myself, or would that cause major
issues of its own? I'm still wrapping my head around how sl_confirm and
sl_event work together when adding/removing nodes.

- Brian F

>
> Jan
>


