cbbrowne at ca.afilias.info cbbrowne
Mon Jul 31 05:18:22 PDT 2006
>
>> Supposing we had a single column primary key, it ought to be
>> possible for
>> the slon to recognize:
>>
>>  - I'm issuing "delete from foo where id = x;"
>>
>>  - Hmm.  The next statement is "delete from foo where id = y;"
>>
>>  - That could be folded into the statement:
>>      delete from foo where id in (x, y);
>>
>>  - If we keep deferring, we might discover that the next statement is
>>      delete from foo where id = z, which could turn that into...
>>
>>      delete from foo where id in (x,y,z);
>
> This sounds like a viable way to reduce the number of statements to
> propagate.  You're right, however, in that it is only really clean for
> tables with a single column primary key.  In our tables we tend to have a
> serial column as the primary key and set up unique contraints on groups of
> columns where we don't want duplicates of combinations, so we end up with
> sl_log delete entries containing sequential clauses (recid = 1, recid = 2,
> etc...).
>
>> Alas, I'm not sure I know how to do this if a composite
>> primary key is in
>> use, or, more particularly, if the big series of OR clauses that would
>> result will "play well" the way the "in clause" should.
>>
>> e.g...  Is it any use to submit:
>>
>>   delete from foo where (id1 = x1 and id2 = y1) or (id1 = x2
>> and id2 = y2)
>> or (id1 = x3 and id2 = y3) or ...
>> ???
>
> How are deletes for tables with composite keys written to the sl_log
> tables?  I've only had experience with the single column pk deletes.
> Perhaps someone who uses these a lot has an idea for grouping them...?

The primary key is stored in much that form...

Actually, more like...
   id1 = 'x1' and id2 = 'y1'

My only concern here is whether or not a multiple clause delete statement
would use the PK index, or whether it loses that.  If it loses the ability
to use the index, this becomes a source of Very Bad Performance :-(.  I
think it'll likely work OK, though...

>> This would be a fairly significant restructuring of the
>> update submission
>> loop, as this needs to defer all deletes.  Are you keen on prototyping
>> that?
>>
> The problem here is that I don't think all deletes can be deferred.  This
> logic would only apply until the first non-delete statement is
> encountered.  Take the situation where a delete removes a record, and then
> a record is inserted that uses the same pk.

The logic could only apply as long as the sequence of statements were all
deletes on a single table.

I don't particularly see a problem there; the case that's inefficient now
is where the original delete was something like:

  delete from tab1 where [some condition covering 20,000 tuples];

Supposing there is a lot of other replicable "traffic" on the system, what
I'd expect to see in the sequence of activity is "spurts" of deletions on
tab1.  They may be mixed with other activity, but there should often be a
dozen deletes on tab1 grouped together.  And that's a case where some
performance "win" could be gotten...




More information about the Slony1-general mailing list