Christopher Browne cbbrowne at afilias.info
Tue Jan 3 08:33:09 PST 2012
On Tue, Jan 3, 2012 at 5:49 AM, Martijn van Oosterhout
<kleptog at gmail.com> wrote:
> Hi,
>
> We have a Slony-I setup in a slightly weird situation. What happened
> was that the server did a huge delete (millions of rows) in a single
> transaction, which caused replication to fall behind. For some reason,
> in this state it takes forever to apply the change: the query that
> finds what needs to be applied seems to do a sort or something
> similar, apparently because it doesn't want to apply the whole set at
> once. A single SYNC takes 10 minutes.

If SYNCs before the big one are applying very slowly, then it sounds
like you're hitting bug #167.
<http://www.slony.info/bugzilla/show_bug.cgi?id=167>

Note that we fixed that in version 2.1, so I'll guess you're not on 2.1.

In the case of that huge delete, that will indeed be applied in a
single SYNC; once you hit that SYNC, it's liable to process for a
good long time (hours?).

> In any case, the way we fixed it before was to unsubscribe and
> resubscribe the set, because resyncing the whole database is quicker
> than waiting for the deletes to complete. However, this time it broke
> in a new way. The result is that slony thinks it is properly
> subscribed, but the database data has not been resynced, so you get
> some bastard combination of old and new data. Logs below.

Unfortunately, the UNSUBSCRIBE request comes in as an event, and it's
later in the event stream than the SYNC-of-the-million-deletes, so
it's probably not being processed when you think it ought to be.
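
To illustrate the ordering problem, here is a toy Python sketch. It is
not Slony code; the event labels and sequence numbers are made up for
illustration. It models a subscriber that applies events strictly in
sequence order, so the UNSUBSCRIBE cannot take effect until everything
queued ahead of it, including the huge SYNC, has been applied.

```python
from collections import deque


def processing_order(events):
    """Return event labels in the order a strictly in-order consumer applies them.

    `events` is a list of (seqno, label) pairs. The labels and numbers
    are hypothetical and only stand in for real replication events.
    """
    queue = deque(sorted(events))  # lowest sequence number is applied first
    applied = []
    while queue:
        _seqno, label = queue.popleft()
        applied.append(label)
    return applied


events = [
    (103, "UNSUBSCRIBE"),                # submitted last, so highest seqno
    (101, "SYNC (small)"),
    (102, "SYNC (million-row delete)"),
]
order = processing_order(events)
print(order)
```

The point of the sketch is only that submission time, not urgency,
decides when the UNSUBSCRIBE is handled.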

> Two questions:
> 1. Is there a way we could have detected that the unsubscribe failed?
> (slonik gave no error, but we didn't ask.) If so, we can add that to
> the procedure as something to check.

It's quite likely that nothing about the UNSUBSCRIBE *did* fail.  It's
just that it's waiting to process the event until *after* the big clot
of log data in that huge SYNC that you had.
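
A rough sketch of a check that could be added to the procedure, under
stated assumptions: events carry increasing sequence numbers, and a
subscriber confirms the highest sequence number it has finished
processing. The function name and its arguments are hypothetical; the
two numbers would have to be looked up separately (in Slony-I the
sl_event and sl_confirm tables are the likely place, but verify that
against your version).

```python
def event_processed(event_seqno, confirmed_seqno):
    """Return True if the subscriber has confirmed the given event.

    event_seqno:     sequence number of the event of interest, for
                     example the UNSUBSCRIBE (assumed looked up by caller).
    confirmed_seqno: highest sequence number the subscriber has
                     confirmed so far, or None if nothing is confirmed.
    """
    if confirmed_seqno is None:
        return False
    return confirmed_seqno >= event_seqno


# Hypothetical numbers: the subscriber is still working through the
# big SYNC (102), so the UNSUBSCRIBE (103) has not been processed yet.
still_waiting = not event_processed(event_seqno=103, confirmed_seqno=101)
print(still_waiting)
```

If the check reports that the event is still pending, resubscribing at
that point would be premature.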

Unfortunately, that's not remotely the result you were hoping for; you
were hoping to cancel the set *before* processing the awful huge SYNC.

I don't think we have any particularly wonderful solution to this.  I
thought we had a bug open on having a "Cancel Subscription" command,
which would be very similar, but I can't find it.  It would be nice to
have a "trample on that subscription - it's not valid or wanted
anymore" command.

