Steve Singer ssinger at ca.afilias.info
Wed May 11 20:38:55 PDT 2011
On 11-05-11 06:23 PM, Richard Yen wrote:
> Steve, I've also noticed something even more bizarre.
>
> Fortunately, I have log_shipping set up on a third subscriber, and
> grepping through the files, I find the following:
>
> [root@asp-dbr1 log_staging]# for i in `ls`; do grep -Hn cron_lock $i; done
> slony1_log_3_00000000000011179582.sql:13:insert into
> "public"."cron_lock" ("id","lock_until_time") values
> ('anonymous_marking','2011-05-11 11:40:02.456091');


> It seems that something has gone horribly wrong with replication such
> that it's beginning to even replay events.
>
> Does this look like an issue you may already be aware of?

I wonder if this is caused by the following sequence:
1. slon (on the slave) writes the SQL to the log shipping file as part of 
the sync
2. slon tries to apply the SQL to the slave
3. The transaction rolls back on the slave, but nothing is deleted from the 
log shipping file
4. slon retries (go to step 1), so the same SQL is written to the file again.
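
The suspected sequence can be sketched as a toy model in Python. This is 
not slon's actual code, only an illustration of the hypothesis: run_sync, 
make_flaky_apply and the in-memory "archive" list are hypothetical names 
standing in for the sync loop, the apply step on the slave, and the log 
shipping file.

```python
def make_flaky_apply(failures):
    """Return a hypothetical apply step that fails `failures` times, then succeeds."""
    state = {"left": failures}

    def apply(statements):
        if state["left"] > 0:
            state["left"] -= 1
            # Stands in for the transaction rolling back on the slave.
            raise RuntimeError("transaction rolled back on the slave")

    return apply


def run_sync(statements, apply_fn, max_attempts=3):
    """Toy model of the suspected loop; returns (archive contents, attempts used)."""
    archive = []  # stands in for the log shipping file
    for attempt in range(1, max_attempts + 1):
        # Step 1: the SQL is written to the archive before it is applied.
        archive.extend(statements)
        try:
            # Step 2: try to apply the same SQL to the slave.
            apply_fn(statements)
            return archive, attempt
        except RuntimeError:
            # Step 3: the slave rolled back, but the archive is left untouched.
            # Step 4: retry from step 1.
            continue
    return archive, max_attempts


if __name__ == "__main__":
    stmt = "insert into cron_lock ..."  # placeholder statement, not real log content
    archive, attempts = run_sync([stmt], make_flaky_apply(1))
    print(attempts, archive)
```

If the apply step fails once before succeeding, the slave ends up with the 
row once but the archive holds the statement twice, which would match a 
replayed INSERT showing up in the log shipping files.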

I'll try to look at the log shipping code in slon to see if something 
similar to the above is in fact the case.



> --Richard
>
>
>
> On Wed, May 11, 2011 at 2:43 PM, Richard Yen <dba at richyen.com> wrote:
>
>
>
>     On Wed, May 11, 2011 at 2:26 PM, Steve Singer
>     <ssinger at ca.afilias.info> wrote:
>
>         On 11-05-11 04:29 PM, Richard Yen wrote:
>
>             Thanks Steve.  I've put the dumps for the master and my two
>             slaves at
>             http://richyen.com/slony/
>
>
>         As Jan suspected there isn't really anything still left in the
>         logs about events from before the failed insert.
>
>         When should the row have been deleted from cron_lock?
>         Immediately before the insert, or some time earlier?
>
>     The cron job runs every hour, so there should have been a DELETE at
>     10:40AM, just milliseconds before the INSERT.
>
>         Did anything else happen around this time? (server restarts,
>         moving masters etc?)
>
>     No, we didn't do anything during this time.  The previous action I
>     performed on this cluster was at 10:04AM, which was to INSERT a row
>     into another table in the database.  The sync_interval I have is
>     500ms, so I'm fairly confident that this would have been processed
>     long before the offending INSERT to cron_lock arrived at 10:40AM.
>       The oldest entry in sl_event right now is '2011-05-11 10:40:02.354653'
>
>     If we can't get any more clues, I'll go ahead and process the DELETE.
>       Thanks again for your help in trying to get to the bottom of this.
>     --Richard
>
>


