Steve Singer steve at ssinger.info
Thu Jan 7 14:47:04 PST 2016
On Thu, 7 Jan 2016, Tory M Blue wrote:

> 
> So I'm backing up in a big way. I know what started it: adding a new insert slave, which took 13 hours to complete (indexes
> etc.). But now it doesn't appear I am able to catch up. I see the slave doing what it's supposed to: get a batch of data,
> truncate the sl_log tables, move on. But the master is having a hard time.
> 
> Postgres 9.4.5 and Slony 2.2.3
> 
> All other nodes don't have any errors or issues.
> 
> this is Node 1 (the master)
> node 2 is a slave
> nodes 3-5 are query slaves, with only 1 of the 3 sets being replicated to them.
> 
> I have the sync interval at 5 minutes and sync_group_maxsize=50
> 
> Any suggestions on where to thump it? At some point this will cause issues on my master, and when I see that starting I'll
> have to drop node 2 again; when I add it back it will take 13+ hours and I'll be back in the same position :)

Bump sync_group_maxsize to be much bigger. I'm not saying that will solve 
the problem, but it might help (the maximum allowed is 10,000). I'm also suspicious when 
you say you have a sync_interval of 5 minutes, since I thought 60 seconds was the largest 
allowed.
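
For what it's worth, both knobs can be set in the slon runtime config file 
(passed with slon -f) or via the equivalent command-line switches (-g for the 
group size, -s for the sync interval). A minimal sketch with illustrative 
values only; note that slon's sync_interval is given in milliseconds, not 
minutes:

     # slon.conf -- illustrative values, tune for your own workload
     # group up to 1000 SYNC events into one apply transaction
     sync_group_maxsize=1000
     # look for new SYNCs every 1000 ms (milliseconds, not minutes)
     sync_interval=1000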



> 
> Thanks
> Tory
> 
> 
> 
> Node:  Old Transactions Kept Open
> ================================================
> Old Transaction still running with age 01:48:00 > 01:30:00
> 
> Query: autovacuum: VACUUM
> 
> 
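That long-running autovacuum above is worth chasing, too: an old open 
transaction can keep Slony from switching and truncating the sl_log tables. 
A quick way to see what is holding a transaction open on 9.4, using only 
pg_stat_activity:

     select pid, now() - xact_start as xact_age, state, query
     from pg_stat_activity
     where xact_start is not null
     order by xact_start
     limit 10;
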
> Node: 0 threads seem stuck
> ================================================
> Slony-I components have not reported into sl_components in interval 00:05:00
> 
> Perhaps slon is not running properly?
> 
> Query:
>      select co_actor, co_pid, co_node, co_connection_pid, co_activity, co_starttime, now() - co_starttime, co_event,
> co_eventtype
>      from "_admissioncls".sl_components
>      where  (now() - co_starttime) > '00:05:00'::interval
>      order by co_starttime;
>   
> 
> 
> Node: 1 sl_log_1 tuples = 219700 > 200000
> ================================================
> Number of tuples in Slony-I table sl_log_1 is 219700 which
> exceeds 200000.
> 
> You may wish to investigate whether or not a node is down, or perhaps
> if sl_confirm entries have not been propagating properly.
> 
> 
> Node: 1 sl_log_2 tuples = 1.74558e+07 > 200000
> ================================================
> Number of tuples in Slony-I table sl_log_2 is 1.74558e+07 which
> exceeds 200000.
> 
> You may wish to investigate whether or not a node is down, or perhaps
> if sl_confirm entries have not been propagating properly.
> 
> 
> Node: 2 sl_log_2 tuples = 440152 > 200000
> ================================================
> Number of tuples in Slony-I table sl_log_2 is 440152 which
> exceeds 200000.
> 
> You may wish to investigate whether or not a node is down, or perhaps
> if sl_confirm entries have not been propagating properly.
> 
> 
>
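
To chase the report's hint about sl_confirm propagation: the cluster schema 
has an sl_status view on the origin showing per-subscriber lag, and the raw 
confirmations can reveal a node that has stopped confirming. A sketch against 
this cluster's schema:

     -- per-subscriber lag as seen from the origin
     select st_origin, st_received, st_lag_num_events, st_lag_time
     from "_admissioncls".sl_status
     order by st_lag_time desc;

     -- latest confirmation per origin/receiver pair
     select con_origin, con_received,
            max(con_seqno) as last_confirmed_event,
            max(con_timestamp) as last_confirmed_at
     from "_admissioncls".sl_confirm
     group by con_origin, con_received
     order by con_origin, con_received;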

