Tory M Blue tmblue at gmail.com
Wed Oct 29 13:24:51 PDT 2014
On Wed, Oct 29, 2014 at 8:15 AM, Tory M Blue <tmblue at gmail.com> wrote:

>
>
> On Wed, Oct 29, 2014 at 8:07 AM, Steve Singer <ssinger at ca.afilias.info>
> wrote:
>
>> On 10/29/2014 10:21 AM, Tory M Blue wrote:
>>
>>>
>>>
>>> On Wed, Oct 29, 2014 at 5:50 AM, Tory M Blue <tmblue at gmail.com> wrote:
>>>
>>>
>>>
>>>
>>> Well definitely still doing something, just wish it was enough for my
>>> system to notice.
>>>
>>
>>
>> Did you change maintenance_work_mem before the index rebuild started?
>> Also, how many indexes are on the table, just the primary key index?
>> If there is more than one, you might want to drop the others before you
>> subscribe the table and recreate them concurrently later.
>>
>>
> I did not; it was set at 2GB prior to it starting. What is weird is that
> pg_stat_activity doesn't show an index creation:
>
>  16398 | clsdb | 54600 | 10 | postgres | slon.remoteWorkerThread_1 |
>  10.13.200.232 |  | 54260 | 2014-10-28 22:19:29.277022-07 |
>  2014-10-29 00:05:40.884649-07 | 2014-10-29 01:17:22.102619-07 |
>  2014-10-29 01:17:22.10262-07  | f | active |
>  select "_cls".finishTableAfterCopy(143); analyze "torque"."impressions";
>
> So I'm not quite sure what it's been doing since, well, 1:17am PST, and
> it's still sitting there. I'm not seeing any growth on disk, and the
> single process is still sitting at 97-100% CPU.
>
> Yeah, I'm starting to feel like building the indexes afterward is a good
> idea, and something I looked into earlier this year/last year. But at this
> point, if I restart slon this process dies with it, and it will attempt to
> resubscribe this set, so I will have lost 7 hours: it will have to move
> the 52GB of data over the wire again and then do "whatever the above
> command is doing". Not in dire straits yet, 6 million rows backed up; I
> think if I hit 8M I'll have to abort and try again.
>
> I have 3 sets; set 1 finished cleanly, and set 3 obviously has not been
> attempted. I'm wondering if I can do something with 2 and let it complete
> 3. Well, anyways... bahh!! :) I wish restarting slon would not abort the
> work done thus far.
>
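
(An aside for the archives: the row quoted above is a full row out of
pg_stat_activity. A narrower way to watch just the slon backends, assuming
the 9.2-9.5 column names, would be something like:)

    -- one row per backend; the slon workers register an application_name
    -- like 'slon.remoteWorkerThread_1'
    select pid, application_name, state, waiting, query_start, query
      from pg_stat_activity
     where application_name like 'slon%';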

Thanks for the replies, Steve and Christopher.

So it's just a big table with lots of data and 5 indexes, not including the
PK index. We went ahead and stopped slon (killing any progress we had made
on set 2, but at least retaining set 1), dropped the indexes from the big
tables, and started slon again. This time set 2 completed, as did set 3 (a
very tiny set). I left the indexes off just to give us a bit more room
while digging our way out of the backlog we had: almost 8 million rows,
6.5M in sl_log_1 and 1.5M in sl_log_2. I bumped "sync_group_maxsize" to 100
and think I was able to chew through the 8 million rows in maybe 30
minutes, give or take.
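
For anyone who finds this thread later, the recipe amounted to roughly the
following on the subscriber (the index names here are placeholders, not our
real ones):

    -- before (re)subscribing the big set: keep the PK, drop the secondary
    -- indexes so the copy/finish step has only the PK index to maintain
    drop index torque.impressions_created_idx;     -- hypothetical name
    drop index torque.impressions_campaign_idx;    -- hypothetical name

plus, in that slon's slon.conf (the -g command-line option does the same):

    sync_group_maxsize=100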

Once my concern over my master's health subsided, I was able to again stop
slon on the slave, and I am currently creating 5 indexes simultaneously
(actually 4; the 5th was being blocked by one of the 4). It's nice to see
my server actually show some signs of life. I have so much compute and
Postgres can't use it; sometimes it's frustrating.
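
(A sketch of the parallel build, with made-up index/column names: since
slon is stopped there are no writers, and a plain create index only takes a
SHARE lock, which doesn't conflict with itself, so one build per psql
session runs fine in parallel.)

    -- session 1
    set maintenance_work_mem = '2GB';  -- per-session memory for the sort
    create index impressions_created_idx on torque.impressions (created);

    -- session 2, running at the same time
    set maintenance_work_mem = '2GB';
    create index impressions_campaign_idx on torque.impressions (campaign_id);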

So thanks for the pointers, suggestions, and overall validation that things
were as they should be, and that I should have actually slept instead of
watching this all night/morning.

But I get to do it again this evening, and this one will require a
switchover in order for me to upgrade my master. I then have a large,
fundamental slon schema change that will require me to do a ground-up
rebuild again ("this is when large data sets and Postgres get a bit
heated"). However, now I know that removing the indexes will help things
out tremendously and at least get the cluster back up in a timely fashion
(just shy of 2 hours vs. 12+ hours and counting).
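
(For the switchover itself, the controlled slonik sequence is lock set +
move set. A minimal sketch; the cluster name cls comes from the "_cls"
schema above, but the node ids and conninfos are placeholders for our
setup, and it has to be repeated for each set:)

    cluster name = cls;
    node 1 admin conninfo = 'dbname=clsdb host=old-master user=slony';
    node 2 admin conninfo = 'dbname=clsdb host=new-master user=slony';

    lock set (id = 1, origin = 1);    -- stop writes to set 1 on node 1
    move set (id = 1, old origin = 1, new origin = 2);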

So thanks again!
Tory