[Slony1-general] initial copy going for 6 days now...

Wed Oct 6 16:51:32 PDT 2004

Well, I'm now convinced that something is wrong.  There's just no way 
in heck it should take 6 days to replicate my large table across two 
machines, even if they are cheap-o IDE disks... :-)

Pg 7.4.5, Slony 1.0.2, FreeBSD 4.10 on slave, 5.3-BETA on master.

I have logging of every query turned on.

On the master, I see these queries of late:

Oct  6 11:39:58 crash postgres[72808]: [15677-1] LOG:  duration: 15.967 
ms  statement: select "_mailermailer".createEvent('_mailermailer', 
'SYNC', NULL);
Oct  6 11:39:58 crash postgres[72808]: [15678-1] LOG:  duration: 31.004 
ms  statement: commit transaction;
Oct  6 11:40:25 crash postgres[71293]: [7862-1] LOG:  duration: 36.651 
ms  statement: notify "_mailermailer_Event"; notify 
"_mailermailer_Confirm"; insert into
Oct  6 11:40:25 crash postgres[71293]: [7862-2]  
"_mailermailer".sl_event     (ev_origin, ev_seqno, ev_timestamp,      
ev_minxid, ev_maxxid, ev_xip, ev_type     ) values ('1',
Oct  6 11:40:25 crash postgres[71293]: [7862-3]  '8211', '2004-10-06 
11:40:25.085698', '55781', '651614', '''55781''', 'SYNC'); insert into 
"_mailermailer".sl_confirm
Oct  6 11:40:25 crash postgres[71293]: [7862-4]         (con_origin, 
con_received, con_seqno, con_timestamp)    values (1, 2, '8211', 
CURRENT_TIMESTAMP); commit transaction;

(yes, the machine is named "crash").

On the slave I'm seeing this:

Oct  6 11:40:32 staging postgres[48728]: [993064-1] LOG:  duration: 
0.653 ms  statement: start transaction;set transaction isolation level 
serializable;select last_value from
Oct  6 11:40:32 staging postgres[48728]: [993064-2]  
"_mailermailer".sl_action_seq;
Oct  6 11:40:32 staging postgres[48728]: [993065-1] LOG:  duration: 
0.519 ms  statement: rollback transaction;
Oct  6 11:40:33 staging postgres[48728]: [993066-1] LOG:  duration: 
0.657 ms  statement: start transaction;set transaction isolation level 
serializable;select last_value from
Oct  6 11:40:33 staging postgres[48728]: [993066-2]  
"_mailermailer".sl_action_seq;
Oct  6 11:40:33 staging postgres[48728]: [993067-1] LOG:  duration: 
0.520 ms  statement: rollback transaction;

this rollback transaction is worrisome to me...  it is happening once 
per second.  the disk space used is not growing much, either.

The slony log on the slave reads like this for this table.  Several 
other tables were successfully copied before:

DEBUG2 remoteWorkerThread_2: copy table public.msg_recipients
DEBUG2 remoteListenThread_2: queue event 2,622 SYNC
DEBUG2 syncThread: new sl_action_seq 1 - SYNC 563
DEBUG2 localListenThread: Received event 1,563 SYNC
DEBUG2 remoteListenThread_2: queue event 2,623 SYNC
DEBUG2 syncThread: new sl_action_seq 1 - SYNC 564
DEBUG2 localListenThread: Received event 1,564 SYNC
DEBUG1 cleanupThread:    0.062 seconds for cleanupEvent()
DEBUG1 cleanupThread:    0.045 seconds for delete logs
DEBUG2 cleanupThread:    0.449 seconds for vacuuming
  [[ elided ]]
DEBUG1 cleanupThread:    0.006 seconds for cleanupEvent()
DEBUG1 cleanupThread:    0.069 seconds for delete logs
DEBUG2 cleanupThread:    0.465 seconds for vacuuming
DEBUG2 syncThread: new sl_action_seq 1 - SYNC 8212
DEBUG2 localListenThread: Received event 1,8212 SYNC
DEBUG2 remoteListenThread_2: queue event 2,8397 SYNC
DEBUG2 syncThread: new sl_action_seq 1 - SYNC 8213
DEBUG2 localListenThread: Received event 1,8213 SYNC
DEBUG2 remoteListenThread_2: queue event 2,8398 SYNC

which is the tail of the log right now.  The start of the copy above 
happened at 00:02 last friday!

This table has 165M rows in it: two integers and a varchar(4), a PK 
comprising both integers, and a regular index on the second integer.

Replicating the development copy of this same db worked flawlessly 
(took just a few seconds, since it is tiny).  This is a snapshot of my 
production db, which is why it is huge.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 2476 bytes
Desc: not available
Url : http://gborg.postgresql.org/pipermail/slony1-general/attachments/20041006/eaa5944f/smime-0001.bin