Wed Oct 6 19:12:10 PDT 2004
- Previous message: [Slony1-general] initial copy going for 6 days now...
- Next message: [Slony1-general] initial copy going for 6 days now...
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On 10/6/2004 11:51 AM, Vivek Khera wrote:
> Well, I'm now convinced that something is wrong. There's just no way
> in heck it should take 6 days to replicate my large table across two
> machines, even if they are cheap-o IDE disks... :-)
>
> Pg 7.4.5, Slony 1.0.2, FreeBSD 4.10 on slave, 5.3-BETA on master.
>
> I have logging of every query turned on.
>
> On the master, I see these queries of late:
>
> Oct 6 11:39:58 crash postgres[72808]: [15677-1] LOG: duration: 15.967
> ms statement: select "_mailermailer".createEvent('_mailermailer',
> 'SYNC', NULL);
> Oct 6 11:39:58 crash postgres[72808]: [15678-1] LOG: duration: 31.004
> ms statement: commit transaction;
> Oct 6 11:40:25 crash postgres[71293]: [7862-1] LOG: duration: 36.651
> ms statement: notify "_mailermailer_Event"; notify
> "_mailermailer_Confirm"; insert into
> Oct 6 11:40:25 crash postgres[71293]: [7862-2]
> "_mailermailer".sl_event (ev_origin, ev_seqno, ev_timestamp,
> ev_minxid, ev_maxxid, ev_xip, ev_type ) values ('1',
> Oct 6 11:40:25 crash postgres[71293]: [7862-3] '8211', '2004-10-06
> 11:40:25.085698', '55781', '651614', '''55781''', 'SYNC'); insert into
> "_mailermailer".sl_confirm
> Oct 6 11:40:25 crash postgres[71293]: [7862-4] (con_origin,
> con_received, con_seqno, con_timestamp) values (1, 2, '8211',
> CURRENT_TIMESTAMP); commit transaction;
>
> (yes, the machine is named "crash").
>
> On the slave I'm seeing this:
>
> Oct 6 11:40:32 staging postgres[48728]: [993064-1] LOG: duration:
> 0.653 ms statement: start transaction;set transaction isolation level
> serializable;select last_value from
> Oct 6 11:40:32 staging postgres[48728]: [993064-2]
> "_mailermailer".sl_action_seq;
> Oct 6 11:40:32 staging postgres[48728]: [993065-1] LOG: duration:
> 0.519 ms statement: rollback transaction;
> Oct 6 11:40:33 staging postgres[48728]: [993066-1] LOG: duration:
> 0.657 ms statement: start transaction;set transaction isolation level
> serializable;select last_value from
> Oct 6 11:40:33 staging postgres[48728]: [993066-2]
> "_mailermailer".sl_action_seq;
> Oct 6 11:40:33 staging postgres[48728]: [993067-1] LOG: duration:
> 0.520 ms statement: rollback transaction;
>
>
> this rollback transaction is worrisome to me... it is happening once
> per second. the disk space used is not growing much, either.
>
> The slony log on the slave reads like this for this table. Several
> other tables were successfully copied before:
>
> DEBUG2 remoteWorkerThread_2: copy table public.msg_recipients
> DEBUG2 remoteListenThread_2: queue event 2,622 SYNC
> DEBUG2 syncThread: new sl_action_seq 1 - SYNC 563
> DEBUG2 localListenThread: Received event 1,563 SYNC
> DEBUG2 remoteListenThread_2: queue event 2,623 SYNC
> DEBUG2 syncThread: new sl_action_seq 1 - SYNC 564
> DEBUG2 localListenThread: Received event 1,564 SYNC
> DEBUG1 cleanupThread: 0.062 seconds for cleanupEvent()
> DEBUG1 cleanupThread: 0.045 seconds for delete logs
> DEBUG2 cleanupThread: 0.449 seconds for vacuuming
> [[ elided ]]
> DEBUG1 cleanupThread: 0.006 seconds for cleanupEvent()
> DEBUG1 cleanupThread: 0.069 seconds for delete logs
> DEBUG2 cleanupThread: 0.465 seconds for vacuuming
> DEBUG2 syncThread: new sl_action_seq 1 - SYNC 8212
> DEBUG2 localListenThread: Received event 1,8212 SYNC
> DEBUG2 remoteListenThread_2: queue event 2,8397 SYNC
> DEBUG2 syncThread: new sl_action_seq 1 - SYNC 8213
> DEBUG2 localListenThread: Received event 1,8213 SYNC
> DEBUG2 remoteListenThread_2: queue event 2,8398 SYNC
>
> which is the tail of the log right now. The start of the copy above
> happened at 00:02 last friday!
It started copying, but never actually get's that done. Is there
anything holding any lock on that table, either on the master or the slave?
Jan
>
> This table has 165M rows in it: two integers and a varchar(4), a PK
> comprising both integers, and a regular index on the second integer.
>
> Replicating the development copy of this same db worked flawlessly
> (took just a few seconds, since it is tiny). This is a snapshot of my
> production db, which is why it is huge.
>
>
>
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Slony1-general mailing list
> Slony1-general at gborg.postgresql.org
> http://gborg.postgresql.org/mailman/listinfo/slony1-general
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck at Yahoo.com #
- Previous message: [Slony1-general] initial copy going for 6 days now...
- Next message: [Slony1-general] initial copy going for 6 days now...
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the Slony1-general mailing list