Fri Feb 24 07:41:52 PST 2012
- Previous message: [Slony1-general] Slony replication stops right after start Slony replication stops right after start
- Next message: [Slony1-general] Slony replication stops right after start Slony replication stops right after start
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hi, Sorry, seems I forgot to post versions: Slony: 2.0.7 PostgreSQL: 9.0.5 I restarted the slon deamon with debug=4 on both nodes and this is what I got: Origin: 2012-02-24 16:20:50 CETDEBUG2 localListenThread: Received event 1,5000002765 SYNC 2012-02-24 16:20:50 CETDEBUG2 syncThread: new sl_action_seq 45211 - SYNC 5000002766 2012-02-24 16:20:52 CETDEBUG2 localListenThread: Received event 1,5000002766 SYNC 2012-02-24 16:20:52 CETDEBUG2 syncThread: new sl_action_seq 45233 - SYNC 5000002767 2012-02-24 16:20:54 CETDEBUG2 localListenThread: Received event 1,5000002767 SYNC 2012-02-24 16:20:54 CETDEBUG2 syncThread: new sl_action_seq 45263 - SYNC 5000002768 2012-02-24 16:20:56 CETDEBUG2 localListenThread: Received event 1,5000002768 SYNC 2012-02-24 16:20:56 CETDEBUG2 remoteWorkerThread_2: forward confirm 1,5000002768 received by 2 2012-02-24 16:21:04 CETDEBUG2 syncThread: new sl_action_seq 45263 - SYNC 5000002769 2012-02-24 16:21:06 CETDEBUG2 remoteListenThread_2: queue event 2,5000001298 SYNC 2012-02-24 16:21:06 CETDEBUG2 remoteWorkerThread_2: Received event #2 from 5000001298 type:SYNC 2012-02-24 16:21:06 CETDEBUG1 calc sync size - last time: 1 last length: 18005 ideal: 3 proposed size: 3 2012-02-24 16:21:06 CETDEBUG2 remoteWorkerThread_2: SYNC 5000001298 processing 2012-02-24 16:21:06 CETDEBUG1 remoteWorkerThread_2: no sets need syncing for this event 2012-02-24 16:21:08 CETDEBUG2 localListenThread: Received event 1,5000002769 SYNC 2012-02-24 16:21:08 CETDEBUG2 syncThread: new sl_action_seq 45292 - SYNC 5000002770 2012-02-24 16:21:08 CETDEBUG2 remoteListenThread_2: queue event 2,5000001299 SYNC 2012-02-24 16:21:08 CETDEBUG2 remoteWorkerThread_2: Received event #2 from 5000001299 type:SYNC 2012-02-24 16:21:08 CETDEBUG1 calc sync size - last time: 1 last length: 2003 ideal: 29 proposed size: 3 2012-02-24 16:21:08 CETDEBUG2 remoteWorkerThread_2: SYNC 5000001299 processing 2012-02-24 16:21:08 CETDEBUG1 remoteWorkerThread_2: no sets need syncing for this event 2012-02-24 16:21:10 CETDEBUG2 localListenThread: Received event 1,5000002770 SYNC 2012-02-24 16:21:10 CETDEBUG2 syncThread: new sl_action_seq 45322 - SYNC 5000002771 2012-02-24 16:21:12 CETDEBUG2 localListenThread: Received event 1,5000002771 SYNC 2012-02-24 16:21:12 CETDEBUG2 syncThread: new sl_action_seq 45382 - SYNC 5000002772 2012-02-24 16:21:14 CETDEBUG2 localListenThread: Received event 1,5000002772 SYNC 2012-02-24 16:21:16 CETDEBUG2 remoteWorkerThread_2: forward confirm 1,5000002772 received by 2 2012-02-24 16:21:18 CETDEBUG2 syncThread: new sl_action_seq 45411 - SYNC 5000002773 2012-02-24 16:21:20 CETDEBUG2 localListenThread: Received event 1,5000002773 SYNC 2012-02-24 16:21:26 CETDEBUG2 syncThread: new sl_action_seq 45417 - SYNC 5000002774 2012-02-24 16:21:26 CETDEBUG2 remoteListenThread_2: queue event 2,5000001300 SYNC 2012-02-24 16:21:26 CETDEBUG2 remoteWorkerThread_2: Received event #2 from 5000001300 type:SYNC 2012-02-24 16:21:26 CETDEBUG1 calc sync size - last time: 1 last length: 18012 ideal: 3 proposed size: 3 2012-02-24 16:21:26 CETDEBUG2 remoteWorkerThread_2: SYNC 5000001300 processing 2012-02-24 16:21:26 CETDEBUG1 remoteWorkerThread_2: no sets need syncing for this event 2012-02-24 16:21:26 CETDEBUG2 remoteWorkerThread_2: forward confirm 1,5000002773 received by 2 2012-02-24 16:21:28 CETDEBUG2 syncThread: new sl_action_seq 45430 - SYNC 5000002775 2012-02-24 16:21:28 CETDEBUG2 remoteListenThread_2: queue event 2,5000001301 SYNC 2012-02-24 16:21:28 CETDEBUG2 remoteWorkerThread_2: Received event #2 from 5000001301 type:SYNC 2012-02-24 16:21:28 CETDEBUG1 calc sync size - last time: 1 last length: 2004 ideal: 29 proposed size: 3 2012-02-24 16:21:28 CETDEBUG2 remoteWorkerThread_2: SYNC 5000001301 processing 2012-02-24 16:21:28 CETDEBUG1 remoteWorkerThread_2: no sets need syncing for this event 2012-02-24 16:21:30 CETDEBUG2 syncThread: new sl_action_seq 45460 - SYNC 5000002776 2012-02-24 16:21:30 CETDEBUG2 remoteWorkerThread_2: forward confirm 1,5000002775 received by 2 2012-02-24 16:21:32 CETDEBUG2 localListenThread: Received event 1,5000002774 SYNC 2012-02-24 16:21:32 CETDEBUG2 localListenThread: Received event 1,5000002775 SYNC 2012-02-24 16:21:32 CETDEBUG2 localListenThread: Received event 1,5000002776 SYNC Subscriber: 2012-02-24 16:28:53 CETDEBUG2 ssy_action_list length: 0 2012-02-24 16:28:53 CETDEBUG2 remoteWorkerThread_1: current local log_status is 2 2012-02-24 16:28:53 CETDEBUG3 remoteWorkerThread_1: activate helper 1 2012-02-24 16:28:53 CETDEBUG4 remoteWorkerThread_1: waiting for log data 2012-02-24 16:28:53 CETDEBUG4 remoteHelperThread_1_1: got work to do 2012-02-24 16:28:53 CETDEBUG2 remoteWorkerThread_1_1: current remote log_status = 1 2012-02-24 16:28:53 CETDEBUG4 remoteHelperThread_1_1: allocate line buffers 2012-02-24 16:28:53 CETDEBUG4 remoteHelperThread_1_1: fetch from cursor 2012-02-24 16:28:53 CETDEBUG1 remoteHelperThread_1_1: 0.002 seconds delay for first row 2012-02-24 16:28:53 CETDEBUG4 remoteHelperThread_1_1: fetched 0 log rows 2012-02-24 16:28:53 CETDEBUG4 remoteHelperThread_1_1: return 50 unused line buffers 2012-02-24 16:28:53 CETDEBUG1 remoteHelperThread_1_1: 0.003 seconds until close cursor 2012-02-24 16:28:53 CETDEBUG1 remoteHelperThread_1_1: inserts=0 updates=0 deletes=0 2012-02-24 16:28:53 CETDEBUG1 remoteWorkerThread_1: sync_helper timing: pqexec (s/count)- provider 0.002/3 - subscriber 0.000/3 2012-02-24 16:28:53 CETDEBUG1 remoteWorkerThread_1: sync_helper timing: large tuples 0.000/0 2012-02-24 16:28:53 CETDEBUG4 remoteHelperThread_1_1: change helper thread status 2012-02-24 16:28:53 CETDEBUG4 remoteHelperThread_1_1: send DONE/ERROR line to worker 2012-02-24 16:28:53 CETDEBUG3 remoteHelperThread_1_1: waiting for workgroup to finish 2012-02-24 16:28:53 CETDEBUG3 remoteWorkerThread_1: helper 1 finished 2012-02-24 16:28:53 CETDEBUG4 remoteWorkerThread_1: returning lines to pool 2012-02-24 16:28:53 CETDEBUG3 remoteWorkerThread_1: all helpers done. 2012-02-24 16:28:53 CETDEBUG4 remoteWorkerThread_1: changing helper 1 to IDLE 2012-02-24 16:28:53 CETDEBUG2 remoteWorkerThread_1: cleanup 2012-02-24 16:28:53 CETDEBUG4 remoteHelperThread_1_1: waiting for work 2012-02-24 16:28:53 CETINFO remoteWorkerThread_1: SYNC 5000002873 done in 0.014 seconds 2012-02-24 16:28:53 CETDEBUG1 remoteWorkerThread_1: SYNC 5000002873 sync_event timing: pqexec (s/count)- provider 0.001/1 - subsc riber 0.008/1 - IUD 0.001/4 2012-02-24 16:28:55 CETDEBUG2 syncThread: new sl_action_seq 1 - SYNC 5000001345 2012-02-24 16:28:59 CETDEBUG2 localListenThread: Received event 2,5000001345 SYNC 2012-02-24 16:29:05 CETDEBUG2 syncThread: new sl_action_seq 1 - SYNC 5000001346 2012-02-24 16:29:11 CETDEBUG2 localListenThread: Received event 2,5000001346 SYNC 2012-02-24 16:29:11 CETDEBUG2 remoteListenThread_1: queue event 1,5000002874 SYNC 2012-02-24 16:29:11 CETDEBUG2 remoteListenThread_1: queue event 1,5000002875 SYNC 2012-02-24 16:29:11 CETDEBUG2 remoteListenThread_1: queue event 1,5000002876 SYNC 2012-02-24 16:29:11 CETDEBUG2 remoteWorkerThread_1: Received event #1 from 5000002874 type:SYNC 2012-02-24 16:29:11 CETDEBUG1 calc sync size - last time: 1 last length: 18011 ideal: 3 proposed size: 3 2012-02-24 16:29:11 CETDEBUG2 remoteWorkerThread_1: SYNC 5000002876 processing 2012-02-24 16:29:11 CETDEBUG1 about to monitor_subscriber_query - pulling big actionid list for 1 2012-02-24 16:29:11 CETINFO remoteWorkerThread_1: syncing set 1 with 9 table(s) from provider 1 2012-02-24 16:29:11 CETDEBUG4 ssy_action_list value: 2012-02-24 16:29:11 CETDEBUG2 ssy_action_list length: 0 2012-02-24 16:29:11 CETINFO remoteWorkerThread_1: syncing set 2 with 15 table(s) from provider 1 2012-02-24 16:29:11 CETDEBUG4 ssy_action_list value: 2012-02-24 16:29:11 CETDEBUG2 ssy_action_list length: 0 2012-02-24 16:29:11 CETINFO remoteWorkerThread_1: syncing set 3 with 4 table(s) from provider 1 2012-02-24 16:29:11 CETDEBUG4 ssy_action_list value: 2012-02-24 16:29:11 CETDEBUG2 ssy_action_list length: 0 2012-02-24 16:29:11 CETINFO remoteWorkerThread_1: syncing set 4 with 6 table(s) from provider 1 2012-02-24 16:29:11 CETDEBUG4 ssy_action_list value: 2012-02-24 16:29:11 CETDEBUG2 ssy_action_list length: 0 2012-02-24 16:29:11 CETINFO remoteWorkerThread_1: syncing set 5 with 3 table(s) from provider 1 2012-02-24 16:29:11 CETDEBUG4 ssy_action_list value: 2012-02-24 16:29:11 CETDEBUG2 ssy_action_list length: 0 2012-02-24 16:29:11 CETINFO remoteWorkerThread_1: syncing set 6 with 4 table(s) from provider 1 2012-02-24 16:29:11 CETDEBUG4 ssy_action_list value: 2012-02-24 16:29:11 CETDEBUG2 ssy_action_list length: 0 2012-02-24 16:29:11 CETINFO remoteWorkerThread_1: syncing set 7 with 3 table(s) from provider 1 2012-02-24 16:29:11 CETDEBUG4 ssy_action_list value: 2012-02-24 16:29:11 CETDEBUG2 ssy_action_list length: 0 2012-02-24 16:29:11 CETINFO remoteWorkerThread_1: syncing set 8 with 23 table(s) from provider 1 2012-02-24 16:29:11 CETDEBUG4 ssy_action_list value: 2012-02-24 16:29:11 CETDEBUG2 ssy_action_list length: 0 2012-02-24 16:29:11 CETINFO remoteWorkerThread_1: syncing set 9 with 8 table(s) from provider 1 2012-02-24 16:29:11 CETDEBUG4 ssy_action_list value: 2012-02-24 16:29:11 CETDEBUG2 ssy_action_list length: 0 2012-02-24 16:29:11 CETDEBUG2 remoteWorkerThread_1: current local log_status is 2 2012-02-24 16:29:11 CETDEBUG3 remoteWorkerThread_1: activate helper 1 2012-02-24 16:29:11 CETDEBUG4 remoteWorkerThread_1: waiting for log data 2012-02-24 16:29:11 CETDEBUG4 remoteHelperThread_1_1: got work to do 2012-02-24 16:29:11 CETDEBUG2 remoteWorkerThread_1_1: current remote log_status = 1 2012-02-24 16:29:11 CETDEBUG4 remoteHelperThread_1_1: allocate line buffers 2012-02-24 16:29:11 CETDEBUG4 remoteHelperThread_1_1: fetch from cursor 2012-02-24 16:29:11 CETDEBUG1 remoteHelperThread_1_1: 0.002 seconds delay for first row 2012-02-24 16:29:11 CETDEBUG4 remoteHelperThread_1_1: fetched 0 log rows 2012-02-24 16:29:11 CETDEBUG4 remoteHelperThread_1_1: return 50 unused line buffers 2012-02-24 16:29:11 CETDEBUG1 remoteHelperThread_1_1: 0.002 seconds until close cursor 2012-02-24 16:29:11 CETDEBUG1 remoteHelperThread_1_1: inserts=0 updates=0 deletes=0 2012-02-24 16:29:11 CETDEBUG1 remoteWorkerThread_1: sync_helper timing: pqexec (s/count)- provider 0.002/3 - subscriber 0.000/3 2012-02-24 16:29:11 CETDEBUG1 remoteWorkerThread_1: sync_helper timing: large tuples 0.000/0 2012-02-24 16:29:11 CETDEBUG4 remoteHelperThread_1_1: change helper thread status 2012-02-24 16:29:11 CETDEBUG4 remoteHelperThread_1_1: send DONE/ERROR line to worker 2012-02-24 16:29:11 CETDEBUG3 remoteHelperThread_1_1: waiting for workgroup to finish 2012-02-24 16:29:11 CETDEBUG3 remoteWorkerThread_1: helper 1 finished 2012-02-24 16:29:11 CETDEBUG4 remoteWorkerThread_1: returning lines to pool 2012-02-24 16:29:11 CETDEBUG3 remoteWorkerThread_1: all helpers done. 2012-02-24 16:29:11 CETDEBUG4 remoteWorkerThread_1: changing helper 1 to IDLE 2012-02-24 16:29:11 CETDEBUG2 remoteWorkerThread_1: cleanup 2012-02-24 16:29:11 CETDEBUG4 remoteHelperThread_1_1: waiting for work 2012-02-24 16:29:11 CETINFO remoteWorkerThread_1: SYNC 5000002876 done in 0.014 seconds 2012-02-24 16:29:11 CETDEBUG1 remoteWorkerThread_1: SYNC 5000002876 sync_event timing: pqexec (s/count)- provider 0.001/1 - subsc riber 0.008/1 - IUD 0.001/4 2012-02-24 16:29:11 CETDEBUG2 remoteWorkerThread_1: forward confirm 2,5000001346 received by 1 New SYNC events are being generated. I can find several rows in the origin's sl_event table with ev_origin=$originid. None of them however (including rows with ev_origin=$subscribedid) are older than about 15 minutes, even though the most recent data in the subscriber db is several hours old. I'm not really sure how to interpret the data in the pg_locks view, but none of them (neither origin or subscriber) have any rows where the "granted" column isn't set to TRUE, if that means anything? /Ulas On Fri, Feb 24, 2012 at 3:34 PM, Steve Singer <ssinger at ca.afilias.info> wrote: > On 12-02-24 08:21 AM, Ulas Albayrak wrote: > > You didn't say what version of slony you are using with which version of > postgresql. > > I don't see anything in the logs you posted about the slon for the origin > node generating sync events. At DEBUG2 or higher (at least ons some > versions of slony) you should be getting "syncThread: new sl_action_seq %s " > type messages in the log for the slon origin. > > Are new SYNC events being generated in the origin sl_event table with > ev_origin=$originid? > > Many versions of slony require an exclusive lock on sl_event to generate > sync events. Do you have something preventing this? (ie look in pg_locks > to see if the slony sync connection is waiting on a lock). > > > > > >> Hi, >> >> I have been trying to set up a small Slony cluster (only 2 nodes) for >> the last 2 days but I can't get it to work. Everytime I get the same >> result: The replication starts of fine. Slony start copying, trying to >> get all the tables in the subscribing node up to speed. But somewhere >> along the way the 2nd node stops getting updates. Slony replicates all >> the data in a specific table up to a specific point in time and then >> no more. And this time seems to coincide with when the copying of data >> for that specific table started. >> >> An example to illustrate the scenario: >> >> Let's say I have set up the whole replication system and then at 12:00 >> I start the actual replication. Around 12:05 copying of table A from >> node 1 to node 2 starts. It finishes but only the data that was >> received before 12:05 get copied to node 2. Then at 12:10 copying of >> table B starts. Same thing here: Slony copies all the data that was >> received before 12:10 to node 2. And this is the same for all tables. >> >> The logs for the slon deamons show: >> >> Origin node: >> NOTICE: Slony-I: cleanup stale sl_nodelock entry for pid=21942 >> CONTEXT: SQL statement "SELECT "_fleetcluster".cleanupNodelock()" >> PL/pgSQL function "cleanupevent" line 83 at PERFORM >> NOTICE: Slony-I: cleanup stale sl_nodelock entry for pid=21945 >> CONTEXT: SQL statement "SELECT "_fleetcluster".cleanupNodelock()" >> PL/pgSQL function "cleanupevent" line 83 at PERFORM >> NOTICE: Slony-I: Logswitch to sl_log_2 initiated >> CONTEXT: SQL statement "SELECT "_fleetcluster".logswitch_start()" >> PL/pgSQL function "cleanupevent" line 101 at PERFORM >> 2012-02-24 12:17:39 CETINFO cleanupThread: 0.019 seconds for >> cleanupEvent() >> NOTICE: Slony-I: cleanup stale sl_nodelock entry for pid=21949 >> CONTEXT: SQL statement "SELECT "_fleetcluster".cleanupNodelock()" >> PL/pgSQL function "cleanupevent" line 83 at PERFORM >> NOTICE: Slony-I: cleanup stale sl_nodelock entry for pid=23779 >> CONTEXT: SQL statement "SELECT "_fleetcluster".cleanupNodelock()" >> PL/pgSQL function "cleanupevent" line 83 at PERFORM >> >> Subscribing node: >> 2012-02-24 13:20:23 CETINFO remoteWorkerThread_1: SYNC 5000000856 >> done in 0.012 seconds >> 2012-02-24 13:20:41 CETINFO remoteWorkerThread_1: syncing set 1 with >> 9 table(s) from provider 1 >> 2012-02-24 13:20:41 CETINFO remoteWorkerThread_1: syncing set 2 with >> 15 table(s) from provider 1 >> 2012-02-24 13:20:41 CETINFO remoteWorkerThread_1: syncing set 3 with >> 4 table(s) from provider 1 >> 2012-02-24 13:20:41 CETINFO remoteWorkerThread_1: syncing set 4 with >> 6 table(s) from provider 1 >> 2012-02-24 13:20:41 CETINFO remoteWorkerThread_1: syncing set 5 with >> 3 table(s) from provider 1 >> 2012-02-24 13:20:41 CETINFO remoteWorkerThread_1: syncing set 6 with >> 4 table(s) from provider 1 >> 2012-02-24 13:20:41 CETINFO remoteWorkerThread_1: syncing set 7 with >> 3 table(s) from provider 1 >> 2012-02-24 13:20:41 CETINFO remoteWorkerThread_1: syncing set 8 with >> 23 table(s) from provider 1 >> 2012-02-24 13:20:41 CETINFO remoteWorkerThread_1: syncing set 9 with >> 8 table(s) from provider 1 >> 2012-02-24 13:20:41 CETINFO remoteWorkerThread_1: SYNC 5000000857 >> done in 0.014 seconds >> 2012-02-24 13:20:43 CETINFO remoteWorkerThread_1: syncing set 1 with >> 9 table(s) from provider 1 >> 2012-02-24 13:20:43 CETINFO remoteWorkerThread_1: syncing set 2 with >> 15 table(s) from provider 1 >> 2012-02-24 13:20:43 CETINFO remoteWorkerThread_1: syncing set 3 with >> 4 table(s) from provider 1 >> 2012-02-24 13:20:43 CETINFO remoteWorkerThread_1: syncing set 4 with >> 6 table(s) from provider 1 >> 2012-02-24 13:20:43 CETINFO remoteWorkerThread_1: syncing set 5 with >> 3 table(s) from provider 1 >> 2012-02-24 13:20:43 CETINFO remoteWorkerThread_1: syncing set 6 with >> 4 table(s) from provider 1 >> 2012-02-24 13:20:43 CETINFO remoteWorkerThread_1: syncing set 7 with >> 3 table(s) from provider 1 >> 2012-02-24 13:20:43 CETINFO remoteWorkerThread_1: syncing set 8 with >> 23 table(s) from provider 1 >> 2012-02-24 13:20:43 CETINFO remoteWorkerThread_1: syncing set 9 with >> 8 table(s) from provider 1 >> 2012-02-24 13:20:43 CETINFO remoteWorkerThread_1: SYNC 5000000858 >> done in 0.011 seconds >> >> >> >> Have anyone experienced this before or have any idea what could be causing >> this? >> > -- Ulas Albayrak ulas.albayrak at gmail.com
- Previous message: [Slony1-general] Slony replication stops right after start Slony replication stops right after start
- Next message: [Slony1-general] Slony replication stops right after start Slony replication stops right after start
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the Slony1-general mailing list