Sat Apr 30 17:33:17 PDT 2005
- Previous message: [Slony1-general] 1.1 Beta
- Next message: [Slony1-general] transaction still in progress
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Running Slony 1.1 beta with postgresql-7.4.7. Simple master slave configuration, hosts k and j, using IDs 1 and 2, aliased as @knode and @jnode. Replicating 5 tables nicely. This script (I've inlined an include) attempts a switchover of set 1 from k to j. -- define cluster T1; define knode 1; define jnode 2; define korigin origin = @knode; define jorigin origin = @jnode; define kid id = @knode; define jid id = @jnode; define setOpmet id = 1; cluster name = @cluster; node @knode admin conninfo = 'dbname=avbrief host=k port=5432 user=slony password='; node @jnode admin conninfo = 'dbname=avbrief host=j port=5432 user=slony password='; lock set (@setOpmet, @korigin); wait for event (@korigin, confirmed = @jnode); move set (@setOpmet, old origin = @knode, new origin = @jnode); wait for event (@korigin, confirmed = @jnode); -- The script never returns. On k (the original origin) I get: 2005-04-30 16:09:40 UTC DEBUG2 syncThread: new sl_action_seq 664 - SYNC 48 2005-04-30 16:09:41 UTC DEBUG2 remoteWorkerThread_2: forward confirm 1,48 received by 2 2005-04-30 16:09:41 UTC DEBUG2 localListenThread: Received event 1,48 SYNC 2005-04-30 16:09:41 UTC DEBUG2 localListenThread: Received event 1,49 SYNC 2005-04-30 16:09:41 UTC DEBUG2 localListenThread: Received event 1,50 MOVE_SET 2005-04-30 16:09:41 UTC CONFIG moveSet: set_id=1 old_origin=1 new_origin=2 2005-04-30 16:09:41 UTC DEBUG2 sched_wakeup_node(): no_id=1 (0 threads + worker signaled) 2005-04-30 16:09:41 UTC DEBUG2 sched_wakeup_node(): no_id=2 (0 threads + worker signaled) 2005-04-30 16:09:41 UTC DEBUG4 remoteWorkerThread_2: update provider configuration 2005-04-30 16:09:41 UTC DEBUG1 remoteWorkerThread_2: helper thread for provider 2 created 2005-04-30 16:09:41 UTC DEBUG4 remoteWorkerThread_2: added active set 1 to provider 2 2005-04-30 16:09:41 UTC CONFIG storeListen: li_origin=2 li_receiver=1 li_provider=2 2005-04-30 16:09:41 UTC DEBUG2 sched_wakeup_node(): no_id=2 (0 threads + worker signaled) 2005-04-30 16:09:41 UTC DEBUG2 sched_wakeup_node(): no_id=2 (0 threads + worker signaled) 2005-04-30 16:09:41 UTC DEBUG4 remoteWorkerThread_2: update provider configuration 2005-04-30 16:09:41 UTC DEBUG4 remoteWorkerThread_2: added active set 1 to provider 2 2005-04-30 16:09:41 UTC DEBUG4 remoteHelperThread_2_2: waiting for work 2005-04-30 16:09:41 UTC DEBUG2 remoteWorkerThread_2: forward confirm 1,49 received by 2 2005-04-30 16:09:41 UTC DEBUG2 remoteListenThread_2: queue event 2,33 ACCEPT_SET 2005-04-30 16:09:41 UTC DEBUG2 remoteWorkerThread_2: Received event 2,33 ACCEPT_SET 2005-04-30 16:09:41 UTC DEBUG1 slon: done 2005-04-30 16:09:41 UTC DEBUG2 slon: exit(0) On j (the new origin) I get: 2005-04-30 16:09:40 UTC DEBUG2 remoteListenThread_1: queue event 1,48 SYNC 2005-04-30 16:09:40 UTC DEBUG2 remoteWorkerThread_1: Received event 1,48 SYNC 2005-04-30 16:09:40 UTC DEBUG3 calc sync size - last time: 1 last length: 10083 ideal: 5 proposed size: 2 2005-04-30 16:09:40 UTC DEBUG2 remoteWorkerThread_1: SYNC 48 processing 2005-04-30 16:09:40 UTC DEBUG2 remoteWorkerThread_1: syncing set 1 with 5 table(s) from provider 1 2005-04-30 16:09:40 UTC DEBUG3 remoteWorkerThread_1: activate helper 1 2005-04-30 16:09:40 UTC DEBUG4 remoteWorkerThread_1: waiting for log data 2005-04-30 16:09:40 UTC DEBUG4 remoteHelperThread_1_1: allocate lines 2005-04-30 16:09:40 UTC DEBUG4 remoteHelperThread_1_1: have 10 line buffers 2005-04-30 16:09:40 UTC DEBUG2 remoteHelperThread_1_1: 0.003 seconds delay for first row 2005-04-30 16:09:40 UTC DEBUG3 remoteHelperThread_1_1: got 0 log rows 2005-04-30 16:09:40 UTC DEBUG3 remoteHelperThread_1_1: 0 log buffers delivered 2005-04-30 16:09:40 UTC DEBUG4 remoteHelperThread_1_1: no more log rows 2005-04-30 16:09:40 UTC DEBUG2 remoteHelperThread_1_1: 0.005 seconds until close cursor 2005-04-30 16:09:40 UTC DEBUG4 remoteHelperThread_1_1: change helper thread status 2005-04-30 16:09:40 UTC DEBUG4 remoteHelperThread_1_1: send DONE/ERROR line to worker 2005-04-30 16:09:40 UTC DEBUG3 remoteHelperThread_1_1: waiting for workgroup to finish 2005-04-30 16:09:40 UTC DEBUG3 remoteWorkerThread_1: helper 1 finished 2005-04-30 16:09:40 UTC DEBUG4 remoteWorkerThread_1: returning lines to pool 2005-04-30 16:09:40 UTC DEBUG3 remoteWorkerThread_1: all helpers done. 2005-04-30 16:09:40 UTC DEBUG4 remoteWorkerThread_1: changing helper 1 to IDLE 2005-04-30 16:09:40 UTC DEBUG4 remoteWorkerThread_1: cleanup 2005-04-30 16:09:40 UTC DEBUG4 remoteHelperThread_1_1: waiting for work 2005-04-30 16:09:40 UTC DEBUG2 remoteWorkerThread_1: new sl_rowid_seq value: 1000000000000000 2005-04-30 16:09:40 UTC DEBUG2 remoteWorkerThread_1: SYNC 48 done in 0.019 seconds 2005-04-30 16:09:41 UTC DEBUG2 remoteListenThread_1: queue event 1,49 SYNC 2005-04-30 16:09:41 UTC DEBUG2 remoteListenThread_1: queue event 1,50 MOVE_SET 2005-04-30 16:09:41 UTC DEBUG2 remoteWorkerThread_1: Received event 1,49 SYNC 2005-04-30 16:09:41 UTC DEBUG3 calc sync size - last time: 1 last length: 133 ideal: 451 proposed size: 2 2005-04-30 16:09:41 UTC DEBUG2 remoteWorkerThread_1: SYNC 49 processing 2005-04-30 16:09:41 UTC DEBUG2 remoteWorkerThread_1: syncing set 1 with 5 table(s) from provider 1 2005-04-30 16:09:41 UTC DEBUG3 remoteWorkerThread_1: activate helper 1 2005-04-30 16:09:41 UTC DEBUG4 remoteWorkerThread_1: waiting for log data 2005-04-30 16:09:41 UTC DEBUG4 remoteHelperThread_1_1: allocate lines 2005-04-30 16:09:41 UTC DEBUG4 remoteHelperThread_1_1: have 10 line buffers 2005-04-30 16:09:41 UTC DEBUG2 remoteHelperThread_1_1: 0.005 seconds delay for first row 2005-04-30 16:09:41 UTC DEBUG3 remoteHelperThread_1_1: got 0 log rows 2005-04-30 16:09:41 UTC DEBUG3 remoteHelperThread_1_1: 0 log buffers delivered 2005-04-30 16:09:41 UTC DEBUG4 remoteHelperThread_1_1: no more log rows 2005-04-30 16:09:41 UTC DEBUG2 remoteHelperThread_1_1: 0.007 seconds until close cursor 2005-04-30 16:09:41 UTC DEBUG4 remoteHelperThread_1_1: change helper thread status 2005-04-30 16:09:41 UTC DEBUG4 remoteHelperThread_1_1: send DONE/ERROR line to worker 2005-04-30 16:09:41 UTC DEBUG3 remoteHelperThread_1_1: waiting for workgroup to finish 2005-04-30 16:09:41 UTC DEBUG3 remoteWorkerThread_1: helper 1 finished 2005-04-30 16:09:41 UTC DEBUG4 remoteWorkerThread_1: returning lines to pool 2005-04-30 16:09:41 UTC DEBUG3 remoteWorkerThread_1: all helpers done. 2005-04-30 16:09:41 UTC DEBUG4 remoteWorkerThread_1: changing helper 1 to IDLE 2005-04-30 16:09:41 UTC DEBUG4 remoteWorkerThread_1: cleanup 2005-04-30 16:09:41 UTC DEBUG4 remoteHelperThread_1_1: waiting for work 2005-04-30 16:09:41 UTC DEBUG2 remoteWorkerThread_1: new sl_rowid_seq value: 1000000000000000 2005-04-30 16:09:41 UTC DEBUG2 remoteWorkerThread_1: SYNC 49 done in 0.022 seconds 2005-04-30 16:09:41 UTC DEBUG2 remoteWorkerThread_1: Received event 1,50 MOVE_SET 2005-04-30 16:09:41 UTC CONFIG moveSet: set_id=1 old_origin=1 new_origin=2 2005-04-30 16:09:41 UTC DEBUG2 sched_wakeup_node(): no_id=1 (0 threads + worker signaled) 2005-04-30 16:09:41 UTC DEBUG2 sched_wakeup_node(): no_id=2 (0 threads + worker signaled) 2005-04-30 16:09:41 UTC CONFIG storeListen: li_origin=1 li_receiver=2 li_provider=1 2005-04-30 16:09:41 UTC DEBUG2 sched_wakeup_node(): no_id=1 (0 threads + worker signaled) 2005-04-30 16:09:41 UTC DEBUG2 sched_wakeup_node(): no_id=1 (0 threads + worker signaled) 2005-04-30 16:09:41 UTC DEBUG4 remoteWorkerThread_1: update provider configuration 2005-04-30 16:09:41 UTC DEBUG1 remoteWorkerThread_1: helper thread for provider 1 terminated 2005-04-30 16:09:41 UTC DEBUG1 remoteWorkerThread_1: disconnecting from data provider 1 2005-04-30 16:09:41 UTC DEBUG2 localListenThread: Received event 2,33 ACCEPT_SET 2005-04-30 16:09:41 UTC FATAL localListenThread: event 33: Unknown event type ACCEPT_SET 2005-04-30 16:09:41 UTC DEBUG1 slon: shutdown requested 2005-04-30 16:09:41 UTC DEBUG2 slon: notify worker process to shutdown 2005-04-30 16:09:41 UTC INFO remoteListenThread_1: disconnecting from 'dbname=avbrief host=k port=5432 user=slony password=' 2005-04-30 16:09:41 UTC DEBUG1 remoteListenThread_1: thread done 2005-04-30 16:09:41 UTC DEBUG2 slon: wait for worker process to shutdown 2005-04-30 16:09:41 UTC DEBUG1 syncThread: thread done 2005-04-30 16:09:41 UTC DEBUG1 cleanupThread: thread done 2005-04-30 16:09:41 UTC DEBUG1 main: scheduler mainloop returned 2005-04-30 16:09:41 UTC DEBUG2 main: wait for remote threads 2005-04-30 16:09:41 UTC DEBUG4 remoteWorkerThread_1: update provider configuration 2005-04-30 16:09:41 UTC DEBUG1 remoteWorkerThread_1: thread done 2005-04-30 16:09:41 UTC DEBUG2 sched_wakeup_node(): no_id=1 (0 threads + worker signaled) 2005-04-30 16:09:41 UTC DEBUG2 main: notify parent that worker is done 2005-04-30 16:09:41 UTC DEBUG1 main: done 2005-04-30 16:09:41 UTC DEBUG2 slon: worker process shutdown ok 2005-04-30 16:09:41 UTC DEBUG2 slon: exit(-1) Attempting to restart slon on k fails (repeatedly) 2005-04-30 16:20:48 UTC CONFIG main: slon version 1.1.0 starting up 2005-04-30 16:20:48 UTC CONFIG main: local node id = 1 2005-04-30 16:20:48 UTC DEBUG2 main: main process started 2005-04-30 16:20:48 UTC DEBUG2 slon: watchdog process started 2005-04-30 16:20:48 UTC DEBUG2 slon: begin signal handler setup 2005-04-30 16:20:48 UTC DEBUG2 slon: end signal handler setup 2005-04-30 16:20:48 UTC DEBUG2 slon: wait for main child process 2005-04-30 16:20:48 UTC DEBUG2 main: begin signal handler setup 2005-04-30 16:20:48 UTC DEBUG2 main: end signal handler setup 2005-04-30 16:20:48 UTC CONFIG main: launching sched_start_mainloop 2005-04-30 16:20:48 UTC CONFIG main: loading current cluster configuration 2005-04-30 16:20:48 UTC CONFIG storeNode: no_id=2 no_comment='Node 2' 2005-04-30 16:20:48 UTC DEBUG2 setNodeLastEvent: no_id=2 event_seq=32 2005-04-30 16:20:48 UTC CONFIG storePath: pa_server=2 pa_client=1 pa_conninfo="dbname=avbrief host=j port=5432 user=slony password=" pa_connretry=10 2005-04-30 16:20:48 UTC CONFIG storeListen: li_origin=2 li_receiver=1 li_provider=2 2005-04-30 16:20:48 UTC CONFIG storeSet: set_id=1 set_origin=2 set_comment='avbrief tables' 2005-04-30 16:20:48 UTC WARN remoteWorker_wakeup: node 2 - no worker thread 2005-04-30 16:20:48 UTC DEBUG2 sched_wakeup_node(): no_id=2 (0 threads + worker signaled) 2005-04-30 16:20:48 UTC CONFIG storeSubscribe: sub_set=1 sub_provider=2 sub_forward='t' 2005-04-30 16:20:48 UTC WARN remoteWorker_wakeup: node 2 - no worker thread 2005-04-30 16:20:48 UTC DEBUG2 sched_wakeup_node(): no_id=2 (0 threads + worker signaled) 2005-04-30 16:20:48 UTC CONFIG enableSubscription: sub_set=1 2005-04-30 16:20:48 UTC WARN remoteWorker_wakeup: node 2 - no worker thread 2005-04-30 16:20:48 UTC DEBUG2 sched_wakeup_node(): no_id=2 (0 threads + worker signaled) 2005-04-30 16:20:48 UTC DEBUG2 main: last local event sequence = 50 2005-04-30 16:20:48 UTC CONFIG main: configuration complete - starting threads 2005-04-30 16:20:48 UTC DEBUG1 localListenThread: thread starts 2005-04-30 16:20:48 UTC CONFIG enableNode: no_id=2 2005-04-30 16:20:48 UTC DEBUG1 remoteWorkerThread_2: thread starts 2005-04-30 16:20:48 UTC DEBUG1 remoteListenThread_2: thread starts 2005-04-30 16:20:48 UTC DEBUG1 cleanupThread: thread starts 2005-04-30 16:20:48 UTC DEBUG4 cleanupThread: bias = 35383 2005-04-30 16:20:48 UTC DEBUG2 remoteListenThread_2: start listening for event origin 2 2005-04-30 16:20:48 UTC DEBUG1 syncThread: thread starts 2005-04-30 16:20:48 UTC DEBUG1 main: running scheduler mainloop 2005-04-30 16:20:48 UTC DEBUG4 remoteWorkerThread_2: update provider configuration 2005-04-30 16:20:48 UTC DEBUG1 remoteWorkerThread_2: helper thread for provider 2 created 2005-04-30 16:20:48 UTC DEBUG4 remoteWorkerThread_2: added active set 1 to provider 2 2005-04-30 16:20:48 UTC DEBUG4 remoteHelperThread_2_2: waiting for work 2005-04-30 16:20:48 UTC DEBUG1 remoteListenThread_2: connected to 'dbname=avbrief host=j port=5432 user=slony password=' 2005-04-30 16:20:48 UTC DEBUG2 remoteListenThread_2: queue event 2,33 ACCEPT_SET 2005-04-30 16:20:48 UTC DEBUG2 remoteWorkerThread_2: Received event 2,33 ACCEPT_SET 2005-04-30 16:20:48 UTC DEBUG2 remoteListenThread_2: queue event 2,34 SYNC 2005-04-30 16:20:48 UTC DEBUG2 remoteListenThread_2: queue event 2,35 SYNC 2005-04-30 16:20:48 UTC DEBUG2 remoteListenThread_2: queue event 2,36 SYNC 2005-04-30 16:20:48 UTC DEBUG2 remoteListenThread_2: queue event 2,37 SYNC 2005-04-30 16:20:48 UTC DEBUG2 remoteListenThread_2: queue event 2,38 SYNC 2005-04-30 16:20:48 UTC DEBUG2 remoteListenThread_2: queue event 2,39 SYNC 2005-04-30 16:20:48 UTC DEBUG2 remoteListenThread_2: queue event 2,40 SYNC 2005-04-30 16:20:48 UTC DEBUG2 remoteListenThread_2: queue event 2,41 SYNC 2005-04-30 16:20:48 UTC DEBUG2 remoteListenThread_2: queue event 2,42 SYNC 2005-04-30 16:20:48 UTC DEBUG2 remoteListenThread_2: queue event 2,43 SYNC 2005-04-30 16:20:48 UTC DEBUG2 remoteListenThread_2: queue event 2,44 SYNC 2005-04-30 16:20:48 UTC DEBUG2 remoteListenThread_2: queue event 2,45 SYNC 2005-04-30 16:20:48 UTC DEBUG1 slon: done 2005-04-30 16:20:48 UTC DEBUG2 slon: exit(0) Attempting to restart slon on j succeeds 2005-04-30 16:19:36 UTC CONFIG main: slon version 1.1.0 starting up 2005-04-30 16:19:36 UTC CONFIG main: local node id = 2 2005-04-30 16:19:36 UTC DEBUG2 main: main process started 2005-04-30 16:19:36 UTC DEBUG2 slon: watchdog process started 2005-04-30 16:19:36 UTC DEBUG2 slon: begin signal handler setup 2005-04-30 16:19:36 UTC DEBUG2 slon: end signal handler setup 2005-04-30 16:19:36 UTC DEBUG2 slon: wait for main child process 2005-04-30 16:19:36 UTC DEBUG2 main: begin signal handler setup 2005-04-30 16:19:36 UTC DEBUG2 main: end signal handler setup 2005-04-30 16:19:36 UTC CONFIG main: launching sched_start_mainloop 2005-04-30 16:19:36 UTC CONFIG main: loading current cluster configuration 2005-04-30 16:19:36 UTC CONFIG storeNode: no_id=1 no_comment='Node 1' 2005-04-30 16:19:36 UTC DEBUG2 setNodeLastEvent: no_id=1 event_seq=50 2005-04-30 16:19:36 UTC CONFIG storePath: pa_server=1 pa_client=2 pa_conninfo="dbname=avbrief host=k port=5432 user=slony password=" pa_connretry=10 2005-04-30 16:19:36 UTC CONFIG storeListen: li_origin=1 li_receiver=2 li_provider=1 2005-04-30 16:19:36 UTC CONFIG storeSet: set_id=1 set_origin=2 set_comment='avbrief tables' 2005-04-30 16:19:36 UTC DEBUG2 sched_wakeup_node(): no_id=2 (0 threads + worker signaled) 2005-04-30 16:19:36 UTC DEBUG2 main: last local event sequence = 33 2005-04-30 16:19:36 UTC CONFIG main: configuration complete - starting threads 2005-04-30 16:19:36 UTC DEBUG1 localListenThread: thread starts 2005-04-30 16:19:36 UTC CONFIG enableNode: no_id=1 2005-04-30 16:19:36 UTC DEBUG1 remoteWorkerThread_1: thread starts 2005-04-30 16:19:36 UTC DEBUG1 remoteListenThread_1: thread starts 2005-04-30 16:19:36 UTC DEBUG2 remoteListenThread_1: start listening for event origin 1 2005-04-30 16:19:36 UTC DEBUG1 cleanupThread: thread starts 2005-04-30 16:19:36 UTC DEBUG4 cleanupThread: bias = 35383 2005-04-30 16:19:36 UTC DEBUG1 syncThread: thread starts 2005-04-30 16:19:36 UTC DEBUG1 main: running scheduler mainloop 2005-04-30 16:19:36 UTC DEBUG4 remoteWorkerThread_1: update provider configuration 2005-04-30 16:19:36 UTC DEBUG1 remoteListenThread_1: connected to 'dbname=avbrief host=k port=5432 user=slony password=' 2005-04-30 16:19:36 UTC DEBUG2 remoteWorkerThread_1: forward confirm 2,32 received by 1 2005-04-30 16:19:38 UTC DEBUG2 syncThread: new sl_action_seq 1197 - SYNC 34 2005-04-30 16:19:38 UTC DEBUG2 localListenThread: Received event 2,34 SYNC The subscription appears to have switched: avbrief=# select * from sl_subscribe; sub_set | sub_provider | sub_receiver | sub_forward | sub_active ---------+--------------+--------------+-------------+------------ 1 | 2 | 1 | t | t but nothing short of dropping and uninstalling node 1 (k) and reinstalling/subscribing gets it replicating again. The problem is repeatable after an init cluster. What happened please? Julian Scarfe
- Previous message: [Slony1-general] 1.1 Beta
- Next message: [Slony1-general] transaction still in progress
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the Slony1-general mailing list