Tue Jul 12 05:23:10 PDT 2016
On 07/08/2016 03:27 PM, Tignor, Tom wrote:
> Hello slony group,
>
> I’m testing now with slony1-2.2.4. I have just recently produced an
> error which effectively stops slon processing on some node A due to
> some node B being dropped. The event reproduces only infrequently.
> As some will know, a slon daemon for a given node which becomes aware
> its node has been dropped will respond by dropping its cluster schema.
> There appears to be a race condition between the node B schema drop
> and the (surviving) node A receipt of the disableNode (drop node)
> event. If the former occurs before the latter, all the remote worker
> threads on node A enter an error state. See the log samples below.
> I resolved this the first time by deleting all the recent non-SYNC
> events from the sl_event tables, and more recently with a simple
> node A slon restart.
>
> Please advise if there is any ticket I should provide this info to,
> or if I should create a new one. Thanks.

The Slony bug tracker is at http://bugs.slony.info/bugzilla/

I assume you're saying that when the slon restarts it keeps hitting
this error and keeps restarting.
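For anyone hitting the same state, the cleanup Tom describes would look
roughly like the sketch below. This is an untested sketch, not a tested
procedure: the cluster name "_ams_cluster" and node id 999999 are taken
from the logs, the assumption that the stale events all originate from
the dropped node is mine, and sl_event should be backed up (ideally with
the slons stopped) before touching it. Tom mentions sl_event tables,
plural, so this would presumably need repeating on each surviving node.

---- sketch: clearing stale non-SYNC events (untested, assumptions above) ----
-- ev_origin, ev_seqno, ev_type and ev_timestamp are the sl_event
-- columns already visible in the listener query in the node 1 log below.
begin;

-- First inspect what would be removed:
select ev_origin, ev_seqno, ev_type, ev_timestamp
  from "_ams_cluster".sl_event
 where ev_origin = 999999
   and ev_type <> 'SYNC'
 order by ev_seqno;

-- If those are the offending events, remove them:
delete from "_ams_cluster".sl_event
 where ev_origin = 999999
   and ev_type <> 'SYNC';

commit;
----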
> ---- node 1 log ----
>
> 2016-07-08 18:06:31 UTC [30382] INFO remoteWorkerThread_999999: SYNC 5000000008 done in 0.002 seconds
> 2016-07-08 18:06:33 UTC [30382] INFO remoteWorkerThread_999999: SYNC 5000000009 done in 0.002 seconds
> 2016-07-08 18:06:33 UTC [30382] INFO remoteWorkerThread_2: SYNC 5000017869 done in 0.002 seconds
> 2016-07-08 18:06:33 UTC [30382] INFO remoteWorkerThread_3: SYNC 5000018148 done in 0.004 seconds
> 2016-07-08 18:06:45 UTC [30382] CONFIG remoteWorkerThread_2: update provider configuration
> 2016-07-08 18:06:45 UTC [30382] ERROR remoteWorkerThread_3: "select last_value from "_ams_cluster".sl_log_status" PGRES_FATAL_ERROR ERROR: schema "_ams_cluster" does not exist
> LINE 1: select last_value from "_ams_cluster".sl_log_status
>         ^
> 2016-07-08 18:06:45 UTC [30382] ERROR remoteWorkerThread_3: SYNC aborted
> 2016-07-08 18:06:45 UTC [30382] CONFIG version for "dbname=ams host=198.18.102.45 user=ams_slony sslmode=verify-ca sslcert=/usr/local/akamai/.ams_certs/complete-ams_slony.crt sslkey=/usr/local/akamai/.ams_certs/ams_slony.private_key sslrootcert=/usr/local/akamai/etc/ssl_ca/canonical_ca_roots.pem" is 90119
> 2016-07-08 18:06:45 UTC [30382] ERROR remoteWorkerThread_2: "select last_value from "_ams_cluster".sl_log_status" PGRES_FATAL_ERROR ERROR: schema "_ams_cluster" does not exist
> LINE 1: select last_value from "_ams_cluster".sl_log_status
>         ^
> 2016-07-08 18:06:45 UTC [30382] ERROR remoteWorkerThread_2: SYNC aborted
> 2016-07-08 18:06:45 UTC [30382] ERROR remoteListenThread_999999: "select ev_origin, ev_seqno, ev_timestamp, ev_snapshot, "pg_catalog".txid_snapshot_xmin(ev_snapshot), "pg_catalog".txid_snapshot_xmax(ev_snapshot), ev_type, ev_data1, ev_data2, ev_data3, ev_data4, ev_data5, ev_data6, ev_data7, ev_data8 from "_ams_cluster".sl_event e where (e.ev_origin = '999999' and e.ev_seqno > '5000000009') or (e.ev_origin = '2' and e.ev_seqno > '5000017870') or (e.ev_origin = '3' and e.ev_seqno > '5000018151') order by e.ev_origin, e.ev_seqno limit 40" - ERROR: schema "_ams_cluster" does not exist
> LINE 1: ...v_data5, ev_data6, ev_data7, ev_data8 from "_ams_clus...
>         ^
> 2016-07-08 18:06:55 UTC [30382] ERROR remoteWorkerThread_3: "start transaction; set enable_seqscan = off; set enable_indexscan = on; " PGRES_FATAL_ERROR ERROR: current transaction is aborted, commands ignored until end of transaction block
> 2016-07-08 18:06:55 UTC [30382] ERROR remoteWorkerThread_3: SYNC aborted
> 2016-07-08 18:06:55 UTC [30382] ERROR remoteWorkerThread_2: "start transaction; set enable_seqscan = off; set enable_indexscan = on; " PGRES_FATAL_ERROR ERROR: current transaction is aborted, commands ignored until end of transaction block
> 2016-07-08 18:06:55 UTC [30382] ERROR remoteWorkerThread_2: SYNC aborted
> ----
>
> ---- node 999999 log ----
>
> 2016-07-08 18:06:44 UTC [558] INFO remoteWorkerThread_1: SYNC 5000081216 done in 0.004 seconds
> 2016-07-08 18:06:44 UTC [558] INFO remoteWorkerThread_2: SYNC 5000017870 done in 0.004 seconds
> 2016-07-08 18:06:44 UTC [558] INFO remoteWorkerThread_3: SYNC 5000018150 done in 0.004 seconds
> 2016-07-08 18:06:44 UTC [558] INFO remoteWorkerThread_1: SYNC 5000081217 done in 0.003 seconds
> 2016-07-08 18:06:44 UTC [558] WARN remoteWorkerThread_3: got DROP NODE for local node ID
> NOTICE: Slony-I: Please drop schema "_ams_cluster"
> NOTICE: drop cascades to 171 other objects
> DETAIL: drop cascades to table _ams_cluster.sl_node
> drop cascades to table _ams_cluster.sl_nodelock
> drop cascades to table _ams_cluster.sl_set
> drop cascades to table _ams_cluster.sl_setsync
> drop cascades to table _ams_cluster.sl_table
> ----
>
> Tom J
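The timing in the logs fits the race: the node 999999 log shows the
schema drop at 18:06:44, and node 1's worker and listener threads begin
failing one second later at 18:06:45. While diagnosing, which side won
the race on a given node can be checked directly in the catalog with
plain PostgreSQL (nothing Slony-specific here):

---- sketch: check for the cluster schema on a node ----
-- One row while "_ams_cluster" is present; zero rows once the
-- dropped node's slon has removed it.
select nspname
  from pg_catalog.pg_namespace
 where nspname = '_ams_cluster';
----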