Fri Jul 8 12:27:15 PDT 2016
- Next message: [Slony1-general] drop node error
Hello slony group,

I'm testing now with slony1-2.2.4. I recently hit an error that effectively stops slon processing on one node (node A) when another node (node B) is dropped. It reproduces only infrequently.

As some will know, when a slon daemon becomes aware that its own node has been dropped, it responds by dropping its cluster schema. There appears to be a race condition between the node B schema drop and the (surviving) node A's receipt of the disableNode (drop node) event. If the schema drop happens first, all the remote worker threads on node A enter an error state. See the log samples below.

I resolved this the first time by deleting all the recent non-SYNC events from the sl_event tables, and more recently by simply restarting the node A slon.

Please advise whether there is an existing ticket I should add this information to, or whether I should create a new one. Thanks.

---- node 1 log ----
2016-07-08 18:06:31 UTC [30382] INFO remoteWorkerThread_999999: SYNC 5000000008 done in 0.002 seconds
2016-07-08 18:06:33 UTC [30382] INFO remoteWorkerThread_999999: SYNC 5000000009 done in 0.002 seconds
2016-07-08 18:06:33 UTC [30382] INFO remoteWorkerThread_2: SYNC 5000017869 done in 0.002 seconds
2016-07-08 18:06:33 UTC [30382] INFO remoteWorkerThread_3: SYNC 5000018148 done in 0.004 seconds
2016-07-08 18:06:45 UTC [30382] CONFIG remoteWorkerThread_2: update provider configuration
2016-07-08 18:06:45 UTC [30382] ERROR remoteWorkerThread_3: "select last_value from "_ams_cluster".sl_log_status" PGRES_FATAL_ERROR ERROR: schema "_ams_cluster" does not exist
LINE 1: select last_value from "_ams_cluster".sl_log_status
                               ^
2016-07-08 18:06:45 UTC [30382] ERROR remoteWorkerThread_3: SYNC aborted
2016-07-08 18:06:45 UTC [30382] CONFIG version for "dbname=ams host=198.18.102.45 user=ams_slony sslmode=verify-ca sslcert=/usr/local/akamai/.ams_certs/complete-ams_slony.crt sslkey=/usr/local/akamai/.ams_certs/ams_slony.private_key sslrootcert=/usr/local/akamai/etc/ssl_ca/canonical_ca_roots.pem" is 90119
2016-07-08 18:06:45 UTC [30382] ERROR remoteWorkerThread_2: "select last_value from "_ams_cluster".sl_log_status" PGRES_FATAL_ERROR ERROR: schema "_ams_cluster" does not exist
LINE 1: select last_value from "_ams_cluster".sl_log_status
                               ^
2016-07-08 18:06:45 UTC [30382] ERROR remoteWorkerThread_2: SYNC aborted
2016-07-08 18:06:45 UTC [30382] ERROR remoteListenThread_999999: "select ev_origin, ev_seqno, ev_timestamp, ev_snapshot, "pg_catalog".txid_snapshot_xmin(ev_snapshot), "pg_catalog".txid_snapshot_xmax(ev_snapshot), ev_type, ev_data1, ev_data2, ev_data3, ev_data4, ev_data5, ev_data6, ev_data7, ev_data8 from "_ams_cluster".sl_event e where (e.ev_origin = '999999' and e.ev_seqno > '5000000009') or (e.ev_origin = '2' and e.ev_seqno > '5000017870') or (e.ev_origin = '3' and e.ev_seqno > '5000018151') order by e.ev_origin, e.ev_seqno limit 40" - ERROR: schema "_ams_cluster" does not exist
LINE 1: ...v_data5, ev_data6, ev_data7, ev_data8 from "_ams_clus...
                                                      ^
2016-07-08 18:06:55 UTC [30382] ERROR remoteWorkerThread_3: "start transaction; set enable_seqscan = off; set enable_indexscan = on; " PGRES_FATAL_ERROR ERROR: current transaction is aborted, commands ignored until end of transaction block
2016-07-08 18:06:55 UTC [30382] ERROR remoteWorkerThread_3: SYNC aborted
2016-07-08 18:06:55 UTC [30382] ERROR remoteWorkerThread_2: "start transaction; set enable_seqscan = off; set enable_indexscan = on; " PGRES_FATAL_ERROR ERROR: current transaction is aborted, commands ignored until end of transaction block
2016-07-08 18:06:55 UTC [30382] ERROR remoteWorkerThread_2: SYNC aborted
----

---- node 999999 log ----
2016-07-08 18:06:44 UTC [558] INFO remoteWorkerThread_1: SYNC 5000081216 done in 0.004 seconds
2016-07-08 18:06:44 UTC [558] INFO remoteWorkerThread_2: SYNC 5000017870 done in 0.004 seconds
2016-07-08 18:06:44 UTC [558] INFO remoteWorkerThread_3: SYNC 5000018150 done in 0.004 seconds
2016-07-08 18:06:44 UTC [558] INFO remoteWorkerThread_1: SYNC 5000081217 done in 0.003 seconds
2016-07-08 18:06:44 UTC [558] WARN remoteWorkerThread_3: got DROP NODE for local node ID
NOTICE: Slony-I: Please drop schema "_ams_cluster"
NOTICE: drop cascades to 171 other objects
DETAIL: drop cascades to table _ams_cluster.sl_node
drop cascades to table _ams_cluster.sl_nodelock
drop cascades to table _ams_cluster.sl_set
drop cascades to table _ams_cluster.sl_setsync
drop cascades to table _ams_cluster.sl_table
----

Tom ☺
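
For anyone trying to spot the same condition on a surviving node, the failure signature in the node 1 log above could be detected with a small log scan. This is only a rough sketch, not part of Slony-I: the function name `find_missing_schema_errors` and the regular expression are hypothetical, and it assumes the slon log format matches the samples in this message (an `ERROR` line naming a remoteWorkerThread or remoteListenThread, followed on the same line by `schema "..." does not exist`).

```python
import re

# Assumed line shape, taken from the log samples in this message:
#   <timestamp> [pid] ERROR remoteWorkerThread_3: "..." ... schema "_ams_cluster" does not exist
ERROR_LINE = re.compile(
    r'ERROR\s+(?P<thread>remote(?:Worker|Listen)Thread_\d+):.*?'
    r'schema "(?P<schema>[^"]+)" does not exist'
)


def find_missing_schema_errors(log_lines):
    """Return {thread_name: schema_name} for slon threads that reported
    a missing cluster schema (hypothetical helper, not a Slony-I API)."""
    hits = {}
    for line in log_lines:
        match = ERROR_LINE.search(line)
        if match:
            # Keep the first schema name seen for each thread.
            hits.setdefault(match.group("thread"), match.group("schema"))
    return hits
```

A non-empty result on node A shortly after a drop node would suggest the slon there has hit this state and may need the restart described above.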