Tue Jul 12 05:23:10 PDT 2016
On 07/08/2016 03:27 PM, Tignor, Tom wrote:
> Hello slony group,
>
> I’m testing now with slony1-2.2.4. I have just recently produced an
> error which effectively stops slon processing on some node A due to
> some node B being dropped. The event reproduces only infrequently.
> As some will know, a slon daemon for a given node which becomes aware
> its node has been dropped will respond by dropping its cluster schema.
> There appears to be a race condition between the node B schema drop
> and the (surviving) node A receipt of the disableNode (drop node)
> event. If the former occurs before the latter, all the remote worker
> threads on node A enter an error state. See the log samples below.
> I resolved this the first time by deleting all the recent non-SYNC
> events from the sl_event tables, and more recently with a simple
> node A slon restart.
>
> Please advise if there is any ticket I should provide this info to,
> or if I should create a new one. Thanks.

The Slony bug tracker is at http://bugs.slony.info/bugzilla/

I assume you're saying that when the slon restarts it keeps hitting
this error and keeps restarting.
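For anyone hitting the same state, the cleanup Tom describes would look
roughly like the sketch below. This is an untested sketch, not a tested
procedure: the cluster name "_ams_cluster" and node id 999999 are taken
from the logs, the assumption that the stale events all originate from
the dropped node is mine, and sl_event should be backed up (ideally with
the slons stopped) before touching it. Tom mentions sl_event tables,
plural, so this would presumably need repeating on each surviving node.

---- sketch: clearing stale non-SYNC events (untested, assumptions above) ----
-- ev_origin, ev_seqno, ev_type and ev_timestamp are the sl_event
-- columns already visible in the listener query in the node 1 log below.
begin;

-- First inspect what would be removed:
select ev_origin, ev_seqno, ev_type, ev_timestamp
  from "_ams_cluster".sl_event
 where ev_origin = 999999
   and ev_type <> 'SYNC'
 order by ev_seqno;

-- If those are the offending events, remove them:
delete from "_ams_cluster".sl_event
 where ev_origin = 999999
   and ev_type <> 'SYNC';

commit;
----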
> ---- node 1 log ----
>
> 2016-07-08 18:06:31 UTC [30382] INFO remoteWorkerThread_999999: SYNC 5000000008 done in 0.002 seconds
> 2016-07-08 18:06:33 UTC [30382] INFO remoteWorkerThread_999999: SYNC 5000000009 done in 0.002 seconds
> 2016-07-08 18:06:33 UTC [30382] INFO remoteWorkerThread_2: SYNC 5000017869 done in 0.002 seconds
> 2016-07-08 18:06:33 UTC [30382] INFO remoteWorkerThread_3: SYNC 5000018148 done in 0.004 seconds
> 2016-07-08 18:06:45 UTC [30382] CONFIG remoteWorkerThread_2: update provider configuration
> 2016-07-08 18:06:45 UTC [30382] ERROR remoteWorkerThread_3: "select last_value from "_ams_cluster".sl_log_status" PGRES_FATAL_ERROR ERROR: schema "_ams_cluster" does not exist
> LINE 1: select last_value from "_ams_cluster".sl_log_status
>         ^
> 2016-07-08 18:06:45 UTC [30382] ERROR remoteWorkerThread_3: SYNC aborted
> 2016-07-08 18:06:45 UTC [30382] CONFIG version for "dbname=ams host=198.18.102.45 user=ams_slony sslmode=verify-ca sslcert=/usr/local/akamai/.ams_certs/complete-ams_slony.crt sslkey=/usr/local/akamai/.ams_certs/ams_slony.private_key sslrootcert=/usr/local/akamai/etc/ssl_ca/canonical_ca_roots.pem" is 90119
> 2016-07-08 18:06:45 UTC [30382] ERROR remoteWorkerThread_2: "select last_value from "_ams_cluster".sl_log_status" PGRES_FATAL_ERROR ERROR: schema "_ams_cluster" does not exist
> LINE 1: select last_value from "_ams_cluster".sl_log_status
>         ^
> 2016-07-08 18:06:45 UTC [30382] ERROR remoteWorkerThread_2: SYNC aborted
> 2016-07-08 18:06:45 UTC [30382] ERROR remoteListenThread_999999: "select ev_origin, ev_seqno, ev_timestamp, ev_snapshot, "pg_catalog".txid_snapshot_xmin(ev_snapshot), "pg_catalog".txid_snapshot_xmax(ev_snapshot), ev_type, ev_data1, ev_data2, ev_data3, ev_data4, ev_data5, ev_data6, ev_data7, ev_data8 from "_ams_cluster".sl_event e where (e.ev_origin = '999999' and e.ev_seqno > '5000000009') or (e.ev_origin = '2' and e.ev_seqno > '5000017870') or (e.ev_origin = '3' and e.ev_seqno > '5000018151') order by e.ev_origin, e.ev_seqno limit 40" - ERROR: schema "_ams_cluster" does not exist
> LINE 1: ...v_data5, ev_data6, ev_data7, ev_data8 from "_ams_clus...
>         ^
> 2016-07-08 18:06:55 UTC [30382] ERROR remoteWorkerThread_3: "start transaction; set enable_seqscan = off; set enable_indexscan = on; " PGRES_FATAL_ERROR ERROR: current transaction is aborted, commands ignored until end of transaction block
> 2016-07-08 18:06:55 UTC [30382] ERROR remoteWorkerThread_3: SYNC aborted
> 2016-07-08 18:06:55 UTC [30382] ERROR remoteWorkerThread_2: "start transaction; set enable_seqscan = off; set enable_indexscan = on; " PGRES_FATAL_ERROR ERROR: current transaction is aborted, commands ignored until end of transaction block
> 2016-07-08 18:06:55 UTC [30382] ERROR remoteWorkerThread_2: SYNC aborted
> ----
>
> ---- node 999999 log ----
>
> 2016-07-08 18:06:44 UTC [558] INFO remoteWorkerThread_1: SYNC 5000081216 done in 0.004 seconds
> 2016-07-08 18:06:44 UTC [558] INFO remoteWorkerThread_2: SYNC 5000017870 done in 0.004 seconds
> 2016-07-08 18:06:44 UTC [558] INFO remoteWorkerThread_3: SYNC 5000018150 done in 0.004 seconds
> 2016-07-08 18:06:44 UTC [558] INFO remoteWorkerThread_1: SYNC 5000081217 done in 0.003 seconds
> 2016-07-08 18:06:44 UTC [558] WARN remoteWorkerThread_3: got DROP NODE for local node ID
> NOTICE: Slony-I: Please drop schema "_ams_cluster"
> NOTICE: drop cascades to 171 other objects
> DETAIL: drop cascades to table _ams_cluster.sl_node
> drop cascades to table _ams_cluster.sl_nodelock
> drop cascades to table _ams_cluster.sl_set
> drop cascades to table _ams_cluster.sl_setsync
> drop cascades to table _ams_cluster.sl_table
> ----
>
> Tom J
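The timing in the logs fits the race: the node 999999 log shows the
schema drop at 18:06:44, and node 1's worker and listener threads begin
failing one second later at 18:06:45. While diagnosing, which side won
the race on a given node can be checked directly in the catalog with
plain PostgreSQL (nothing Slony-specific here):

---- sketch: check for the cluster schema on a node ----
-- One row while "_ams_cluster" is present; zero rows once the
-- dropped node's slon has removed it.
select nspname
  from pg_catalog.pg_namespace
 where nspname = '_ams_cluster';
----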