Thu Jul 5 14:15:01 PDT 2007
- Previous message: [Slony1-general] New master failing; still trying to see old master?
- Next message: [Slony1-general] New master failing; still trying to see old master?
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hi Jan; here's a quick look at what sorts of events are in sl_event.

Pager usage is off.
 ev_origin | ev_seqno | ev_timestamp | ev_minxid | ev_maxxid | ev_xip | ev_type | ev_data1 | ev_data2 | ev_data3 | ev_data4 | ev_data5 | ev_data6 | ev_data7 | ev_data8
-----------+----------+--------------+-----------+-----------+--------+---------+----------+----------+----------+----------+----------+----------+----------+----------
(0 rows)

Pager usage is off.
 ev_origin | ev_seqno | ev_timestamp | ev_minxid | ev_maxxid | ev_xip | ev_type | ev_data1 | ev_data2 | ev_data3 | ev_data4 | ev_data5 | ev_data6 | ev_data7 | ev_data8
-----------+----------+--------------+-----------+-----------+--------+---------+----------+----------+----------+----------+----------+----------+----------+----------
(0 rows)

Pager usage is off.
 ev_origin | ev_seqno | ev_timestamp | ev_minxid | ev_maxxid | ev_xip | ev_type | ev_data1 | ev_data2 | ev_data3 | ev_data4 | ev_data5 | ev_data6 | ev_data7 | ev_data8
-----------+----------+--------------+-----------+-----------+--------+---------+----------+----------+----------+----------+----------+----------+----------+----------
(0 rows)

Jan Wieck <JanWieck at Yahoo.com> writes:

> On 7/5/2007 4:34 PM, Jerry Sievers wrote:
>
> > select * from sl_status on the three nodes still configured.
>
> Apparently node 1 didn't receive any events from 3 or 4 for over 5
> hours. Well, what does
>
>     select * from sl_event where ev_type = 'ENABLE_NODE';
>
> give you on all 3 nodes?
>
> Jan
>
> > Please advise. Pager usage is off.
> > Expanded display is on.
> > -[ RECORD 1 ]-------------+-----------------------------
> > st_origin                 | 1
> > st_received               | 3
> > st_last_event             | 2225235
> > st_last_event_ts          | 05-JUL-07 15:54:54.810343
> > st_last_received          | 2225131
> > st_last_received_ts       | 05-JUL-07 15:16:23.496708
> > st_last_received_event_ts | 05-JUL-07 14:58:07.240334
> > st_lag_num_events         | 104
> > st_lag_time               | @ 5 hours 21 mins 45.53 secs
> > -[ RECORD 2 ]-------------+-----------------------------
> > st_origin                 | 1
> > st_received               | 4
> > st_last_event             | 2225235
> > st_last_event_ts          | 05-JUL-07 15:54:54.810343
> > st_last_received          | 2225131
> > st_last_received_ts       | 05-JUL-07 15:14:07.409965
> > st_last_received_event_ts | 05-JUL-07 14:58:07.240334
> > st_lag_num_events         | 104
> > st_lag_time               | @ 5 hours 21 mins 45.53 secs
> >
> > Pager usage is off.
> > Expanded display is on.
> >
> > -[ RECORD 1 ]-------------+-----------------------------
> > st_origin                 | 3
> > st_received               | 4
> > st_last_event             | 1863901
> > st_last_event_ts          | 05-JUL-07 18:21:49.29024
> > st_last_received          | 1863896
> > st_last_received_ts       | 05-JUL-07 18:21:04.101713
> > st_last_received_event_ts | 05-JUL-07 18:20:59.06034
> > st_lag_num_events         | 5
> > st_lag_time               | @ 2 hours 2 mins 21.48 secs
> > -[ RECORD 2 ]-------------+-----------------------------
> > st_origin                 | 3
> > st_received               | 1
> > st_last_event             | 1863901
> > st_last_event_ts          | 05-JUL-07 18:21:49.29024
> > st_last_received          | 1862809
> > st_last_received_ts       | 05-JUL-07 14:57:21.461858
> > st_last_received_event_ts | 05-JUL-07 15:00:46.848899
> > st_lag_num_events         | 1092
> > st_lag_time               | @ 5 hours 22 mins 33.69 secs
> >
> > Pager usage is off.
> > Expanded display is on.
> > -[ RECORD 1 ]-------------+-----------------------------
> > st_origin                 | 4
> > st_received               | 1
> > st_last_event             | 1864550
> > st_last_event_ts          | 05-JUL-07 18:21:01.700228
> > st_last_received          | 1863465
> > st_last_received_ts       | 05-JUL-07 14:57:21.23512
> > st_last_received_event_ts | 05-JUL-07 15:00:49.830356
> > st_lag_num_events         | 1085
> > st_lag_time               | @ 5 hours 22 mins 33.96 secs
> > -[ RECORD 2 ]-------------+-----------------------------
> > st_origin                 | 4
> > st_received               | 3
> > st_last_event             | 1864550
> > st_last_event_ts          | 05-JUL-07 18:21:01.700228
> > st_last_received          | 1864550
> > st_last_received_ts       | 05-JUL-07 18:20:56.67848
> > st_last_received_event_ts | 05-JUL-07 18:21:01.700228
> > st_lag_num_events         | 0
> > st_lag_time               | @ 2 hours 2 mins 22.09 secs
> >
> > Jan Wieck <JanWieck at Yahoo.com> writes:
> >
> >> On 7/5/2007 3:03 PM, Jerry Sievers wrote:
> >> > Crisis today. Complete power failure leaves a corrupt table on the
> >> > old master. I did moveset() and dropnode() to reconfigure the
> >> > cluster. The old master was node 2. New master is node 1. There are
> >> > now just 2 slaves, 3 and 4.
> >> > For some reason however, when I try to fire up the slon on the
> >> > master, it complains that node #2 does not exist, right after
> >> > reporting having init'd node 4. I have no clue what's going wrong
> >> > here and hope not to have to undo and reconfig the cluster from
> >> > scratch. These DBs are too large now for easy subscription during
> >> > live processing. Any help much appreciated.
> >> > -----------------------------------------
> >> > 2007-07-05 18:19:18 GMT CONFIG main: edb-replication version 1.1.5 starting up
> >> > 2007-07-05 18:19:19 GMT CONFIG main: local node id = 1
> >> > 2007-07-05 18:19:19 GMT CONFIG main: launching sched_start_mainloop
> >> > 2007-07-05 18:19:19 GMT CONFIG main: loading current cluster configuration
> >> > 2007-07-05 18:19:19 GMT CONFIG storeNode: no_id=3 no_comment='slave node 3'
> >> > 2007-07-05 18:19:19 GMT CONFIG storeNode: no_id=4 no_comment='slave node 4'
> >> > 2007-07-05 18:19:19 GMT CONFIG storePath: pa_server=3 pa_client=1 pa_conninfo="dbname=rt3_01 host=192.168.30.172 user=slonik password=foo.j1MiTikGop0rytQuedPid8 port=5432" pa_connretry=5
> >> > 2007-07-05 18:19:19 GMT CONFIG storePath: pa_server=4 pa_client=1 pa_conninfo="dbname=rt3_01 host=192.168.30.173 user=slonik password=foo.j1MiTikGop0rytQuedPid8 port=5432" pa_connretry=5
> >> > 2007-07-05 18:19:19 GMT CONFIG storeListen: li_origin=3 li_receiver=1 li_provider=3
> >> > 2007-07-05 18:19:19 GMT CONFIG storeListen: li_origin=4 li_receiver=1 li_provider=4
> >> > 2007-07-05 18:19:19 GMT CONFIG storeSet: set_id=1 set_origin=1 set_comment='RT3/VCASE replication set'
> >> > 2007-07-05 18:19:19 GMT CONFIG storeSet: set_id=2 set_origin=1 set_comment='new set for adding tables'
> >> > 2007-07-05 18:19:19 GMT CONFIG main: configuration complete - starting threads
> >> > NOTICE: Slony-I: cleanup stale sl_nodelock entry for pid=12520
> >> > 2007-07-05 18:19:19 GMT CONFIG enableNode: no_id=3
> >> > 2007-07-05 18:19:19 GMT CONFIG enableNode: no_id=4
> >> > 2007-07-05 18:19:19 GMT FATAL enableNode: unknown node ID 2
> >> > 2007-07-05 18:19:19 GMT INFO remoteListenThread_4: disconnecting from 'dbname=rt3_01 host=192.168.30.173 user=slonik password=foo.j1MiTikGop0rytQuedPid8 port=5432'
> >> > 2007-07-05 18:19:20 GMT INFO remoteListenThread_3: disconnecting from 'dbname=rt3_01 host=192.168.30.172 user=slonik password=foo.j1MiTikGop0rytQuedPid8 port=5432'
> >>
> >> It appears that there is an ENABLE_NODE event on either node 3 or 4
> >> which node 1 tries to replicate. How that could have been lurking
> >> around there forever is another question though.
> >> What is the content of sl_status for all three nodes?
> >> Also, you now might want to change the password for user slony on
> >> those servers ;-)
> >>
> >> Jan
> >>
> >> --
> >> #======================================================================#
> >> # It's easier to get forgiveness for being wrong than for being right. #
> >> # Let's break this rule - forgive me.                                  #
> >> #================================================== JanWieck at Yahoo.com #
>
> --
> #======================================================================#
> # It's easier to get forgiveness for being wrong than for being right. #
> # Let's break this rule - forgive me.                                  #
> #================================================== JanWieck at Yahoo.com #

--
-------------------------------------------------------------------------------
Jerry Sievers                                     732 365-2844 (work)
Production Database Administrator                 305 321-1144 (mobile)
WWW E-Commerce Consultant
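[Editor's note: the diagnostic suggested above — running the ENABLE_NODE query against every remaining node — can be scripted. A minimal sketch: the addresses for nodes 3 and 4 come from the slon log in this thread, while NODE1_HOST is an assumed placeholder (the new master's address is not shown), and the script only prints the psql commands for review rather than executing them against a live cluster.]

```shell
#!/bin/sh
# Sketch: build the ENABLE_NODE check for each remaining node.
# NODE1_HOST is a placeholder; substitute the new master's real address.
NODE1_HOST="node1.example"
QUERY="select * from sl_event where ev_type = 'ENABLE_NODE';"

# Print each invocation instead of running it, so the commands can be
# inspected (and the password prompt handled) before touching the cluster.
for host in "$NODE1_HOST" 192.168.30.172 192.168.30.173; do
    cmd="psql -h $host -p 5432 -U slonik -d rt3_01 -c \"$QUERY\""
    echo "$cmd"
done
```

Note the query must be run per node because sl_event rows originate on, and may linger on, any node in the cluster.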