Thu Jul 5 14:52:34 PDT 2007
- Previous message: [Slony1-general] New master failing; still trying to see old master?
- Next message: [Slony1-general] New master failing; still trying to see old master?
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On 7/5/2007 5:22 PM, Jerry Sievers wrote:
> Selecting all non-sync events from each of the 3 nodes ordered by
> ev_seqno.
I think I see what's going on here ... maybe.
This is probably a pilot error combined with a copy/paste mistake that
has been sitting in slon for ages.
The copy/paste mistake is:
the error message in disableNode() says "enableNode(): ...".
I claim ownership of that one.
The pilot error is:
the dropnode() was issued multiple times against different nodes
without giving them time to propagate (in this case nodes 1 and 4).
They are events (1,2225224) and (4,1863698).
Nice screwup. However, since none of the 3 nodes has node 2 in its
sl_node table any more (at least from what I see they should not), it is
safe to run:
DELETE FROM sl_event WHERE ev_origin = 4 and ev_seqno = 1863698;
DELETE FROM sl_event WHERE ev_origin = 1 and ev_seqno = 2225224;
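
For anyone who wants to guard that cleanup, here is a minimal sketch of the same two deletes with the precondition checked first. It is only an illustration under assumptions: the function name delete_stuck_drop_events() is made up, an in-memory SQLite database stands in for the real Slony schema, and only the columns visible in this thread (no_id, ev_origin, ev_seqno, ev_type, ev_data1) are modelled. Against a real cluster the tables live in the cluster's own schema and the driver's placeholder syntax may differ.

```python
import sqlite3

# The two stuck events named above, as (ev_origin, ev_seqno) pairs.
STUCK_EVENTS = [(4, 1863698), (1, 2225224)]
DROPPED_NODE = 2


def delete_stuck_drop_events(conn, events=STUCK_EVENTS, node_id=DROPPED_NODE):
    """Hypothetical helper: delete the given events only if node_id is
    already gone from sl_node, and only rows that really are DROP_NODE
    events for node_id. Returns the number of rows deleted."""
    cur = conn.cursor()

    # Precondition from the message: the dropped node must no longer be listed.
    cur.execute("SELECT count(*) FROM sl_node WHERE no_id = ?", (node_id,))
    if cur.fetchone()[0] != 0:
        raise RuntimeError(f"node {node_id} is still listed in sl_node")

    deleted = 0
    try:
        for origin, seqno in events:
            # The extra ev_type/ev_data1 conditions keep a mistyped
            # sequence number from removing an unrelated event.
            cur.execute(
                "DELETE FROM sl_event "
                "WHERE ev_origin = ? AND ev_seqno = ? "
                "AND ev_type = 'DROP_NODE' AND ev_data1 = ?",
                (origin, seqno, str(node_id)),
            )
            deleted += cur.rowcount
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    return deleted


def demo():
    """Build a toy copy of the tables (assumed layout) and run the cleanup."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sl_node (no_id INTEGER PRIMARY KEY);
        CREATE TABLE sl_event (
            ev_origin INTEGER, ev_seqno INTEGER,
            ev_type TEXT, ev_data1 TEXT
        );
        INSERT INTO sl_node VALUES (1), (3), (4);
        INSERT INTO sl_event VALUES
            (1, 2225133, 'ACCEPT_SET', '2'),
            (1, 2225224, 'DROP_NODE', '2'),
            (4, 1863698, 'DROP_NODE', '2');
        """
    )
    deleted = delete_stuck_drop_events(conn)
    remaining = conn.execute(
        "SELECT ev_origin, ev_seqno FROM sl_event"
    ).fetchall()
    return deleted, remaining


if __name__ == "__main__":
    print(demo())
```

In the toy data the ACCEPT_SET event is left alone and only the two DROP_NODE rows go away. On a real installation the same check-then-delete would have to be run on each node that still holds the events.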
Jan
>
> Thanks!
>
>
> Pager usage is off.
> Expanded display is on.
> -[ RECORD 1 ]+------------------------------------------------------------------------
> ev_origin | 1
> ev_seqno | 2225126
> ev_timestamp | 05-JUL-07 14:57:16.056801
> ev_minxid | 884391402
> ev_maxxid | 884391412
> ev_xip | '884391409','884391411'
> ev_type | ACCEPT_SET
> ev_data1 | 1
> ev_data2 | 2
> ev_data3 | 1
> ev_data4 |
> ev_data5 |
> ev_data6 |
> ev_data7 |
> ev_data8 |
> -[ RECORD 2 ]+------------------------------------------------------------------------
> ev_origin | 1
> ev_seqno | 2225133
> ev_timestamp | 05-JUL-07 14:58:26.439281
> ev_minxid | 884391608
> ev_maxxid | 884391609
> ev_xip |
> ev_type | ACCEPT_SET
> ev_data1 | 2
> ev_data2 | 2
> ev_data3 | 1
> ev_data4 |
> ev_data5 |
> ev_data6 |
> ev_data7 |
> ev_data8 |
> -[ RECORD 3 ]+------------------------------------------------------------------------
> ev_origin | 1
> ev_seqno | 2225224
> ev_timestamp | 05-JUL-07 15:49:54.253471
> ev_minxid | 884528335
> ev_maxxid | 884697167
> ev_xip | '884528335','884697160','884697162','884697161','884587782','884697166'
> ev_type | DROP_NODE
> ev_data1 | 2
> ev_data2 |
> ev_data3 |
> ev_data4 |
> ev_data5 |
> ev_data6 |
> ev_data7 |
> ev_data8 |
>
> Pager usage is off.
> Expanded display is on.
> -[ RECORD 1 ]+--------------------------
> ev_origin | 4
> ev_seqno | 1863698
> ev_timestamp | 05-JUL-07 15:52:40.518681
> ev_minxid | 385609088
> ev_maxxid | 385609089
> ev_xip |
> ev_type | DROP_NODE
> ev_data1 | 2
> ev_data2 |
> ev_data3 |
> ev_data4 |
> ev_data5 |
> ev_data6 |
> ev_data7 |
> ev_data8 |
>
> Pager usage is off.
> Expanded display is on.
> -[ RECORD 1 ]+--------------------------
> ev_origin | 4
> ev_seqno | 1863698
> ev_timestamp | 05-JUL-07 15:52:40.518681
> ev_minxid | 385609088
> ev_maxxid | 385609089
> ev_xip |
> ev_type | DROP_NODE
> ev_data1 | 2
> ev_data2 |
> ev_data3 |
> ev_data4 |
> ev_data5 |
> ev_data6 |
> ev_data7 |
> ev_data8 |
>
>
>
> Jan Wieck <JanWieck at Yahoo.com> writes:
>
>> On 7/5/2007 3:03 PM, Jerry Sievers wrote:
>>
>> > Crisis today. Complete power failure leaves a corrupt table on old
>> > master. I did moveset() and dropnode() to reconfigure the cluster.
>> > The old master was node 2. New master is node 1. There are now just
>> > 2 slaves, 3 and 4.
>>
>> Another question: Did you wait for the moveset() to propagate before
>> you dropped node 2?
>>
>>
>> Jan
>>
>> > For some reason however, when I try to fire up the slon on the
>> > master, it complains that node #2 does not exist right after
>> > reporting having init'd node 4. I have no clue what's going wrong
>> > here and hope not to have to undo and reconfig the cluster from
>> > scratch. These DBs are too large now for easy subscription during
>> > live processing. Any help much appreciated.
>> > -----------------------------------------
>> > 2007-07-05 18:19:18 GMT CONFIG main: edb-replication version 1.1.5 starting up
>> > 2007-07-05 18:19:19 GMT CONFIG main: local node id = 1
>> > 2007-07-05 18:19:19 GMT CONFIG main: launching sched_start_mainloop
>> > 2007-07-05 18:19:19 GMT CONFIG main: loading current cluster configuration
>> > 2007-07-05 18:19:19 GMT CONFIG storeNode: no_id=3 no_comment='slave node 3'
>> > 2007-07-05 18:19:19 GMT CONFIG storeNode: no_id=4 no_comment='slave node 4'
>> > 2007-07-05 18:19:19 GMT CONFIG storePath: pa_server=3 pa_client=1 pa_conninfo="dbname=rt3_01 host=192.168.30.172 user=slonik password=foo.j1MiTikGop0rytQuedPid8 port=5432" pa_connretry=5
>> > 2007-07-05 18:19:19 GMT CONFIG storePath: pa_server=4 pa_client=1 pa_conninfo="dbname=rt3_01 host=192.168.30.173 user=slonik password=foo.j1MiTikGop0rytQuedPid8 port=5432" pa_connretry=5
>> > 2007-07-05 18:19:19 GMT CONFIG storeListen: li_origin=3 li_receiver=1 li_provider=3
>> > 2007-07-05 18:19:19 GMT CONFIG storeListen: li_origin=4 li_receiver=1 li_provider=4
>> > 2007-07-05 18:19:19 GMT CONFIG storeSet: set_id=1 set_origin=1 set_comment='RT3/VCASE replication set'
>> > 2007-07-05 18:19:19 GMT CONFIG storeSet: set_id=2 set_origin=1 set_comment='new set for adding tables'
>> > 2007-07-05 18:19:19 GMT CONFIG main: configuration complete - starting threads
>> > NOTICE: Slony-I: cleanup stale sl_nodelock entry for pid=12520
>> > 2007-07-05 18:19:19 GMT CONFIG enableNode: no_id=3
>> > 2007-07-05 18:19:19 GMT CONFIG enableNode: no_id=4
>> > 2007-07-05 18:19:19 GMT FATAL enableNode: unknown node ID 2
>> > 2007-07-05 18:19:19 GMT INFO remoteListenThread_4: disconnecting from 'dbname=rt3_01 host=192.168.30.173 user=slonik password=foo.j1MiTikGop0rytQuedPid8 port=5432'
>> > 2007-07-05 18:19:20 GMT INFO remoteListenThread_3: disconnecting from 'dbname=rt3_01 host=192.168.30.172 user=slonik password=foo.j1MiTikGop0rytQuedPid8 port=5432'
>> >
>>
>>
>> --
>> #======================================================================#
>> # It's easier to get forgiveness for being wrong than for being right. #
>> # Let's break this rule - forgive me. #
>> #================================================== JanWieck at Yahoo.com #
>>
>
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck at Yahoo.com #