elein
Tue Aug 23 02:35:44 PDT 2005
Replying to my own message to add the error message I get
when dropping the dead node (node10) from node30. It follows
the quoted text.

On Mon, Aug 22, 2005 at 05:12:19PM -0700, elein wrote:
> Slony 1.1.  Three nodes. 10 set(1) => 20 => 30.
> 
> I ran failover from node10 to node20.
> 
> On node30, the origin of the set was changed
> from 10 to 20; however, drop node10 failed
> because of the row in sl_setsync that still
> references node 10.
> 
> This causes slon on node30 to quit and the cluster to
> become unstable, which in turn prevents putting
> node10 back into the mix.
> 
> Please tell me I'm not the first one to run into
> this...
> 
> The only clean workaround I can see is to drop
> node30, re-add it, and then re-add node10.  This
> leaves us without a backup during the downtime.
> 
> 
> This is what is in some of the tables for node20:
> 
> gb2=# select * from sl_node;
>  no_id | no_active |         no_comment         | no_spool
> -------+-----------+----------------------------+----------
>     20 | t         | Node 20 - gb2 at localhost | f
>     30 | t         | Node 30 - gb3 at localhost | f
> (2 rows)
> 
> gb2=# select * from sl_set;
>  set_id | set_origin | set_locked |     set_comment
> --------+------------+------------+----------------------
>       1 |         20 |            | Set 1 for gb_cluster
> (1 row)
> 
> gb2=# select * from sl_setsync;
>  ssy_setid | ssy_origin | ssy_seqno | ssy_minxid | ssy_maxxid | ssy_xip | ssy_action_list
> -----------+------------+-----------+------------+------------+---------+-----------------
> (0 rows)
> 
> This is what I have for node30:
> 
> gb3=# select * from sl_node;
>  no_id | no_active |         no_comment         | no_spool
> -------+-----------+----------------------------+----------
>     10 | t         | Node 10 - gb at localhost  | f
>     20 | t         | Node 20 - gb2 at localhost | f
>     30 | t         | Node 30 - gb3 at localhost | f
> (3 rows)
> 
> gb3=# select * from sl_set;
>  set_id | set_origin | set_locked |     set_comment
> --------+------------+------------+----------------------
>       1 |         20 |            | Set 1 for gb_cluster
> (1 row)
> 
> gb3=# select * from sl_setsync;
>  ssy_setid | ssy_origin | ssy_seqno | ssy_minxid | ssy_maxxid | ssy_xip | ssy_action_list
> -----------+------------+-----------+------------+------------+---------+-----------------
>          1 |         10 |       235 | 1290260    | 1290261    |         |
> (1 row)
> 
> frustrated,
> --elein

drop_node10.slnk:5: PGRES_FATAL_ERROR select "_gb_cluster".dropNode(10);  - ERROR:  update or delete on "sl_node" violates foreign key constraint "ssy_origin-no_id-ref" on "sl_setsync"
DETAIL:  Key (no_id)=(10) is still referenced from table "sl_setsync".
CONTEXT:  SQL statement "delete from "_gb_cluster".sl_node where no_id =  $1 "
PL/pgSQL function "dropnode_int" line 29 at SQL statement
SQL statement "SELECT  "_gb_cluster".dropNode_int( $1 )"
PL/pgSQL function "dropnode" line 47 at perform
drop_node10.slnk:7: Failed to drop node 10 from cluster
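
For anyone trying to picture the failure, here is a toy reproduction. It
is only a sketch under loud assumptions: SQLite stands in for PostgreSQL,
and the two tables are cut-down stand-ins that merely borrow the names
sl_node and sl_setsync. This is not Slony's schema or code, and the helper
names try_drop_node and referencing_rows are made up for illustration. It
shows the same shape of error as above: the delete from sl_node is refused
while an sl_setsync row still has ssy_origin = 10.

```python
import sqlite3

# Toy stand-ins for the two catalog tables involved. The real sl_node and
# sl_setsync have more columns and live in PostgreSQL, not SQLite.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.execute("CREATE TABLE sl_node (no_id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE sl_setsync ("
    " ssy_setid INTEGER PRIMARY KEY,"
    " ssy_origin INTEGER NOT NULL REFERENCES sl_node (no_id),"
    " ssy_seqno INTEGER)"
)
conn.executemany("INSERT INTO sl_node VALUES (?)", [(10,), (20,), (30,)])
# State seen on node30 after the failover: the sync row still names node 10.
conn.execute("INSERT INTO sl_setsync VALUES (1, 10, 235)")
conn.commit()


def try_drop_node(node_id):
    """Return True if the sl_node row could be deleted, False on an FK violation."""
    try:
        conn.execute("DELETE FROM sl_node WHERE no_id = ?", (node_id,))
    except sqlite3.IntegrityError:
        return False
    conn.commit()
    return True


def referencing_rows(node_id):
    """List the sl_setsync rows that still point at the given node."""
    return conn.execute(
        "SELECT ssy_setid, ssy_origin, ssy_seqno FROM sl_setsync"
        " WHERE ssy_origin = ?",
        (node_id,),
    ).fetchall()


dropped = try_drop_node(10)
stale = referencing_rows(10)
print("drop succeeded:", dropped)
print("rows still referencing node 10:", stale)
```

On the real cluster the equivalent check is the sl_setsync query already
shown for node30 in the quoted message.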


