Mon May 10 19:57:18 PDT 2010
- Previous message: [Slony1-general] Drop Node command works, but leaves crumbs behind.
- Next message: [Slony1-general] Drop Node command works, but leaves crumbs behind.
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On 5/10/2010 6:07 PM, Brian Fehrle wrote:
> Hi all,
> I've been running into a problem with dropping a node from the slony
> cluster, in which the slony system catalogs aren't getting fully cleaned
> up upon the dropping of the node.
>
> I have a three node cluster, one master and two slaves. I have a
> script that will generate the slonik command that will drop one of the
> slaves (in this case node three) from the slony cluster and it executes
> without problem. However, after performing the drop node a few dozen
> times, there have been several instances in which the data in
> _slony.sl_status still refers to a third node, and the st_lag_num_events
> climbs and climbs (since there's no node to sync with, it will never drop
> to 0).
>
> So the problem is after I drop a node, everything looks great except for
> the _slony.sl_status table, in any or all the remaining nodes, still
> refers to the node that was just dropped.
>
> I did quite a few test runs of the drop node to try to reproduce and
> determine the cause. After the drop node, if I look in sl_node, sl_path,
> sl_event, or any other sl_ location, I see no reference to the third
> node. However, about half the time I would still get references to the
> third node in sl_status. This can either be on the master node, or the
> (remaining) slave node, or both. There was one test scenario in which I
> monitored the sl_status table and noticed that node 3 disappeared, then
> reappeared a second later, then remained.
>
> Example queries done on node 2 (slave) after dropping node 3 (other slave):
>
> postgres=# select * from _slony.sl_node;
>  no_id | no_active | no_comment | no_spool
> -------+-----------+------------+----------
>      1 | t         | Server 1   | f
>      2 | t         | Server 2   | f
> (2 rows)
>
> postgres=# select * from _slony.sl_path ;
>  pa_server | pa_client |                         pa_conninfo                         | pa_connretry
> -----------+-----------+--------------------------------------------------------------+--------------
>          1 |         2 | dbname=postgres host=172.16.44.111 port=5432 user=postgres  |           10
>          2 |         1 | dbname=postgres host=172.16.44.129 port=5432 user=postgres  |           10
> (2 rows)
>
> postgres=# select * from _slony.sl_status;
>  st_origin | st_received | st_last_event |      st_last_event_ts      | st_last_received |    st_last_received_ts     | st_last_received_event_ts  | st_lag_num_events |   st_lag_time
> -----------+-------------+---------------+----------------------------+------------------+----------------------------+----------------------------+-------------------+-----------------
>          2 |           1 |          1649 | 2010-05-10 15:53:16.245529 |             1649 | 2010-05-10 15:53:16.246212 | 2010-05-10 15:53:16.245529 |                 0 | 00:00:05.57205
>          2 |           3 |          1656 | 2010-05-10 15:54:26.280131 |             1636 | 2010-05-10 15:51:05.341512 | 2010-05-10 15:51:05.343754 |                20 | 00:03:22.66664

Since sl_status is a view, the crumbs must be somewhere else. I suspect
sl_confirm. There is probably some race condition between dropping the
node and receiving confirmations from the (now dropped) node.

We should add some extra cleanup to the cleanup thread as well as to
STORE NODE to make sure things are clean before someone recreates a
node with that id.
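
In the meantime, the leftovers can be checked for (and, if necessary,
removed) by hand. This is only a rough sketch, assuming the cluster
schema is "_slony" and the dropped node id is 3, both taken from the
example above; the DELETE is a manual workaround, not something slonik
does for you:

    -- look for confirm rows that still reference the dropped node 3
    SELECT con_origin, con_received, con_seqno, con_timestamp
      FROM _slony.sl_confirm
     WHERE con_origin = 3 OR con_received = 3;

    -- if those are the only remaining references, removing them on
    -- every surviving node should make sl_status forget about node 3
    DELETE FROM _slony.sl_confirm
     WHERE con_origin = 3 OR con_received = 3;

If sl_status goes back to listing only the surviving nodes after that,
it would confirm that the crumbs indeed live in sl_confirm.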

> Also, another problem that may be linked is the fact that the slon
> daemon for node 3 does not terminate itself afterward. Watching the log
> output by that daemon, it shows that it receives the drop node command
> for itself, and it drops the _slony schema as intended. However, after
> that it reports "2010-05-10 15:57:56 MDT FATAL main: Node is not
> initialized properly - sleep 10s" and keeps checking every ten seconds.
> I'm not sure if somehow this daemon is causing some post-drop-node
> entries in the sl_event section that cause the sl_status entry to be
> recreated.

Not sure if terminating that slon by default is the right answer. If you
are off by one with STORE NODE and starting the slon, you will pile up
sl_log data and clog the whole cluster until you notice (yes, there are
people who start the slon before running the STORE NODE).


Jan

-- 
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin