Steve Singer ssinger_pg at sympatico.ca
Tue Dec 28 10:24:23 PST 2010
On Thu, 23 Dec 2010, Zhidong She wrote:

> The cached query plan is not the same as cached data, right? I think the data (node id) should be
> new, since I totally destroyed the cluster and rebuilt it.

Right, the plan is cached, not the data.
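
For illustration, a rough psql sketch of what you would see (the node ids
here just follow your scenario):

    -- old session, opened before the cluster was rebuilt:
    select _cluster.getlocalnodeid('_cluster');  -- still returns 2 (served from the stale cached plan)

    -- new session, opened after the rebuild:
    select _cluster.getlocalnodeid('_cluster');  -- returns 1

Closing and reopening the old connection gives it a fresh backend, so it
picks up the new value as well.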

> 
> The purpose of uninstalling and rebuilding the cluster is that we want to add the failed node back
> to the cluster as a new slave node rather than as the original master.
> Is there another method to rejoin the cluster?

You can drop the old node with the DROP NODE command, create a new node 
with STORE NODE, replicate to the new node, and then use MOVE SET to move 
the master to the restored node.
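
Roughly, a slonik sketch (the cluster name, node ids, and conninfo strings
are placeholders you would adapt; in your scenario host B is the surviving
origin):

    cluster name = mycluster;
    node 2 admin conninfo = 'dbname=mydb host=hostB user=slony';
    node 3 admin conninfo = 'dbname=mydb host=hostA user=slony';

    # drop what is left of the failed node (if failover has not already
    # done so) and bring host A back as a brand-new node
    drop node (id = 1, event node = 2);
    store node (id = 3, comment = 'host A rejoining', event node = 2);
    store path (server = 2, client = 3, conninfo = 'dbname=mydb host=hostB user=slony');
    store path (server = 3, client = 2, conninfo = 'dbname=mydb host=hostA user=slony');
    subscribe set (id = 1, provider = 2, receiver = 3, forward = yes);

    # once node 3 has caught up, move the origin over:
    lock set (id = 1, origin = 2);
    move set (id = 1, old origin = 2, new origin = 3);

Note that MOVE SET requires both nodes to be up and the subscriber caught
up; that is what distinguishes it from FAILOVER.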



> 
> On Wed, Dec 22, 2010 at 10:57 PM, Steve Singer <ssinger at ca.afilias.info> wrote:
>       On 10-12-22 02:45 AM, Zhidong She wrote:
>             Hi all,
> 
>
>       The old connection has a cached query plan for the getLocalNodeId() function.
>
>       If you uninstall a Slony node, any database connections that were open from
>       beforehand might not behave as expected.
> 
> We are using Slony-I 2.0.5 to build our master/slave postgres HA solution
> for our project. So far Slony-I works fine except for a strange
> getlocalnodeid issue. Could somebody please help me figure this out?
> 
> The scenario is: we have host A as master (defined as node 1 in slonik) and
> host B (node 2) as slave at the beginning. The data was replicated to host B
> from A by slony.
> Then we stop postgres on host A, and on host B run the failover command and
> drop node A from the cluster. Thus the db service was switched to host B.
> 
> Establish a connection to host B via psql and execute "select
> _cluster.getlocalnodeid('_cluster')"; it returns 2. Keep this connection
> open for a while.
> 
> So far everything is good.
> 
> Stop all slon processes on hosts A and B.
> Then start the postgres server on host A, and on both host A and host B
> execute "drop schema _cluster cascade" to clean out all the cluster information.
> 
> Build the cluster again, this time with host B as master (node 1) and host A
> as slave (node 2), and start slon on both nodes.
> 
> Set up a new connection to host B and execute the same "select
> _cluster.getlocalnodeid('_cluster')"; it returns 1.
> But in the old connection we kept open, this command still returns 2.
> At the same time, in the old connection, update/insert/delete operations
> are not synchronized to host A.
> It's strange: the cluster was rebuilt and slon was restarted, so why does
> only the old connection still report localnodeid 2, and why can its changes
> not be resynchronized?
> 
> Any ideas?
> 
> Thanks very much for your time!
> 
> Br,
> Zhidong
> 
> _______________________________________________
> Slony1-general mailing list
> Slony1-general at lists.slony.info
> http://lists.slony.info/mailman/listinfo/slony1-general


