Craig James craig_james at emolecules.com
Wed Apr 2 23:00:34 PDT 2008
Jeff Frost wrote:
> On Wed, 2 Apr 2008, Craig James wrote:
> 
>> Jeff Frost wrote:
>>> On Wed, 2 Apr 2008, Craig James wrote:
>>>
>>>> I keep getting this error:
>>>>
>>>>  remoteListenThread_1: db_getLocalNodeId() returned 2 - wrong database?
>>>>
>>>> I suspect the slon daemon can't handle port numbers correctly.  In 
>>>> this case, both the master and slave databases have the same name, 
>>>> AND the same host. The only difference is the port number.
>>>>
>>>> When I start slon, it seems to try to figure out the node number 
>>>> from the combination of dbname+host, but it seems to ignore the port 
>>>> number, so it arbitrarily will get node 1 or node 2.
>>>>
>>>> (In case you're wondering, the second database is actually on a 
>>>> different machine, accessed via an encrypted ssh(1) tunnel that maps 
>>>> port 5433 on the local machine to port 5432 on the remote machine.  
>>>> So the two databases are accessed by the same name and same host, 
>>>> with only the port to distinguish them.)
>>>>
>>>> Is my analysis correct?  How does the slon daemon figure out which 
>>>> node number it is working on?  There doesn't seem to be any way to 
>>>> tell it.
>>>
>>> It calls the getlocalnodeid(name) function on the DB specified in the 
>>> conninfo when it's started.
>>>
>>> How are you providing the conninfo to the slon daemons?
>>
>> The same string I used to create the nodes, for example, here are the 
>> master and slave, respectively:
>>
>> Node 1 (master): slon -d 1 my_db_cluster 'dbname=my_db host=au 
>> user=postgres port=5433'
>> Node 2 (slave):  slon -d 1 my_db_cluster 'dbname=my_db host=au 
>> user=postgres'
>>
>> I also tried to make up a dummy hostname "au2", just to make them look 
>> different.  After deleting both nodes and rebuilding them with the 
>> different server names:
>>
>> Node 1 (master): slon -d 1 my_db_cluster 'dbname=my_db host=au2 
>> user=postgres port=5433'
>> Node 2 (slave):  slon -d 1 my_db_cluster 'dbname=my_db host=au 
>> user=postgres'
>>
>> but I still get the error.  This is a bit baffling.  I can't really 
>> see what's different from the other Slony configurations I've set 
>> up, other than that I've never used port numbers before.
> 
> I've replicated to a different PostgreSQL instance on the same host in 
> the past, so I would expect your setup to work fine.
> 
> Don't you have to connect to localhost port 5433 for the ssh port 
> forward to work or does host 'au' resolve to 127.0.0.1?

Yes, that's right, it resolves to 127.0.0.1, for the reason you mention.
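(For reference, a tunnel like the one described above can be set up roughly as follows. The remote hostname, the user, and the hosts entry are illustrative placeholders, not the actual setup:)

```shell
# Forward local port 5433 to port 5432 on the remote machine,
# so the "slave" database is reachable as localhost:5433.
# 'remote-host' is a placeholder for the real remote machine.
ssh -N -f -L 5433:localhost:5432 postgres@remote-host

# 'au' resolves to 127.0.0.1, e.g. via an /etc/hosts line like:
#   127.0.0.1   au
```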


> What does the output of this look like:
> 
> netstat -an | grep 543[23]

A whole bunch of ports in use (there are a lot of persistent mod_perl connections from Apache).

> and better yet, can you connect to both of them on the machine where the 
> slons will run like this:
> 
> psql -h au -p 5433 -U postgres my_db
> psql -h au -p 5432 -U postgres my_db
> 
> and if so, what's the output of:
> 
> select _my_db_cluster.getlocalnodeid('_my_db_cluster');
> 
> on both those?

The first one (the master) reports 1, and the second one (the slave) reports 2, as expected.

So I guess the question is: why does slon think it is on the wrong node?  If it connects and calls getlocalnodeid() as you say, what makes it decide that node 2 is the wrong one?
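(To make the suspected failure mode concrete: if slon identified the local node by dbname+host alone, the two conninfo strings above would be indistinguishable, while including the port, with libpq's default of 5432 when unspecified, tells them apart. A minimal sketch; the parsing helper is illustrative, not slon's actual code:)

```python
# Sketch: the two conninfo strings from this thread collide if keyed
# on (dbname, host) only, but are distinct once the port is included.
# parse_conninfo is an illustrative helper, not slon's implementation.

def parse_conninfo(conninfo):
    """Parse a libpq-style 'key=value key=value' string into a dict."""
    return dict(kv.split("=", 1) for kv in conninfo.split())

master = parse_conninfo("dbname=my_db host=au user=postgres port=5433")
slave = parse_conninfo("dbname=my_db host=au user=postgres")

def key_without_port(c):
    # The hypothesized bug: node identity keyed on dbname+host only.
    return (c["dbname"], c["host"])

def key_with_port(c):
    # Including the port (libpq defaults to 5432) distinguishes them.
    return (c["dbname"], c["host"], c.get("port", "5432"))

print(key_without_port(master) == key_without_port(slave))  # True
print(key_with_port(master) == key_with_port(slave))        # False
```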

Thanks,
Craig