Christopher Browne cbbrowne
Wed Jan 26 16:55:08 PST 2005
David Parker wrote:

>We have slony working for basic backup from an "active" to a "standby" in a 2-node cluster, and it's great. I've been trying to leverage our slony infrastructure to support another kind of replication we need to do, but I'm beginning to think it's not a fit.
>
>Each cluster we have also needs to replicate some data from a central repository. I posted an earlier question on this because I need the active to forward some data to its standby, but the active also masters some data. Christopher Browne suggested defining separate slony clusters, which should work because there are distinct replication sets.
>
>After I spec'd out this topology I realized that I had been more or less in denial about the fact that our deployment over the next year or so will probably dictate up to 70 of these clusters needing to replicate from the central repository (we want to replicate out to these clusters as a caching/locality-of-reference thing).
>
>This obviously means that the listener connections will be out of control, which, I fear, pretty much kills the idea. BUT, I'm wondering if there is any way around having all of these listeners?
>
>For the set mastered on the central repository, NONE of the clusters will ever take over as master of that set - the central repository is itself a cluster. So in our environment, I'm not sure that every cluster (the active node) needs a listener to every other active node. As I understand it, these inter-subscriber connections are to make sure that SYNCs are up-to-date across the network, but if none of these can ever become master (dictated at the application level), can I get away without having the listeners for them?
>
>I would really like to be able to use slony for this part of our infrastructure, because it does exactly what we need, but I know we won't be able to support the number of database processes on each box that the default topology would require.
>
>Any comments/suggestions appreciated. Thanks.
>  
>
One of the "high priority items" (fairly urgent, and considered very 
important, at least here at Afilias) is the implementation of Log 
Shipping, which ought to be an answer to your problem.

The idea of Log Shipping is that you can have a node that, instead of 
syncing into a database, generates a series of files containing the SQL 
queries needed to do the updates.  That series of files may then be 
distributed to remote sites and applied to a database there, giving you 
a replica that is as up to date as the most recently applied file.
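
To make that concrete, here's a rough sketch of what the consuming end
might look like.  Since log shipping isn't implemented yet, everything
here is an assumption made purely for illustration: the spool directory,
the sync_NNNNNN.sql file naming, and the little state file that tracks
progress are not anything Slony-I defines today.  The shipped files just
get applied to the replica, strictly in order, with psql.

#!/usr/bin/env python
# Illustrative consumer for shipped SQL files.  All paths, file names,
# and connection details below are hypothetical.
import os
import re
import subprocess

SPOOL_DIR = "/var/spool/slony_logs"            # assumed: where shipped files land
STATE_FILE = os.path.join(SPOOL_DIR, ".last")  # assumed: records last applied seq

def last_applied():
    # Sequence number of the last file applied, or 0 if starting fresh.
    try:
        with open(STATE_FILE) as f:
            return int(f.read().strip())
    except (IOError, ValueError):
        return 0

def pending_files(after):
    # Shipped files are assumed to be named like sync_000123.sql; return those
    # with a sequence number greater than 'after', sorted so they apply in order.
    pat = re.compile(r"sync_(\d+)\.sql$")
    found = []
    for name in os.listdir(SPOOL_DIR):
        m = pat.match(name)
        if m and int(m.group(1)) > after:
            found.append((int(m.group(1)), os.path.join(SPOOL_DIR, name)))
    return sorted(found)

def apply_file(path):
    # Apply one file to the replica; ON_ERROR_STOP makes psql fail loudly,
    # so we never record progress past a file that didn't apply cleanly.
    subprocess.check_call(["psql", "-h", "replica-host", "-d", "replica",
                           "-v", "ON_ERROR_STOP=1", "-f", path])

if __name__ == "__main__":
    start = last_applied()
    for seq, path in pending_files(start):
        apply_file(path)
        with open(STATE_FILE, "w") as f:   # remember how far we got
            f.write(str(seq))

Run something like that from cron at each remote site after the files
have been copied over (rsync, FTP, whatever), and the replica stays as
current as the newest file it has received.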

One such "log shipping" node can feed as many replicas as you want it 
to.  There is no means of feeding anything back to the origin, but it 
can certainly scale.

So if what you want is to have (say) four reliable boxes at a central 
site that talk to one another, each of which could take over in a pinch, 
and then have a whole bunch of copies at different sites, log shipping 
would allow that to work.

It's not implemented yet, but our urgency level on it is rather high, so 
there ought to be something at least beta-like in the not-too-distant future.
-- 
<http://linuxdatabases.info/info/slony.html>

