cbbrowne at ca.afilias.info
Mon Apr 11 14:19:50 PDT 2005
> Does anyone have any #s on how fast Slony works?
>
> On all my current uses, there isn't a heavy volume of
> 'insert/update/delete's happening, so of course it never falls behind ...
>
> If one were to have a 'write master' with a bunch of 'read subscribers', I
> imagine there is a point where the writes to the subscribers would start
> to get delayed, right?  Has anyone hit such a limit?

I haven't seen that limit; I would expect it to be surprisingly high, and
I know of a useful workaround.

I'd expect the "point" to be surprisingly high because the data being sent
over to the subscribers would remain in the shared memory cache and so be
_extremely_ accessible.

The totally obvious workaround is to use some form of "star" configuration
for the subscribers.  Rather than having them all hit the origin, I'd have
just a few hit the origin, and then have a (perhaps large) set of cascaded
subscribers.

Thus, if I wanted 125 subscribers (a silly number, to be sure), I would
set up 5 of them to subscribe directly to the origin.  Those 5 would
then together act as providers for another 25, each of them "feeding" 5
second-level nodes.  The remaining 95 would subscribe to one of those
second-level nodes.  (And note that when updates get pushed to the 2nd
and 3rd level providers, the relevant data will be in the shared memory
cache, so feeding the lower level subscribers should take place with
gratifying efficiency...)
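
A minimal slonik sketch of the first two tiers (node and set numbers
are hypothetical, and the usual preamble naming the cluster and the
admin conninfo for each node is omitted; the essential part is
"forward = yes" on the middle-tier subscriptions, so those nodes keep
log data and can act as providers downstream):

  # Node 1 is the origin of set 1; nodes 2..6 form the first tier.
  subscribe set (id = 1, provider = 1, receiver = 2, forward = yes);
  # ... likewise for receivers 3 through 6 ...

  # Second-tier nodes subscribe to a first-tier node, not the origin.
  subscribe set (id = 1, provider = 2, receiver = 7, forward = yes);
  subscribe set (id = 1, provider = 2, receiver = 8, forward = yes);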

We use much the same approach (albeit with rather fewer than 126 nodes!) to
minimize the impact of replication work on our origin nodes.  The "master"
is the 'transaction monster'; as much as possible, queries hit other
nodes.

All that being said, having 126 nodes would lead to enormous traffic
in event confirmations, which would be plenty expensive: every node
needs to see confirmations of each event from every other node, so
that traffic grows roughly with the square of the number of nodes.

Furthermore, it's pretty easy to get Slony-I to "bog down"; you just
need to do heavy updates that get applied "en masse", as with:

  update some_replicated_table set updated_on = now() where id in
     (select id from some_replicated_table limit 5000);

That one query creates 5000 updates, which, from Slony-I's perspective,
are 5000 _individual_ updates.  5000 entries in sl_log_1, and such.
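
If you want to watch that backlog, the Slony-I tables live in a schema
named after the cluster with a leading underscore; assuming a cluster
called "mycluster" (a hypothetical name), something like this shows
how much is queued and what each node has confirmed:

  -- rows still waiting to be replicated
  select count(*) from _mycluster.sl_log_1;

  -- highest event sequence each node has confirmed from each origin
  select con_origin, con_received, max(con_seqno)
    from _mycluster.sl_confirm
   group by con_origin, con_received;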

The number of nodes is pretty much irrelevant to the matter; the
"cascading methodology" I pointed out would work (as well as it gets!)
to push the updates downstream.  And the "bogging down" takes place
quite readily even with just one origin and one subscriber.
-- 
<http://linuxdatabases.info/info/slony.html>


