Christopher Browne cbbrowne
Wed Mar 1 15:22:10 PST 2006
Rod Taylor wrote:

>>The one "grand challenge" you'll face is that getting the subscription
>>going, with 224GB of data, will take quite a while, which will leave
>>transactions open for quite a while.
>>
>
>It helps if you subscribe one table at a time and merge them into an
>existing set.
>
>So, Create set, add table to set, wait..., merge set.  Repeat for each
>table.
>
I'd be inclined to wait 'til the end and merge them all, but that's just
me...

>This makes the largest transaction time the largest table, rather than
>the time it takes to copy everything across.
>
Indeed.
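
As a rough illustration of that point, here is a minimal Python sketch.
The table names and copy times are invented for illustration, not
measurements from any real cluster; it only shows that the longest open
transaction shrinks from the sum of the copy times to the largest single
copy time.

```python
# Hypothetical per-table copy times in minutes (made-up numbers).
copy_minutes = {"big_table": 180, "medium_table": 45, "small_table": 5}


def longest_txn_single_subscription(times):
    """One subscription copies every table inside a single transaction."""
    return sum(times.values())


def longest_txn_per_table(times):
    """One subscription, and so one transaction, per table."""
    return max(times.values())


all_at_once = longest_txn_single_subscription(copy_minutes)
one_by_one = longest_txn_per_table(copy_minutes)
print(f"all tables in one set: {all_at_once} min")
print(f"one table per set:     {one_by_one} min")
```

Note that the total elapsed time is about the same either way in this
model; only the longest single transaction gets shorter.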

>In fact, is there a reason that Slony doesn't do this by default?  Just
>change ADD TABLE to spit out the 3 step process in all circumstances
>using a set of temporary set IDs (sequence that wraps between 2^31 and
>2^32 or something).
>
Hmm.  That's an interesting thought.

The trouble is that this only works properly in the 2 node case.

In effect, this temporarily takes the new table out of the planned set,
stows it in a new one, subscribes the new one, and then cleans up.
Alas, if you're subscribing the *third* node, this means you're
repeatedly taking tables out of a set and putting them back in.  I think
the semantics of it break down at that point.

It's fine if this process is invisible to the other nodes, but, if you
are opening a new transaction to handle each table, that is manifestly
NOT the case; there *are* multiple transactions, and some SYNCs
may need to get through in the interim between processing tables.

Don't interpret this as pooh-poohing the idea entirely; there may be
solutions to the problems.  It's just that it's not as slick as it seems
at first, and there are problems to be solved before it could be usable. 

What thinking I've done so far is, alas, all pointing towards this being
unworkable :-(.  The other BIG trouble is that it eliminates the notion
that the set can be considered consistent as soon as it is available.
For instance, if we set up a replica of one of our registries, as soon
as the SUBSCRIBE_SET event completes, you can immediately query it
and expect to have results that are internally consistent with the state
of the "master" at some point in time.  Doing tables incrementally loses
that property, I think...
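
The consistency worry can be sketched with a toy Python model.  Everything
here is hypothetical: the table names, the row counts, and the simplifying
assumption that a table not yet subscribed is simply empty on the
replica.  It is not how Slony represents anything internally; it only
shows that a partially subscribed replica need not match any state the
"master" ever had.

```python
# Hypothetical toy model: the "master" is a list of snapshots, one per
# point in time, each mapping table name -> row count.
master_history = [
    {"domains": 100, "contacts": 100},  # t0
    {"domains": 101, "contacts": 101},  # t1: a domain and its contact added
    {"domains": 102, "contacts": 102},  # t2
]


def replica_all_at_once(history, t):
    """Whole set copied in one transaction: replica equals master at t."""
    return dict(history[t])


def replica_incremental(history, subscribed, t):
    """Only already-subscribed tables are populated (and current) at t."""
    return {name: (rows if name in subscribed else 0)
            for name, rows in history[t].items()}


def matches_some_master_state(replica, history):
    """True if the replica equals the master at some point in time."""
    return any(replica == snapshot for snapshot in history)


whole = replica_all_at_once(master_history, 1)
partial = replica_incremental(master_history, {"domains"}, 1)
print(matches_some_master_state(whole, master_history))
print(matches_some_master_state(partial, master_history))
```

In this model the all-at-once replica matches the master at t1, while the
half-subscribed one (domains present, contacts still empty) matches no
master state at all.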


