Fri Dec 23 16:05:39 PST 2005
- Previous message: [Slony1-general] Bug in RebuildListenEntries (ID: 1485)
- Next message: [Slony1-general] Bug in RebuildListenEntries (ID: 1485)
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On 12/23/2005 4:57 AM, Florian G. Pflug wrote:
> Jan Wieck wrote:
>> On 12/21/2005 8:18 PM, Florian G. Pflug wrote:
>>
>>> Florian G. Pflug wrote:
>>> <snipped my own mail>
>>>
>>> Can anyone confirm that this is actually a bug? I'm pretty sure
>>> (I did multiple setups of my cluster, and the problem persisted -
>>> I used the altperl scripts for setting up the cluster, so I
>>> see no way I could have caused this).
>>>
>>> If it's really a bug, it would at least be worth a note in
>>> the docs or in the 1.1.5 release notes - it took me hours to
>>> nail down the problem, and it wasn't fun, so preventing
>>> others from having to do the same would be a good thing.
>>
>> Rebuild listen entries is indeed broken. This is a show stopper for
>> 1.1.5 ... I am working on it.
>
> Is there a reason for not generating all "sensible" sl_listen entries?
> I didn't find any documentation on the performance overhead a
> sl_listen entry causes.
The "sensible" part is exactly what matters. Problems arise
when a node receives an event from a set origin that has not yet
been processed by its data provider for that set. For example:
1 -> 2
3
Node 1 is the origin, 2 is a subscriber, and 3 is a new node that is not
subscribed yet. 3 has paths to 1 and 2, so naturally it would listen on
each of them for their events. If we now subscribe 3 as a cascaded node
with 2 as its data provider, the ENABLE_SUBSCRIPTION event that will
follow from node 1, on which node 3 will start copy_set, must be received
by 3 from 2. That is the only way to guarantee that 2 actually has the
data itself at the moment 3 starts to copy. Otherwise 2 could still be
busy with its own copy_set, meaning that not only is the data in the
tables missing, but the tables themselves aren't in sl_table yet either.
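To make the rule concrete, here is a minimal sketch in Python. It only
illustrates the reasoning above and is not the actual RebuildListenEntries
logic; the function name and both data structures are assumptions made up
for this example.

```python
def listen_provider(receiver, origin, paths, subscriptions):
    """Pick the node that `receiver` should listen on for `origin`'s events.

    paths:         set of (client, server) pairs, loosely like rows of sl_path
    subscriptions: dict mapping (origin, receiver) -> data provider
    Both structures are hypothetical simplifications.
    """
    provider = subscriptions.get((origin, receiver))
    if provider is not None:
        # Subscribed: only the data provider is guaranteed to have
        # processed the origin's events before the receiver sees them.
        return provider
    if (receiver, origin) in paths:
        # Not subscribed: listening directly on the origin is harmless.
        return origin
    return None


paths = {(2, 1), (3, 1), (3, 2)}
before = {(1, 2): 1}              # 2 subscribes from 1, 3 is not subscribed
after = {(1, 2): 1, (1, 3): 2}    # 3 is subscribed with 2 as its provider

print(listen_provider(3, 1, paths, before))  # 1
print(listen_provider(3, 1, paths, after))   # 2
```

Once node 3 is subscribed through node 2, the direct path from 3 to 1 still
exists, but it is no longer used for events from node 1.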
And to spice this up a little more, reading the events is done
asynchronously in the remote_listen thread. They are queued, and the
remote_worker thread processes them from the queue. At the moment node 3
gets the SUBSCRIBE_SET event, it will have a lot of stuff already queued,
so it had better restart ASAP to throw that away and listen again, this
time for all events from node 1 via node 2.
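A toy model of that restart, again in Python and again only a sketch: the
Listener class and its methods are hypothetical and merely mimic the split
between receiving events into a queue and processing them later.

```python
from collections import deque


class Listener:
    """Toy stand-in for the queue between listening and processing."""

    def __init__(self, providers):
        # providers: dict mapping origin -> node we listen on for its events
        self.providers = dict(providers)
        self.queue = deque()

    def receive(self, origin, via, seqno):
        """Queue an event only if it arrives via the configured provider."""
        if self.providers.get(origin) != via:
            return False
        self.queue.append((origin, via, seqno))
        return True

    def restart(self, providers):
        """Throw away everything queued so far and listen afresh."""
        self.queue.clear()
        self.providers = dict(providers)


node3 = Listener({1: 1, 2: 2})        # before subscribing: listen directly
node3.receive(1, via=1, seqno=100)
node3.receive(1, via=1, seqno=101)

node3.restart({1: 2, 2: 2})           # now node 1's events come via node 2
print(len(node3.queue))               # 0
print(node3.receive(1, via=1, seqno=102))  # False
print(node3.receive(1, via=2, seqno=102))  # True
```

Clearing the whole queue is the blunt option described above; nothing
received over the old listen setup survives the restart.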
Jan
>
> With "sensible" I mean: Telling node X via sl_listen to ask neighbour-nodes
> (Those for which a sl_path entry exists) for events from all other nodes,
> apart from those for which the events must have travelled via node X to
> reach the neighbour of X in question.
>
> I tried writing an algorithm to do that, but it turned out that it isn't quite
> as easy as I initially believed, because all "iterative" algorithms
> I could think of (which were all basically based on the idea that
> if X receives events from Y, and Y from Z, then X can receive events from Z
> via Y) failed because there is not enough information in sl_listen to figure
> out whether Y already needs X to receive events from Z.
>
> greetings, Florian Pflug
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck at Yahoo.com #