Christopher Browne cbbrowne
Wed Jul 27 18:02:31 PDT 2005
Andreas Pflug wrote:

> Christopher Browne wrote:
>
>> "Andy" <frum at ar-sd.net> writes:
>>
>>> Thank you all for replying.
>>>
>>> At least I have a better image about my "problem"... it really becomes
>>> a big one :(.
>>>
>>> I also looked over other replication systems, but the problem still
>>> remains open.
>>
>> Yes, this indicates that there is room for someone to implement an
>> asynchronous multimaster replication system with conflict resolution.
>>
> Better than conflict resolution is conflict avoidance. This is
> dependent on the application, and might work if separate number/id
> spaces for all objects may be defined. Situations as described will
> always need adapted apps and organizational prerequisites. Multimaster
> replication with conflict resolution is much of an illusion (... for
> any database system. Difference to pgsql is other rdbms' marketing
> people can lie better :-)

Well, yes, you certainly want to avoid conflicts altogether if you can.

Implementing some form of "distributed sequence" where each node uses
separate ranges is a reasonable solution to those conflicts that would
arise as a result of "internal interlinkages".

That way, you might, instead of using the default "nextval('my_seq')" to
fill in 'linkage' fields, use the default "dist_nextval('my_dist_seq')",
where you have some extra functionality hiding behind the scenes so that
the "distributed sequence" called my_dist_seq assigns different value
ranges on one host as compared to another.
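To make the idea concrete, here is a rough sketch in Python rather than in the database itself. Everything in it is hypothetical: the DistSequence class, the block_size parameter, and the way blocks are interleaved are my own assumptions for illustration, not anything Slony or PostgreSQL provides. Each node takes every Nth block of values, so two nodes can never hand out the same number.

```python
class DistSequence:
    """Hypothetical per-node sequence that hands out values from disjoint blocks."""

    def __init__(self, node_id, node_count, block_size=1000):
        if not 0 <= node_id < node_count:
            raise ValueError("node_id must be in [0, node_count)")
        self.node_id = node_id
        self.node_count = node_count
        self.block_size = block_size
        self.block = 0    # number of blocks this node has used up so far
        self.offset = 0   # position inside the current block

    def nextval(self):
        # Each round of node_count blocks gives every node exactly one block.
        base = (self.block * self.node_count + self.node_id) * self.block_size
        value = base + self.offset + 1
        self.offset += 1
        if self.offset == self.block_size:
            # Current block is exhausted; move on to this node's next block.
            self.block += 1
            self.offset = 0
        return value


# Two nodes with a deliberately tiny block size so the ranges are easy to see.
node1 = DistSequence(node_id=0, node_count=2, block_size=3)
node2 = DistSequence(node_id=1, node_count=2, block_size=3)
ids1 = [node1.nextval() for _ in range(5)]
ids2 = [node2.nextval() for _ in range(5)]
```

With these settings node1 draws from 1-3 and then 7-9, while node2 draws from 4-6 and then 10-12, so the two lists never overlap. A real implementation would also have to persist its position and cope with nodes being added later, which this sketch ignores.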

Based on what I have heard, I think Slony-II could benefit from that;
the idea is hardly confined to being useful in one place.

However, there will be conflicts that you cannot avoid.

For instance, there is the conflict that someone on node #1 draws down
inventory of product ABC by 20 units, whilst, concurrently, someone on
node #2 draws down inventory on the same product by 30 units, when the
inventory started at 40 units, and there is a rule that inventory cannot
fall below zero.  Separate "sequence spaces" do nothing to help with that.
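The arithmetic is worth spelling out. In the sketch below, draw_down is a made-up helper standing in for whatever rule the application enforces; the point is only that each node's update is valid against its own copy, and the problem appears when both are replayed against one copy.

```python
def draw_down(on_hand, units):
    """Apply a draw-down locally, enforcing the 'never below zero' rule."""
    if units > on_hand:
        raise ValueError("inventory would fall below zero")
    return on_hand - units


start = 40
node1_view = draw_down(start, 20)   # accepted on node #1, leaves 20
node2_view = draw_down(start, 30)   # accepted on node #2, leaves 10

# Replaying both accepted updates against a single copy breaks the rule:
# 40 - 20 - 30 would be -10.
try:
    draw_down(draw_down(start, 20), 30)
    conflict = False
except ValueError:
    conflict = True
```

Both updates were legitimate where they were made, which is exactly why no amount of sequence cleverness can prevent this.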

Essentially equivalent is the conflict that someone on node #1 updates
the phone number of customer XYZ whilst concurrently someone else on
node #2 makes a different update to the phone number for the same customer.
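One way such a conflict could be detected (a sketch under my own assumptions, not how any particular replication system does it) is to have each update carry the value it was based on. The Update record, the conflicts function, and the phone numbers below are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Update:
    node: str
    key: str         # which customer row was changed
    old_value: str   # value the node saw before making its change
    new_value: str   # value the node wrote


def conflicts(a, b):
    """Two updates conflict if they start from the same value of the same
    row but disagree about what the new value should be."""
    return (a.key == b.key
            and a.old_value == b.old_value
            and a.new_value != b.new_value)


u1 = Update("node1", "XYZ", "555-0100", "555-0111")
u2 = Update("node2", "XYZ", "555-0100", "555-0122")
is_conflict = conflicts(u1, u2)
```

Detecting the conflict is the easy part. Deciding which phone number is right needs knowledge the replication system does not have.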

Those are two examples of application-driven conflicts.

Some clever modifications to sequence handling may allow you to avoid
having the replication system itself introduce conflicts, which is well
and good, but it does nothing to resolve the sorts of "application-driven"
conflicts described above.

If I understand things correctly, one of Oracle's multimaster
replication systems will queue up updates and stop the queue when it
finds conflicts.  The DBA then has to resolve the conflicts.  (They may
have some way to automate parts of it; I'm not sure.)
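As I understand the "queue up and stop" behaviour, it could look roughly like the following. This is a sketch of the general idea only, not Oracle's actual mechanism or API; the drain function, the tuple layout, and the sample data are assumptions of mine.

```python
from collections import deque


def drain(state, queue):
    """Apply queued updates in order and halt at the first conflict.

    Each queued item is (key, expected_old_value, new_value). The
    conflicting item and everything behind it stay in the queue.
    """
    applied = []
    while queue:
        key, expected_old, new = queue[0]
        if state.get(key) != expected_old:
            break                  # conflict: stop and wait for the DBA
        state[key] = new
        applied.append(queue.popleft())
    return applied


state = {"XYZ": "555-0100", "ABC": "555-0200"}
queue = deque([
    ("XYZ", "555-0100", "555-0111"),  # applies cleanly
    ("XYZ", "555-0100", "555-0122"),  # conflict: row no longer holds 555-0100
    ("ABC", "555-0200", "555-0233"),  # harmless, but stuck behind the conflict
])
applied = drain(state, queue)
```

Note that the third update is perfectly fine on its own and still does not get applied, which is the price of keeping updates strictly ordered.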

SAP R/3 had an interesting way of handling this sort of thing; you could
group its "BDC" updates together and try to run a batch of transactions
through.  Those transactions that failed would be left; you could try to
walk them through manually, step by step.  That was pretty neat as it
would allow someone with application knowledge to process errors rather
than forcing it on the "techies" who might not be familiar with the
business transactions involved.  But that's a light-year away from being
something we could add to a replication system...
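For contrast with stopping the whole queue, the batch approach might be sketched like this. It is not SAP's BDC interface, just the shape of the idea; run_batch, the product "DEF", and the quantities are made up for the example.

```python
def run_batch(inventory, batch):
    """Try every transaction in the batch; return the ones that failed,
    each with a reason, so someone can walk through them by hand later."""
    failed = []
    for product, units in batch:
        on_hand = inventory.get(product, 0)
        if units > on_hand:
            failed.append((product, units, "inventory would fall below zero"))
            continue               # leave it for manual processing, keep going
        inventory[product] = on_hand - units
    return failed


inventory = {"ABC": 40, "DEF": 5}
batch = [("ABC", 20), ("ABC", 30), ("DEF", 5)]
failed = run_batch(inventory, batch)
```

Here the second draw-down on ABC is set aside while the DEF transaction still goes through, and the failed list is what would be handed to someone who knows the business side.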


More information about the Slony1-general mailing list