Mon Apr 21 09:24:06 PDT 2008
- Previous message: [Slony1-general] Slony Replication in wide-area applications?
- Next message: [Slony1-general] Network Partitioning (was: Slony Replication in wide-area applications?)
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Vivek Khera <vivek at khera.org> writes:
> On Apr 21, 2008, at 9:05 AM, Bill Moran wrote:
>
>> I don't understand the question.  What do you mean by "network
>> partition" and how does this represent a failure scenario?
>
> Normally all hosts can see every other one.  When your network is
> partitioned, you end up with at least two subsets which can still see
> every other host within that subset, but none of the hosts in the
> other subset(s).

Slony-I can work fine with network configurations that have such
partitions, where you have clusters[1] of nodes in LANs and only
limited communication across a WAN.

Consider:

 - Nodes 1-3 are in a LAN at Data Centre A
 - Nodes 4-6 are in a LAN at Data Centre B

There are constraints...

 - 1-3 can easily "talk amongst themselves."
 - Likewise, 4-6 can easily "talk amongst themselves."
 - However, we pick #3 and #4 as being the only nodes that are allowed
   to talk with one another across the WAN

There are configurations you cannot create, in such a case:

 - You cannot have any configuration where nodes 5 or 6 subscribe
   directly to 1-3; they *MUST* go through node 4
 - Likewise, you cannot have any configuration where nodes 1 or 2
   subscribe directly to 4-6; they *MUST* go through node 3

I don't think network partitioning represents a particularly
compelling problem.  The essential WAN problems are three-fold:

 - If the WAN is flaky, a frequently observed problem is that
   connections will have failed, but the database connection doesn't
   actually get dropped by the DB server until a TCP/IP timeout takes
   place, which often takes 2-3 hours.

   During that time, attempts by a slon to reconnect will be rebuffed
   because the old connection is still there, even though there is no
   way for it to be used.

   This is something of a moral equivalent to a zombie process; the
   old DB connection is unusable, but doesn't know it's dead.

   There's probably some way to automate cleaning the old connection
   out, though it's not something Slony-I could do itself, and I
   haven't tried constructing such a cleanup process.

 - If the WAN is sufficiently flaky, it may be problematic to keep a
   transaction running across the WAN for long enough to get a
   subscription going.

 - If the WAN is sufficiently flaky, then you may not have enough
   network bandwidth to keep a replica fed.

(Those represent three problems that are different from one another in
their essences...)

Footnotes:
[1]  In this case, "cluster" isn't in the Slony-I sense, but rather
simply "a bunch of nodes."

-- 
(format nil "~S@~S" "cbbrowne" "cbbrowne.com")
http://linuxfinances.info/info/sgml.html
"I think you ought to know I'm feeling very depressed"
-- Marvin the Paranoid Android
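
As an illustration of the subscription constraints in the six-node
example above, here is a minimal sketch in Python.  It is not Slony-I
code and does not use any Slony-I API; the names LAN_A, LAN_B,
WAN_LINK, can_connect and can_subscribe are hypothetical, chosen only
to model "a node can subscribe directly to a provider only if the two
can reach each other."

```python
# Hypothetical model of the two-data-centre layout from the message.
# None of these names come from Slony-I; they are for illustration only.
LAN_A = {1, 2, 3}             # nodes in the LAN at Data Centre A
LAN_B = {4, 5, 6}             # nodes in the LAN at Data Centre B
WAN_LINK = frozenset({3, 4})  # the only pair allowed to talk across the WAN


def can_connect(a: int, b: int) -> bool:
    """Return True if nodes a and b can open a connection to each other."""
    if a == b:
        return False
    pair = {a, b}
    same_lan = pair <= LAN_A or pair <= LAN_B
    return same_lan or frozenset(pair) == WAN_LINK


def can_subscribe(receiver: int, provider: int) -> bool:
    """Assume a direct subscription needs a direct connection between nodes."""
    return can_connect(receiver, provider)


if __name__ == "__main__":
    print(can_subscribe(5, 4))  # node 5 fed by node 4, inside the same LAN
    print(can_subscribe(5, 1))  # node 5 fed directly by node 1, across the WAN
    print(can_subscribe(4, 3))  # node 4 fed by node 3, over the chosen WAN pair
```

Under this model, nodes 5 and 6 can only be fed by way of node 4, and
nodes 1 and 2 only by way of node 3, which matches the lists of
disallowed configurations above.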