Sat Feb 23 08:30:28 PST 2008
- Previous message: [Slony1-general] STILL can't migrate a node.
- Next message: [Slony1-general] STILL can't migrate a node.
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On 2/23/2008 12:20 AM, Craig James wrote:
> A little more info on this problem...
>
> Craig James wrote:
>> I'm trying to migrate a node for the second time, and no luck. Last
>> time I tried it, it just got stuck, and due to lack of time, I didn't
>> investigate.
>>
>> This time I watched -- it got stuck again, doing some sort of huge
>> SELECT statement. I was under the impression that migrating a node was
>> a fairly simple operation that should happen in a short time (less than
>> a minute?) even for large databases.
>>
>> I waited 10 minutes, during which the entire system was completely
>> locked up (no other process could access the database), and our web site
>> was offline. I finally had to kill all of the slon daemons and kill
>> Postgres to get our site back on the air, then run the node-unlock
>> command to get Slony back in shape.
>>
>> This system appears to otherwise be working well. I can insert, update
>> and delete records, and they're copied to the slave node immediately.
>>
>> What's up? Am I just too impatient?
>
> I tried it again, after vacuuming the slony tables that are subject to bloat. This time I shut everything off, started the migration of the master to node 2, and waited for 35 minutes, but the SELECT never finished. vmstat showed massive I/O and CPU activity the whole time.

What SELECT are you referring to? I don't see where in the MOVE SET you have to perform any SELECT.

> Again, after I killed postgres, restarted, and unlocked the node, Slony went back to performing perfectly.

Killing postgres is a bad idea. Stop that habit right now, before you physically corrupt any of your databases.

Anyhow, apparently the LOCK SET part of the process succeeds. So what I now assume is that the WAIT FOR EVENT never finishes. First, you don't need a WAIT FOR EVENT between LOCK SET and MOVE SET. Both events are executed on the origin, so by the time the LOCK SET finishes, everything is ready for the MOVE.
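
For illustration, a minimal sketch of a slonik script that issues LOCK SET followed directly by MOVE SET, with no WAIT FOR EVENT in between. The cluster name, set id, database name, host names, and user in the conninfo strings are all placeholders and not taken from this thread; only the direction of the move (node 1 to node 2) follows the description above. The shell function only prints the script, so nothing touches a cluster until the output is piped into slonik.

```shell
#!/bin/sh
# Sketch only: every value below is a hypothetical placeholder.
CLUSTER="mycluster"   # assumed cluster name
SET_ID=1              # assumed replication set id
OLD_ORIGIN=1          # current origin node
NEW_ORIGIN=2          # node that should become the new origin

# Print a slonik script that locks the set on its origin and then moves it.
# No WAIT FOR EVENT is placed between the two commands.
move_set_script() {
    cat <<EOF
cluster name = $CLUSTER;
node $OLD_ORIGIN admin conninfo = 'dbname=mydb host=node1 user=slony';
node $NEW_ORIGIN admin conninfo = 'dbname=mydb host=node2 user=slony';

LOCK SET (ID = $SET_ID, ORIGIN = $OLD_ORIGIN);
MOVE SET (ID = $SET_ID, OLD ORIGIN = $OLD_ORIGIN, NEW ORIGIN = $NEW_ORIGIN);
EOF
}

# To apply it against a real cluster, pipe the output into slonik:
#   move_set_script | slonik
```

Printing the script first makes it easy to review the exact commands before running them against a production origin.
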
But what this indicates is that node 2 never confirms the LOCK SET. Can it be that you actually have a problem with the connection from node 2 to node 1? What is the content of the view sl_status on both nodes?

If you want to speed up this communication in order to meet your Sat. noon deadline, I'll be available on IRC, channel #slony on freenode.

Jan

--
Anyone who trades liberty for security deserves neither liberty nor security. -- Benjamin Franklin
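
For illustration, one way to collect the sl_status contents asked for above. This is a sketch under assumptions: the cluster name, database name, and host names are placeholders, and the query relies on Slony-I keeping its objects in a schema named after the cluster with a leading underscore. The function only builds the query text; the psql invocations are shown as comments.

```shell
#!/bin/sh
# Sketch only: cluster, database, and host names are hypothetical placeholders.
CLUSTER="mycluster"

# Build the query against the cluster's schema ("_" followed by the cluster name).
status_query() {
    printf 'SELECT * FROM "_%s".sl_status;\n' "$CLUSTER"
}

# Run it on both nodes and compare the output:
#   psql -h node1 -d mydb -c "$(status_query)"
#   psql -h node2 -d mydb -c "$(status_query)"
```
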