[Slony1-general] Odd slony problem

Sun Mar 13 19:32:12 PST 2005

At 20:00 12/03/2005 -0500, you wrote:
> > And as far as I understand, the Slon daemons only do the vacuum analyze in
> > their own schema so having multiple daemons running against the same
> > database should not be an issue, correct?
>
>Ah, good; that gives a second source for the conflict.  Almost all of the
>tables are in the cluster-specific schema, with the exception of
>pg_catalog.pg_listener.  I'll bet that's the one...
>
> > Should I look at disabling the auto-vacuum that SLON does and just make it
> > part of my nightly routine?
>
>Nope.  Slony-I needs plenty more vacuuming than once a day, particularly
>if you get a lot of data to replicate.
>
>In particular, one of the most conspicuous bottlenecks that pops up is that
>if pg_listener is not vacuumed frequently enough, replication slows down a
>lot.
>
>This is indeed pointing to the notion of doing something to diminish the
>likelihood of competing vacuums getting "in phase."

Ah, now things are much clearer.

>One thing you can do to cut down on the problem is to have the different
>slons have different frequencies of vacuums.  This is controlled by the
>"-c" option to slon which [alas] isn't available 'til 1.1.
>The parameter in 1.0.5 is in slon.h:
>
>slon.h:#define SLON_CLEANUP_SLEEP    600    /* sleep 10 minutes between */
>
>It is used in cleanup_thread.c:
>
>cleanup_thread.c:       while (sched_wait_time(conn, SCHED_WAIT_SOCK_READ,
>SLON_CLEANUP_SLEEP * 1000) == SCHED_STATUS_OK)
>
>I think the slick idea would be to modify this to add in a random value,
>probably one ranging between 60000 (60 seconds) and 160000 (160 seconds).
>Since two slons would likely have different values, they would usually
>stay out of phase, only going into phase once in a few dozen cleanups,
>which would be unlikely to "tickle" this issue.
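
A minimal sketch of the jitter idea quoted above, not the actual Slony-I
change. The helper names cleanup_jitter_seed() and cleanup_sleep_ms() and the
two CLEANUP_JITTER_* macros are hypothetical; only SLON_CLEANUP_SLEEP and its
value of 600 come from slon.h as quoted. It assumes a POSIX system (for
getpid()) and that a plain rand() is good enough for this purpose.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

#define SLON_CLEANUP_SLEEP     600     /* seconds, as quoted from slon.h (1.0.5) */

/* Hypothetical jitter bounds in milliseconds, taken from the suggestion above */
#define CLEANUP_JITTER_MIN_MS  60000
#define CLEANUP_JITTER_MAX_MS  160000

/*
 * Hypothetical helper: seed the generator once per process.  Mixing in the
 * pid makes it unlikely that two slons started in the same second draw the
 * same sequence of jitter values.
 */
void
cleanup_jitter_seed(void)
{
    srand((unsigned int) time(NULL) ^ (unsigned int) getpid());
}

/*
 * Hypothetical helper: base cleanup interval in milliseconds plus a random
 * jitter in [CLEANUP_JITTER_MIN_MS, CLEANUP_JITTER_MAX_MS].
 */
int
cleanup_sleep_ms(void)
{
    int     span = CLEANUP_JITTER_MAX_MS - CLEANUP_JITTER_MIN_MS + 1;
    int     jitter;

    assert(span > 0);

    /* Scale rand() into [0, span) without going through the modulo operator */
    jitter = (int) (((double) rand() / ((double) RAND_MAX + 1.0)) * span);

    return SLON_CLEANUP_SLEEP * 1000 + CLEANUP_JITTER_MIN_MS + jitter;
}
```

Under these assumptions, the loop condition in cleanup_thread.c would pass
cleanup_sleep_ms() in place of SLON_CLEANUP_SLEEP * 1000, with
cleanup_jitter_seed() called once at thread start. Each wait then lasts
between 660 and 760 seconds, so two slons drift relative to each other rather
than staying in lockstep.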

I think I shall do a watchdog rather than play with the source code. Slon 
works wonderfully, and if the watchdog will do me until 1.1 then I will be 
happy. (Unless 1.1 is a loooong time off all of a sudden.)

>This points me to making a couple of further changes to the vacuums:
>
>1.  Do them as separate transactions so that if one fails, the other work
>isn't wasted;
>
>2.  Try to detect your scenario and WARN instead of having a FATAL ERROR.
>
>This is "fair game" for 1.1; we're certainly trying to stabilize it for
>release, which means that new code ought to come in Real Soon.  This seems like "low
>hanging fruit" that should be pretty easy to add.  I have been looking at
>the relevant code lately, so it's pretty familiar...

That would be great, thanks.

--
|Tass Chapman, CCNA | tass at kenderhome.com
| http://www.playsanctum.com/
--

"All my friends and I are crazy.  That's the only thing that keeps us
sane."