[Slony1-general] Slony ignoring cleanup_interval?

Wed Sep 24 19:46:48 PDT 2014

On 09/24/2014 01:00 PM, Steve Singer wrote:
> On 09/24/2014 12:29 PM, Christopher Browne wrote:
>> On Fri, Sep 19, 2014 at 1:50 PM, Kristopher <kristopherwilson at gmail.com
>> <mailto:kristopherwilson at gmail.com>> wrote:
>>
>>     I have the following setup in my conf file:
>>
>>     cleanup_interval="5 seconds"
>>
>> ...
>>
>>     However, it only actually runs cleanup about every 10 minutes (the
>>     default):
>>
>>
>> We have, in fact, three parameters controlling cleanup:
>>
>> a) cleanup_interval, expressed as a Postgres interval, that, according
>> to the docs, indicates:
>>
>> gettext_noop("A PostgreSQL value compatible with ::interval "
>> "which indicates what aging interval should be used "
>> "for deleting old events, and hence for purging sl_log_* tables."),
>>
>> (see src/slon/confoptions.c for that; I expect that there's a slon
>> option that will make it print out the documentation strings).
>>
>> b) SLON_CLEANUP_SLEEP, in src/slon/slon.h, which is hardcoded to 600,
>> indicating that the cleanup thread is called every 600 seconds.
>>
>> c) SLON_VACUUM_FREQUENCY, also in src/slon/slon.h, hardcoded to 3,
>> indicating how often the cleanup thread should VACUUM tables.
>>
>> We haven't exposed SLON_CLEANUP_SLEEP as a configuration option, and, in
>> effect, that's a value you'd want to shorten a lot during this process.
>>
>> It wouldn't be a great deal of trouble to expose SLON_CLEANUP_SLEEP, and
>> it's probably somewhat handy to do so, particularly for situations such
>> as what you describe, where we want to avidly empty out sl_log_*.
>>
>> I'll see about coming up with a patch, with a view to applying this to
>> the various major releases.
>>
>
> My preference would be that we have 1 parameter in the config for
> controlling how often the cleanup thread does its stuff.
>
> I.e., make SLON_CLEANUP_SLEEP be controlled by the existing
> cleanup_interval field in the config. I don't understand why/when
> someone would want these two values to be different.

I think we introduced cleanup_interval (a name very easily misunderstood 
as meaning something different from what it actually does) as a safeguard 
against race conditions, where we feared removing replication log data 
too soon.

Unless we still fear such a race condition, we should get rid of that 
parameter entirely, treat anything confirmed by everyone as obsolete 
data, and just purge it.
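
To make the difference concrete, here is a minimal sketch in C of the two 
purge rules. It is not taken from the Slony-I source: struct event_model 
and all three functions are hypothetical names, and it assumes the 
current behaviour combines the "confirmed by everyone" check with the 
cleanup_interval age check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one replication event; not a Slony-I structure. */
struct event_model {
    long age_seconds;       /* time since the event was created */
    const bool *confirmed;  /* confirmed[i] is true once node i confirmed it */
    size_t node_count;      /* number of entries in confirmed[] */
};

/* True when every node in the model has confirmed the event. */
static bool confirmed_by_all(const struct event_model *ev)
{
    assert(ev != NULL);
    for (size_t i = 0; i < ev->node_count; i++) {
        if (!ev->confirmed[i])
            return false;
    }
    return true;
}

/* Assumed current rule: confirmed everywhere AND at least as old as
 * cleanup_interval (the safeguard discussed above). */
static bool purgeable_with_age_guard(const struct event_model *ev,
                                     long cleanup_interval_seconds)
{
    return confirmed_by_all(ev) &&
           ev->age_seconds >= cleanup_interval_seconds;
}

/* Proposed rule: confirmation by everyone is sufficient on its own. */
static bool purgeable_when_confirmed(const struct event_model *ev)
{
    return confirmed_by_all(ev);
}
```

Under the second rule a fully confirmed event is eligible immediately, 
so how soon it is actually removed depends only on how often the cleanup 
runs.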

The thing that really matters in this context is the cleanup sleep time, 
which determines how often the cleanup is actually done. Running it 
often enough relative to checkpoints can actually lead to a situation 
where sl_log heap and index data never gets written to disk. And that is 
a goal well worth aiming for.
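
As a rough illustration of why the sleep time is the value that matters, 
the sketch below counts how many cleanup passes fit into a given window. 
The function name is hypothetical, the duration of a pass itself is 
ignored, and the 300 second window in the usage note is an assumption 
based on PostgreSQL's default checkpoint_timeout of 5 minutes.

```c
#include <assert.h>

/* Value quoted in this thread as hardcoded in src/slon/slon.h. */
#define SLON_CLEANUP_SLEEP 600

/* Number of cleanup passes that complete within window_seconds when the
 * cleanup thread sleeps sleep_seconds between passes. Simplification:
 * the time spent inside a pass is treated as zero. */
static long cleanup_passes_in(long window_seconds, long sleep_seconds)
{
    assert(sleep_seconds > 0);
    return window_seconds / sleep_seconds;
}
```

With these assumptions, cleanup_passes_in(300, SLON_CLEANUP_SLEEP) is 0, 
so no cleanup pass is guaranteed to land between two checkpoints, while 
cleanup_passes_in(300, 5) is 60. Setting cleanup_interval="5 seconds" 
does not change the first number, because the sleep time is a separate 
compile-time constant.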

Jan

-- 
Jan Wieck
Senior Software Engineer
http://slony.info