[Slony1-general] Understanding Slony Cleanup

Thu Nov 22 10:13:18 PST 2007

I'm trying to understand what goes on during cleanupThread_main().

The code does this, in order:

1. deletes all outstanding log rows from both log tables

2. (eventually) truncates the log table

3. (allegedly) vacuums the log table 

I have a few questions about the above, for 1.2.x:

- The SQL function logswitch_finish() says it is called after the cleanup
thread has vacuumed both log tables, but the code in cleanupThread_main()
seems to avoid the vacuum. Is the SQL comment wrong, are we relying on
autovacuum to do this for us, or do we not vacuum at all?

- Why do we DELETE log table rows at all? We're doing it out-of-line so
it clearly isn't a necessary step for correctness (or is it?). At the
end of it all we TRUNCATE them anyway, so what was the point of all that
deletion? Or alternatively, why do we truncate and log switch at all?

- My understanding of the flip-flop design with 2 log tables was that it
would allow us to avoid VACUUM entirely, yet this doesn't seem to be the
case in 1.2. What purpose does the second log table serve?

- If we do have to DELETE, why do we do this to both log tables? Surely
changes will only be found in one? Or are we assuming that the query
will do a fast indexscan and return quickly, so why bother trying to
avoid it?

- We rely on vacuum_delay (i.e. vacuum_cost_delay) having been set
elsewhere. If we do have to do VACUUMs, then can we/should we force a
non-zero vacuum_cost_delay for the cleanup thread?

- We ANALYZE the log tables when they are empty following the TRUNCATE.
Is that done deliberately for some reason? At that point the data values
are not available, so later planning of SQL against the log tables is
going to be a little strange. Should we avoid re-ANALYZEing the log
table when we have just TRUNCATEd it?
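
To make the flip-flop question concrete, here is a toy model of how I
had understood the two-table design. It is only a sketch of my
assumption, not Slony's code: the class, its methods, and the use of
plain integers as log rows are all invented for illustration.

```python
class FlipFlopLog:
    """Toy model of two log tables used alternately (hypothetical names)."""

    def __init__(self):
        self.tables = [[], []]  # stand-ins for the two log tables
        self.active = 0         # index of the table receiving new rows

    def log(self, seqno):
        # New changes always go to the currently active table.
        self.tables[self.active].append(seqno)

    def switch(self):
        # Start writing to the other table; the old one only drains.
        self.active = 1 - self.active

    def finish_switch(self, confirmed):
        # Once every row in the inactive table has been confirmed by
        # all subscribers, empty it in one step (the TRUNCATE).
        inactive = 1 - self.active
        if all(seqno <= confirmed for seqno in self.tables[inactive]):
            self.tables[inactive].clear()
            return True
        return False


log = FlipFlopLog()
log.log(1)
log.log(2)
log.switch()
log.log(3)
blocked = log.finish_switch(confirmed=1)  # row 2 is still outstanding
done = log.finish_switch(confirmed=2)     # old table can now be emptied
```

In this model there is no per-row DELETE and nothing left behind to
VACUUM, which is why the DELETE and VACUUM steps in 1.2 puzzle me.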

Can anyone shed any light on these things? Thanks.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com