[Slony1-general] Feature Idea: improve performance when a large sl_log backlog exists

Thu Nov 25 08:50:54 PST 2010

On 11/23/2010 12:49 PM, Steve Singer wrote:
> On 10-11-23 11:53 AM, Christopher Browne wrote:
>>
>>  I'm not sure that we gain much by splitting the logs into a bunch of
>>  pieces for that case.
>>
>>  It's still the same huge backlog, and until it gets worked down, it's
>>  bloated, period.
>
> My concern is with the performance on the master (and any other slaves)
> caused by the bloat, not the slave being populated.
>
> Right now sl_log_1 can grow to be gigabytes on the master.  Each time
> the sync thread on the master needs to select from sl_log_1 to generate
> a sync it queries a table that is gigabytes in size.  Similarly when any
> other slaves need to select from sl_log_1 against the master they need
> to query a table that is gigabytes.

Since when does the SYNC generation SELECT from sl_log_x? It should only 
INSERT into sl_event and be done with it. Only the direct subscribers 
should query sl_log_x on the master.

> With a growable pool of sl_log tables, my thinking is that once
> all transactions using sl_log_x have committed we can do a final sync
> against sl_log_x and then the sync thread no longer needs to access it.
> Similarly, other subscribers that are querying the master could be made
> smart enough to not need to look at sl_log segments for which they have
> already received all of the data.
>
>
> This way the bloat caused by backlog won't affect the performance of
> normal (up-to-date) syncs.

But as you noticed in another mail already, this would complicate the 
log selection union. And in more ways than may be obvious.
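
To make the segment-skipping idea concrete, here is a minimal sketch of how
a subscriber might decide which log segments still belong in its selection
union. It is only an illustration under simplifying assumptions: the
LogSegment type, the segments_to_query and build_union helpers, and the
single monotonically increasing "seq" column are all hypothetical. The
sketch orders rows by one sequence number and ignores in-progress
transactions, so it sidesteps exactly the part that makes the real union
harder.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LogSegment:
    """Hypothetical descriptor for one table in a growable log pool."""
    name: str                # table name, e.g. "sl_log_1"
    min_seq: int             # lowest sequence number stored in the segment
    max_seq: Optional[int]   # highest sequence once closed; None while open


def segments_to_query(segments: List[LogSegment], confirmed_seq: int) -> List[str]:
    """Return the segments a subscriber still has to read.

    A closed segment whose highest sequence is at or below the subscriber's
    confirmed position has been fully received and can be skipped.
    """
    needed = []
    for seg in segments:
        fully_received = seg.max_seq is not None and seg.max_seq <= confirmed_seq
        if not fully_received:
            needed.append(seg.name)
    return needed


def build_union(names: List[str], confirmed_seq: int) -> str:
    """Assemble a UNION ALL over the remaining segments.

    "seq" is an assumed column name used only for this sketch.
    """
    if not names:
        return ""
    parts = [f"SELECT * FROM {n} WHERE seq > {confirmed_seq}" for n in names]
    return " UNION ALL ".join(parts) + " ORDER BY seq"


if __name__ == "__main__":
    pool = [
        LogSegment("sl_log_1", 1, 100),
        LogSegment("sl_log_2", 101, 200),
        LogSegment("sl_log_3", 201, None),   # still open for writes
    ]
    remaining = segments_to_query(pool, confirmed_seq=150)
    print(remaining)
    print(build_union(remaining, 150))
```

An up-to-date subscriber would end up querying only the open segment, while
a lagging one would still pull in every segment it has not finished. The
union therefore differs per subscriber and changes as segments close.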

Jan

-- 
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin