Jacques Caron jc at oxado.com
Wed Apr 2 08:30:30 PDT 2008
Hi Christopher,

At 17:37 31/03/2008, Christopher Browne wrote:
>The introduction of these partial indices dates back to the following
>discussion thread on pgsql-hackers:
>http://archives.postgresql.org/pgsql-hackers/2006-06/msg01516.php
>
>At one point, we had this as a second index:
>  create index sl_log_1_idx2 on @NAMESPACE at .sl_log_1
>         (log_xid @NAMESPACE at .xxid_ops);
>
>Unfortunately, it was apparently leading to problems in that data
>sourced from different origins might have xxid values of varying sign.
>
>So, in lieu of that, I introduced code that would generate a
>per-origin partial index, which would necessarily not suffer from the
>rollover problem that sl_log_1_idx2 would run into.
>
>It is quite likely that the partial indices will be preferred, as the
>first column in sl_log_1_idx1 doesn't discriminate much.

OK, so now I understand why there are partial indexes, but these 
partial indexes are redundant with quite a bit of the full index. The 
next logical step would be to remove the full index, however it seems 
there are some cases where it's still needed (i.e. queries with a 
where log_origin = node for which there is no partial index), at 
least some DELETEs (which my wild guess is that they actually always 
return 0 rows?).

So, possible options:
- add partial indexes for all possible origins (i.e. all nodes?) 
rather than just the ones that are actually origins? Those would most 
probably be completely empty most of the time.
- don't do any queries against nodes that are not really origins (but 
I guess this can become a bit complex in switchover/failover scenarios?)

The whole idea being of course to save a bit of I/O by avoiding the 
need to maintain duplicate indexes (always a good thing for 
heavily-updated tables when there's a bit of backlog).

Comments?

Jacques.



More information about the Slony1-general mailing list