Wed Aug 20 08:24:09 PDT 2008
- Previous message: [Slony1-general] Replication node suddenly lagging, CPU bound postmaster
- Next message: [Slony1-general] OID # does not exist
On Wednesday 20 August 2008, Benjamin Pineau <bpineau at elma.fr> wrote:
> Hi everyone.
>
> I have a replicating node that suddenly started to lag, on a 4-node
> Slony cluster that worked well for months. This node is powerful enough
> (i.e. older, slower machines on the cluster manage to keep up well).
> Network and block devices are mostly idling (with regard to
> interrupts/second and throughput). Strangely, the replication on this
> node seems CPU bound by the postmaster process doing the actual
> inserts/updates for slon (this postmaster process has been stuck at 99%
> CPU usage since the beginning of the problem).
> Neither Slony (at "slon -d2" level) nor PostgreSQL logged any warning or
> error message, and the replication did not stop on this node (it makes
> progress, but too slowly to keep up, so it's now 3 days behind master).
>
> Any clue?

Look at pg_stat_activity for the slon process on the slave. You'll
probably see a bunch of updates or deletes that look like they should be
finishing fast, but aren't.

When this happens here it's usually because the target table that's
causing problems needs an ANALYZE. For us it usually happens on the first
of the month, when the new month's data starts showing up and the planner
loses its mind and stops using the primary key to find the rows to
update or delete.

-- 
Alan
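
The check described above can be sketched roughly as follows. This is only an illustration, not something from the original thread: the helper name `slow_writes`, the 5-second threshold, and the shape of the row dictionaries are all assumptions. The SQL uses the PostgreSQL 8.x column names (`procpid`, `current_query`); on 9.2 and later they are `pid` and `query`. Running the query through a database driver is left out, so the helper only filters rows that have already been fetched.

```python
import re

# Query to run on the lagging subscriber. Column names are those of
# PostgreSQL 8.x; the role the slon daemon connects as is site-specific
# and is passed as a parameter (assumption: your driver uses %s placeholders).
ACTIVITY_SQL = """
SELECT procpid, usename, current_query,
       EXTRACT(EPOCH FROM now() - query_start) AS runtime_seconds
FROM pg_stat_activity
WHERE usename = %s
"""

# Matches the start of an UPDATE or DELETE and captures the target table,
# allowing the optional FROM and ONLY keywords.
_WRITE_RE = re.compile(
    r"^\s*(update|delete)\s+(?:from\s+)?(?:only\s+)?([\w.\"]+)",
    re.IGNORECASE,
)


def slow_writes(rows, min_seconds=5.0):
    """Return (table, runtime_seconds, query) tuples for UPDATE/DELETE
    statements that have been running at least min_seconds, slowest first.

    rows is an iterable of dicts with the keys 'current_query' and
    'runtime_seconds' (hypothetical shape, matching ACTIVITY_SQL).
    """
    hits = []
    for row in rows:
        query = row["current_query"] or ""
        runtime = row["runtime_seconds"]
        match = _WRITE_RE.match(query)
        if match and runtime is not None and runtime >= min_seconds:
            hits.append((match.group(2), runtime, query))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

Any table that keeps turning up in the result is a candidate for a manual `ANALYZE`, after which the planner should go back to using the primary key for those statements.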