Laurent Laborde kerdezixe at gmail.com
Tue Feb 10 04:51:35 PST 2009
Friendly greetings!


We have one master and many slaves.
All slaves have forward=true.

Slony: 1.2.15
PostgreSQL: 8.3.5

For some unknown reason, one of the slaves has old unconfirmed events.

master=# select * from _replication.sl_confirm order by con_timestamp limit 100;


 con_origin | con_received | con_seqno |       con_timestamp
------------+--------------+-----------+----------------------------
         17 |           23 |    121020 | 2009-02-03 12:05:21.170571
         17 |           21 |    121021 | 2009-02-03 12:05:31.172798
         17 |           24 |    121021 | 2009-02-03 12:05:31.220636
         17 |           25 |    121021 | 2009-02-03 12:05:31.398649
         17 |           12 |    121021 | 2009-02-03 12:05:31.462134
         17 |            2 |    121021 | 2009-02-03 12:05:31.53767
         17 |           27 |    121021 | 2009-02-03 12:05:33.104655
         17 |           26 |    121021 | 2009-02-03 12:05:33.191155
         17 |           16 |    121021 | 2009-02-03 12:05:35.335947
         17 |            5 |    121021 | 2009-02-03 12:05:46.437873
         16 |           25 |    851571 | 2009-02-10 13:33:38.684834
         16 |           24 |    851571 | 2009-02-10 13:33:38.754192
         16 |           22 |    851571 | 2009-02-10 13:33:38.780222
         16 |            2 |    851571 | 2009-02-10 13:33:38.78589
         16 |           12 |    851571 | 2009-02-10 13:33:38.786113
         16 |           17 |    851571 | 2009-02-10 13:33:38.787333
         16 |           27 |    851571 | 2009-02-10 13:33:38.820739
         16 |           26 |    851571 | 2009-02-10 13:33:38.827202
         16 |            5 |    851571 | 2009-02-10 13:33:38.83443
         16 |           23 |    851571 | 2009-02-10 13:33:38.896735
         16 |           21 |    851571 | 2009-02-10 13:33:38.90098

As you can see, there are some old unconfirmed events from 2009-02-03
12:05, and only for origin 17.
(17 is one of the slaves.)
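
A per-pair summary may be easier to read than the raw rows. This is only a sketch: it assumes the same _replication schema and the sl_confirm columns shown in the output above, and the aliases (last_seqno, last_confirmed, age) are just names picked for illustration.

```sql
-- Latest confirmation for each (origin, receiver) pair, oldest first.
-- Assumes the cluster schema is _replication, as in the query above.
SELECT con_origin,
       con_received,
       max(con_seqno)           AS last_seqno,      -- highest confirmed event
       max(con_timestamp)       AS last_confirmed,  -- when it was confirmed
       now() - max(con_timestamp) AS age            -- how stale the pair is
FROM _replication.sl_confirm
GROUP BY con_origin, con_received
ORDER BY age DESC;
```

If the pairs with origin 17 show a large age here as well, that should correspond to the age reported by check_slony_state.pl.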

I restarted Slony everywhere, including on node 17.
I also tried a cleanupEvent (which is run whenever a slon restarts anyway).
The old rows are still there...

I don't have any old events in sl_event, and replication isn't lagging.

But I receive these messages (check_slony_state.pl runs from crontab):

Node: 17 Confirmations not propagating from 17 to 12
================================================
Confirmations not propagating quickly in sl_confirm -

For origin node 17, receiver node 12, earliest propagated
confirmation has age 6 days 22:06:00 > 00:30:00

Are slons running for both nodes?

Could listen paths be missing so that confirmations are not propagating?

[...]

What can I do to solve this problem?

Thank you :)

-- 
F4FQM
Kerunix Flan
Laurent Laborde

