Wed May 12 09:44:52 PDT 2010
- Previous message: [Slony1-hackers] [Slony1-general] An old event not confirmed: A possible bug?
- Next message: [Slony1-hackers] [Slony1-general] An old event not confirmed: A possible bug?
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Wed, May 12, 2010 at 10:56 AM, Jan Wieck <JanWieck at yahoo.com> wrote:

> On 5/12/2010 10:31 AM, Gurjeet Singh wrote:
>
>> Hi All,
>>
>> I have two Slony test beds which show the exact same symptoms!
>>
>> select * from sl_event order by ev_seqno;
>>
>>  ev_origin |  ev_seqno  |        ev_timestamp        |        ev_snapshot         | ev_type |
>> -----------+------------+----------------------------+----------------------------+---------+-
>>          2 | 5000000002 | 2010-04-30 08:32:38.622928 | 458:458:                   | SYNC    |
>>          1 | 5000525721 | 2010-05-12 13:30:22.79626  | 72685915:72685915:         | SYNC    |
>>          1 | 5000525722 | 2010-05-12 13:30:24.800943 | 72686139:72686139:         | SYNC    |
>>          1 | 5000525723 | 2010-05-12 13:30:26.804862 | 72686224:72686224:         | SYNC    |
>> ...
>>
>
> Slony always keeps at least the last event per origin around. Otherwise the
> view sl_status would not work.
>
> What should worry you is that there are no newer SYNC events from node 2
> available. Slony does create a sporadic SYNC every now and then even if
> there is no activity or the node isn't an origin anyway.
>
> Is it possible that node 2's clock is way off?
>

# ssh root@10.32.169.215 date; ssh root@10.32.169.216 date
Wed May 12 16:38:20 UTC 2010
Wed May 12 16:38:20 UTC 2010

Above are the times on the two nodes; 215 hosts the origin and 216 hosts the
subscriber. The clocks seem to be perfectly in sync.

I think I forgot to paste the test_slony_state.pl output before. This is
what raised the concern:

<snip>
Node: 2 Confirmations not propagating from 2 to 1
================================================
Confirmations not propagating quickly in sl_confirm -

For origin node 2, receiver node 1, earliest propagated
confirmation has age 12 days > 00:30:00

Are slons running for both nodes?

Could listen paths be missing so that confirmations are not propagating?

Node: 2 Events not propagating to node 2
================================================
Events not propagating quickly in sl_event -

For origin node 2, earliest propagated event of age 12 days 00:01:00 > 00:30:00

Are slons running for both nodes?

Could listen paths be missing so that events are not propagating?
</snip>

And the path and listen configs:

system.db=# select * from sl_path;
 pa_server | pa_client |                    pa_conninfo                    | pa_connretry
-----------+-----------+---------------------------------------------------+--------------
         2 |         1 | dbname=system.db host=10.32.169.216 user=postgres |           10
         1 |         2 | dbname=system.db host=10.32.169.215 user=postgres |           10
(2 rows)

system.db=# select * from sl_listen ;
 li_origin | li_provider | li_receiver
-----------+-------------+-------------
         2 |           2 |           1
         1 |           1 |           2
(2 rows)

Thanks and best regards,

>
>
> Jan
>
>> The reason I think this _might_ be a bug is that on both clusters, slave
>> node's sl_event has the exact same record for ev_seqno=5000000002 except for
>> the timestamp; same origin, and same snapshot!
>>
>> The head of sl_confirm has:
>>
>> select * from sl_confirm order by con_seqno;
>>
>>  con_origin | con_received | con_seqno  |       con_timestamp
>> ------------+--------------+------------+----------------------------
>>           2 |            1 | 5000000002 | 2010-04-30 08:32:53.974021
>>           1 |            2 | 5000527075 | 2010-05-12 14:15:41.192279
>>           1 |            2 | 5000527076 | 2010-05-12 14:15:43.193607
>>           1 |            2 | 5000527077 | 2010-05-12 14:15:45.196291
>>           1 |            2 | 5000527078 | 2010-05-12 14:15:47.197005
>> ...
>>
>> Can someone comment on the health of the cluster? All events, except for
>> that one, are being confirmed and purged from the system regularly, so my
>> assumption is that the cluster is healthy and that the slave is in sync with
>> the master.
>>
>> Thanks in advance.
>>
>
> --
> Anyone who trades liberty for security deserves neither
> liberty nor security. -- Benjamin Franklin
>

--
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.enterprisedb.com

singh.gurjeet@{ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device
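
Editorial sketch (not part of the original message): Jan's point is that one old row per origin in sl_event is normal, and that the real warning sign is an origin whose newest event is old. The Python below illustrates that check on the sl_event rows quoted in this thread. The function names, the assumed query time NOW, and the reuse of the 30-minute threshold printed by test_slony_state.pl are illustrative assumptions, not Slony code.

```python
from datetime import datetime, timedelta

# (ev_origin, ev_seqno, ev_timestamp) rows copied from the sl_event listing above.
SL_EVENT = [
    (2, 5000000002, "2010-04-30 08:32:38.622928"),
    (1, 5000525721, "2010-05-12 13:30:22.79626"),
    (1, 5000525722, "2010-05-12 13:30:24.800943"),
    (1, 5000525723, "2010-05-12 13:30:26.804862"),
]

# Assumption: the listing was taken a few seconds after its newest row.
NOW = datetime(2010, 5, 12, 13, 30, 30)

# Assumption: borrow the "> 00:30:00" limit shown in the test_slony_state.pl output.
MAX_AGE = timedelta(minutes=30)


def parse_ts(text):
    """Parse a timestamp as psql prints it; %f accepts 1 to 6 fractional digits."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")


def newest_event_per_origin(rows):
    """Return {origin: (seqno, timestamp)} for the highest ev_seqno of each origin."""
    newest = {}
    for origin, seqno, ts in rows:
        if origin not in newest or seqno > newest[origin][0]:
            newest[origin] = (seqno, parse_ts(ts))
    return newest


def stale_origins(rows, now, max_age=MAX_AGE):
    """List origins whose newest event is older than max_age at time `now`."""
    newest = newest_event_per_origin(rows)
    return sorted(o for o, (_, stamp) in newest.items() if now - stamp > max_age)


if __name__ == "__main__":
    for origin, (seqno, stamp) in sorted(newest_event_per_origin(SL_EVENT).items()):
        print(f"origin {origin}: newest ev_seqno {seqno}, age {NOW - stamp}")
    print("stale origins:", stale_origins(SL_EVENT, NOW))
```

With the rows above, only node 2 is reported as stale, which matches the concern that node 2 has produced no SYNC since 2010-04-30. On a live cluster the same comparison would be run against the actual sl_event table and the current time.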