Thu Jul 30 04:26:56 PDT 2009
- Previous message: [Slony1-general] sl_status incorrectly reports long event lag
- Next message: [Slony1-general] Helpful admin Tools referral : scripts manager
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Sorry, but it seems I was wrong when I said that the problem was fixed. I assumed that since the event lag returned to zero when new paths were stored per Christopher's direction, the problem was corrected. However, the problem persists.

When I restart a slave database (node 2 in the following example), replication works fine (at least as far as can be immediately observed), but sl_status shows:

1;3;38689;"2009-07-30 12:11:51.796";38688;"2009-07-30 12:12:02.428316";"2009-07-30 12:11:41.859";1;"00:00:14.015"
1;2;38689;"2009-07-30 12:11:51.796";38605;"2009-07-30 11:52:35.119048";"2009-07-30 11:58:05.734";84;"00:13:50.14"

Node 2's event lag grows and grows (until the slon service is restarted, at which time it returns to zero, just as before). When I run test_slony_state-dbi.pl while the event lag continues to grow, it outputs the following:

peter@peter-development-machine:~/slony1-2.0.2/tools> ./test_slony_state-dbi.pl --host=10.0.0.80 --database=lustre --cluster=lustre_cluster --user=postgres --password=my_password
DSN: dbi:Pg:dbname=lustre;host=10.0.0.80;user=postgres;password=my_password;
===========================
Rummage for DSNs
=============================
Query:
select p.pa_server, p.pa_conninfo from "_lustre_cluster".sl_path p
-- where exists (select * from "_lustre_cluster".sl_subscribe s where
-- (s.sub_provider = p.pa_server or s.sub_receiver = p.pa_server) and
-- sub_active = 't')
group by pa_server, pa_conninfo;

Tests for node 1 - DSN = dbi:Pg:dbname=lustre host=10.0.0.80 user=postgres password=my_password
========================================
pg_listener info:
Pages: 0
Tuples: 0

Size Tests
================================================
sl_log_1   0  0.000000
sl_log_2   0  0.000000
sl_seqlog  0  0.000000

Listen Path Analysis
===================================================
No problems found with sl_listen

--------------------------------------------------------------------------------
Summary of event info
Origin  Min SYNC  Max SYNC  Min SYNC Age  Max SYNC Age
================================================================================
1       38605     38699     00:00:00      00:15:00      0
2       20        20        01:08:00      01:08:00      1
3       30        30        01:02:00      01:02:00      1

---------------------------------------------------------------------------------
Summary of sl_confirm aging
Origin  Receiver  Min SYNC  Max SYNC  Age of latest SYNC  Age of eldest SYNC
=================================================================================
1       2         38605     38605     00:20:00            00:20:00            0
1       3         38627     38698     00:00:00            00:11:00            0
2       1         20        20        01:03:00            01:03:00            1
2       3         20        20        01:02:00            01:02:00            1
3       1         30        30        01:02:00            01:02:00            1
3       2         30        30        01:08:00            01:08:00            1

------------------------------------------------------------------------------
Listing of old open connections on node 1
Database  PID  User  Query Age  Query
================================================================================

Tests for node 3 - DSN = dbi:Pg:dbname=lustre_slave host=10.0.0.82 user=postgres password=my_password
========================================
pg_listener info:
Pages: 0
Tuples: 0

Size Tests
================================================
sl_log_1   0  0.000000
sl_log_2   0  0.000000
sl_seqlog  0  0.000000

Listen Path Analysis
===================================================
No problems found with sl_listen

--------------------------------------------------------------------------------
Summary of event info
Origin  Min SYNC  Max SYNC  Min SYNC Age  Max SYNC Age
================================================================================
1       38605     38699     00:00:00      00:15:00      0
2       20        20        01:08:00      01:08:00      1
3       30        30        01:02:00      01:02:00      1

---------------------------------------------------------------------------------
Summary of sl_confirm aging
Origin  Receiver  Min SYNC  Max SYNC  Age of latest SYNC  Age of eldest SYNC
=================================================================================
1       2         38605     38605     00:21:00            00:21:00            0
1       3         38629     38699     00:00:00            00:11:00            0
2       1         20        20        01:03:00            01:03:00            1
2       3         20        20        01:03:00            01:03:00            1
3       1         30        30        01:03:00            01:03:00            1
3       2         30        30        01:08:00            01:08:00            1

------------------------------------------------------------------------------
Listing of old open connections on node 3
Database  PID  User  Query Age  Query
================================================================================

Tests for node 2 - DSN = dbi:Pg:dbname=lustre_slave host=10.0.0.81 user=postgres password=my_password
========================================
pg_listener info:
Pages: 0
Tuples: 0

Size Tests
================================================
sl_log_1   0  0.000000
sl_log_2   0  0.000000
sl_seqlog  0  0.000000

Listen Path Analysis
===================================================
No problems found with sl_listen

--------------------------------------------------------------------------------
Summary of event info
Origin  Min SYNC  Max SYNC  Min SYNC Age  Max SYNC Age
================================================================================
1       38573     38699     -00:05:00     00:15:00      0
2       20        21        00:15:00      01:03:00      0
3       30        30        00:57:00      00:57:00      1

---------------------------------------------------------------------------------
Summary of sl_confirm aging
Origin  Receiver  Min SYNC  Max SYNC  Age of latest SYNC  Age of eldest SYNC
=================================================================================
1       2         38607     38699     00:00:00            00:15:00            0
1       3         38573     38698     -00:05:00           00:15:00            0
2       1         20        20        00:57:00            00:57:00            1
2       3         20        20        00:57:00            00:57:00            1
3       1         30        30        00:57:00            00:57:00            1
3       2         30        30        01:02:00            01:02:00            1

------------------------------------------------------------------------------
Listing of old open connections on node 2
Database  PID  User  Query Age  Query
================================================================================
peter@peter-development-machine:~/slony1-2.0.2/tools>

Any further help you could offer is greatly appreciated.

Regards,
Peter Geoghegan
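
For anyone who wants to cross-check the two sl_status rows quoted in this message, the following is a minimal Python sketch that parses them and flags the receiver whose lag looks abnormal. It is not part of Slony-I or of test_slony_state-dbi.pl. The column names are an assumption based on the usual definition of the sl_status view (the message itself shows only the values), and the helper names and thresholds (max_events, max_lag) are hypothetical values chosen for illustration.

```python
import csv
import io
from datetime import timedelta

# Assumed column order of the sl_status view (not stated in the message).
COLUMNS = [
    "st_origin", "st_received", "st_last_event", "st_last_event_ts",
    "st_last_received", "st_last_received_ts",
    "st_last_received_event_ts", "st_lag_num_events", "st_lag_time",
]
INT_COLUMNS = ("st_origin", "st_received", "st_last_event",
               "st_last_received", "st_lag_num_events")


def parse_interval(text):
    """Convert an 'HH:MM:SS[.fff]' string into a timedelta."""
    hours, minutes, seconds = text.split(":")
    return timedelta(hours=int(hours), minutes=int(minutes),
                     seconds=float(seconds))


def parse_status_rows(raw):
    """Parse semicolon-delimited sl_status rows into dictionaries."""
    reader = csv.reader(io.StringIO(raw), delimiter=";", quotechar='"')
    rows = []
    for fields in reader:
        if not fields:
            continue  # skip blank lines
        row = dict(zip(COLUMNS, fields))
        for key in INT_COLUMNS:
            row[key] = int(row[key])
        row["st_lag_time"] = parse_interval(row["st_lag_time"])
        rows.append(row)
    return rows


def lagging_receivers(rows, max_events=10, max_lag=timedelta(minutes=5)):
    """Return receiver node ids whose lag exceeds either (hypothetical) threshold."""
    return [r["st_received"] for r in rows
            if r["st_lag_num_events"] > max_events
            or r["st_lag_time"] > max_lag]


# The two rows exactly as quoted in the message.
SAMPLE = "\n".join([
    '1;3;38689;"2009-07-30 12:11:51.796";38688;"2009-07-30 12:12:02.428316";"2009-07-30 12:11:41.859";1;"00:00:14.015"',
    '1;2;38689;"2009-07-30 12:11:51.796";38605;"2009-07-30 11:52:35.119048";"2009-07-30 11:58:05.734";84;"00:13:50.14"',
])

rows = parse_status_rows(SAMPLE)
for r in rows:
    # The reported event lag should equal last event minus last confirmed event.
    computed = r["st_last_event"] - r["st_last_received"]
    print(r["st_received"], r["st_lag_num_events"], computed, r["st_lag_time"])
print("lagging receivers:", lagging_receivers(rows))
```

With the rows above, the reported event lag for each receiver matches the difference between the last event generated on the origin and the last event confirmed by that receiver, so the view is at least internally consistent with the confirmations it has seen for node 2.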