Tue Jun 1 16:24:05 PDT 2010
- Previous message: [Slony1-general] Replication Hung
- Next message: [Slony1-general] slony archive log monitoring questions.
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hi all,
I have a two-node Slony cluster, and the slon daemon on the slave node
runs with the -a option. I'm trying to better understand how the logs
work for Slony log shipping, and have noticed a bit of "odd" behavior
(perhaps just odd to me) that perhaps someone can explain. It's not just
the log shipping that I'd like to understand better, but also how the
Slony slave and master communicate with each other to make sure that the
slave is in sync with the master and receives ALL the data it needs
without missing any.
Each of the archive logs created by the Slony slave with the -a option
looks something like this:
---------------------------------------
------------------------------------------------------------------
-- Slony-I log shipping archive
-- Node 1, Event 555
------------------------------------------------------------------
start transaction;
select "_slony".archiveTracking_offline('22', '2010-06-01 12:53:11.214886');
-- end of log archiving header
------------------------------------------------------------------
-- start of Slony-I data
------------------------------------------------------------------
select "_slony".sequenceSetValue_offline(1,'38');
------------------------------------------------------------------
-- End Of Archive Log
------------------------------------------------------------------
commit;
vacuum analyze "_slony".sl_archive_tracking;
---------------------------------------
Obviously, this log has no actual replication data in it, but two things
stand out. The first is the Event number; the second is the number
passed to archiveTracking_offline (which matches the number in the name
of this particular file, slony1_log_2_00000000000000000022.sql).
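To compare those numbers across many files, they can be pulled out programmatically. The following is a minimal sketch, assuming every archive follows the layout of the sample above; the function name `parse_archive` and the regular expressions are illustrative and not part of Slony.

```python
import re

# Patterns are based only on the sample archive shown above; other Slony-I
# versions may format these lines differently (assumption).
EVENT_RE = re.compile(r"^-- Node (\d+), Event (\d+)", re.MULTILINE)
TRACKING_RE = re.compile(r"archiveTracking_offline\('(\d+)'")
FILENAME_RE = re.compile(r"slony1_log_(\d+)_(\d+)\.sql$")


def parse_archive(filename, text):
    """Return the counters found in one archive file, or raise ValueError."""
    name = FILENAME_RE.search(filename)
    event = EVENT_RE.search(text)
    tracking = TRACKING_RE.search(text)
    if not (name and event and tracking):
        raise ValueError(f"unrecognized archive: {filename}")
    return {
        "file_counter": int(name.group(2)),          # number in the file name
        "origin_node": int(event.group(1)),          # "Node N" in the header
        "event": int(event.group(2)),                # "Event N" in the header
        "tracking_counter": int(tracking.group(1)),  # archiveTracking_offline argument
    }
```

For the sample file, `file_counter` and `tracking_counter` would both be 22 and `event` would be 555.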
For Slony log shipping to work, I understand that each of the log files
is required, in order. Looking through the actual files, though, I've
noticed that the Event numbers sometimes skip one number or several.
For example:
---------------------------------------
# cat /path/to/slon_archive_logs/* | grep Event
-- Node 1, Event 554
-- Node 1, Event 555
-- Node 1, Event 556
-- Node 1, Event 558
-- Node 1, Event 559
---------------------------------------
Here event 557 is missing, yet the numbering of the archive logs is
unbroken: each log appears with the expected number in its name. This
happens often. I originally thought that the log containing the previous
event (in this case 556) held the data for both events, so that 557
would simply not appear. I've looked through every single archive log,
and the event does not appear in any of them, nor does it show up later
on.
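The same gap check can be applied to either series of numbers, the Event numbers from the headers or the counters from the file names. A minimal sketch (the helper name `find_gaps` is hypothetical):

```python
def find_gaps(numbers):
    """Return the integers missing between the smallest and largest value."""
    seen = set(numbers)
    if not seen:
        return []
    return [n for n in range(min(seen), max(seen) + 1) if n not in seen]


# Event numbers taken from the grep output above.
events = [554, 555, 556, 558, 559]
print(find_gaps(events))  # [557]
```

Running it on the file-name counters instead would, per the observation above, return an empty list for this set of archives.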
I then thought that this event might not be a SYNC, but rather something
else that wouldn't make it into these archive logs (this might still be
the case). However, shortly after seeing that this event was missing, I
took a look at the sl_event table and saw that the event is indeed there
and is indeed a SYNC:
---------------------------------------
postgres=# select ev_seqno, ev_type from _slony.sl_event where ev_origin = '1';
 ev_seqno | ev_type
----------+---------
      554 | SYNC
      555 | SYNC
      556 | SYNC
      557 | SYNC
      558 | SYNC
      559 | SYNC
---------------------------------------
Basically, what I want to do is write a little script that will alert me
via email if something goes wonky with Slony replication. I recently had
an incident where data was missing from the Slony slave. Everything I
could find showed that the data was never replicated, yet Slony never
reported any errors (this is in a recent email to this mailing list). It
would be easy to raise an alert that says "woah event 557 was not
found", but if it is normal behavior for events to be missing like this,
then that wouldn't be a good approach to take.
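As a rough illustration of the alerting step only, the sketch below builds an email from a list of missing numbers. It assumes the caller has already decided which numbers count as missing (for instance file-name counters rather than Event numbers); the function name `build_alert` and the addresses are placeholders.

```python
from email.message import EmailMessage


def build_alert(missing_counters, sender, recipient):
    """Return an EmailMessage describing the gaps, or None if there are none."""
    if not missing_counters:
        return None
    msg = EmailMessage()
    msg["Subject"] = f"Slony archive gap: {len(missing_counters)} file(s) missing"
    msg["From"] = sender
    msg["To"] = recipient
    listing = ", ".join(str(n) for n in missing_counters)
    msg.set_content("Missing archive file counters: " + listing)
    return msg
```

Delivery is left out on purpose. With a local mail server available, something like `smtplib.SMTP("localhost").send_message(msg)` could send the result, but the host name there is an assumption.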
I've seen that there are a couple of Slony monitoring tools, and I'll be
checking them out to see if they offer anything I could use. But any
other suggestions, or even some clarification as to how some of this
works, would be greatly appreciated. If my Slony slave ends up missing
data again, I'd like to have some logs to review to see when something
may have gone wrong, even if I have to generate these logs myself with
some sort of monitoring.
Thanks in advance,
Brian F