Mon Mar 27 14:43:05 PST 2006
- Previous message: [Slony1-commit] By cbbrowne: test scripts to analyze cluster status had trouble,
- Next message: [Slony1-commit] By cbbrowne: Reorganized FAQ into multiple <qandadiv> divisions
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Log Message:
-----------
Update installation guide, indicating caveats surrounding whether you
should use source or binaries

Updated maintenance discussion, which also touches on monitoring

Modified Files:
--------------
    slony1-engine/doc/adminguide:
        installation.sgml (r1.21 -> r1.22)
        maintenance.sgml (r1.16 -> r1.17)

-------------- next part --------------
Index: maintenance.sgml
===================================================================
RCS file: /usr/local/cvsroot/slony1/slony1-engine/doc/adminguide/maintenance.sgml,v
retrieving revision 1.16
retrieving revision 1.17
diff -Ldoc/adminguide/maintenance.sgml -Ldoc/adminguide/maintenance.sgml -u -w -r1.16 -r1.17
--- doc/adminguide/maintenance.sgml
+++ doc/adminguide/maintenance.sgml
@@ -86,6 +86,39 @@
 regularly, this script won't bother doing anything.</para>
 </sect2>
 
+<sect2><title>Testing &slony1; State </title>
+
+<para> In the <filename>tools</filename> directory, you may find
+scripts called <filename>test_slony_state.pl</filename> and
+<filename>test_slony_state-dbi.pl</filename>. One uses the Perl/DBI
+interface; the other uses the Pg interface.
+</para>
+
+<para> Both do essentially the same thing, namely to connect to a
+&slony1; node (you can pick any one), and from that, determine all the
+nodes in the cluster. They then run a series of queries (read only,
+so this should be quite safe to run) which look at the various
+&slony1; tables, looking for a variety of sorts of conditions
+suggestive of problems, including:
+</para>
+
+<itemizedlist>
+<listitem><para> Bloating of tables like pg_listener, sl_log_1, sl_log_2, sl_seqlog </para></listitem>
+<listitem><para> Listen paths </para></listitem>
+<listitem><para> Analysis of Event propagation </para></listitem>
+<listitem><para> Analysis of Event confirmation propagation </para>
+
+<para> If communications is a <emphasis>little</emphasis> broken,
+replication may happen, but confirmations may not get back, which
+prevents nodes from clearing out old events and old replication
+data. </para> </listitem>
+</itemizedlist>
+
+<para> Running this once an hour or once a day can help you detect
+symptoms of problems early, before they lead to performance
+degradation. </para>
+</sect2>
+
 <sect2><title>Replication Test Scripts </title>
 
 <para> In the directory <filename>tools</filename> may be found four
@@ -132,8 +165,8 @@
 configuration in this file to connect to all those
 clusters.</para></listitem>
 
-<listitem><para><command>nagios_slony_test</command> is a script
-that was constructed to query the log files so that you might run the
+<listitem><para><command>nagios_slony_test</command> is a script that
+was constructed to query the log files so that you might run the
 replication tests every so often (we run them every 6 minutes), and
 then a system monitoring tool such as <ulink
 url="http://www.nagios.org/"> <productname>Nagios</productname>
@@ -149,9 +182,50 @@
 again.</para></listitem>
 
 </itemizedlist></para>
-
 </sect2>
 
+<sect2><title> Other Replication Tests </title>
+
+<para> The methodology of the previous section is designed with a view
+to minimizing the cost of submitting replication test queries; on a
+busy cluster, supporting hundreds of users, the cost associated with
+running a few queries is likely to be pretty irrelevant, and the setup
+cost to configure the tables and data injectors is pretty high.</para>
+
+<para> Three other methods for analyzing the state of replication have
+stood out:
+
+<itemizedlist>
+
+<listitem><para> For an application-oriented test, it has been useful
+to set up a view on some frequently updated table that pulls
+application-specific information. </para>
+
+<para> For instance, one might look either at some statistics about a
+most recently created application object, or an application
+transaction. For instance:</para>
+
+<para> <command> create view replication_test as select now() -
+txn_time as age, object_name from transaction_table order by txn_time
+desc limit 1; </command> </para>
+
+<para> <command> create view replication_test as select now() -
+created_on as age, object_name from object_table order by id desc
+limit 1; </command> </para>
+
+</listitem>
+
+<listitem><para> The &slony1;-defined view, <envar>sl_status</envar>
+provides information as to how up to date different nodes are. Its
+contents are only really interesting on origin nodes, as the events
+generated on other nodes are generally ignorable. </para>
+</listitem>
+
+<listitem><para> See also the <xref linkend="slonymrtg">
+discussion. </para></listitem>
+
+</itemizedlist>
+
 <sect2><title> Log Files</title>
 
 <para><xref linkend="slon"> daemons generate some more-or-less verbose
@@ -160,11 +234,13 @@
 
 <itemizedlist>
 
-<listitem><para> Use a log rotator like <productname>Apache</productname>
-<application>rotatelogs</application> to have a sequence of log files so that no
-one of them gets too big;</para></listitem>
+<listitem><para> Use a log rotator like
+<productname>Apache</productname>
+<application>rotatelogs</application> to have a sequence of log files
+so that no one of them gets too big;</para></listitem>
 
-<listitem><para> Purge out old log files, periodically.</para></listitem>
+<listitem><para> Purge out old log files,
+periodically.</para></listitem>
 
 </itemizedlist>
 </para>
Index: installation.sgml
===================================================================
RCS file: /usr/local/cvsroot/slony1/slony1-engine/doc/adminguide/installation.sgml,v
retrieving revision 1.21
retrieving revision 1.22
diff -Ldoc/adminguide/installation.sgml -Ldoc/adminguide/installation.sgml -u -w -r1.21 -r1.22
--- doc/adminguide/installation.sgml
+++ doc/adminguide/installation.sgml
@@ -2,11 +2,25 @@
 <sect1 id="installation">
 <title>&slony1; Installation</title>
 
-<para>Note for &windows; users: Unless you are planning on hacking the
-&slony1; code, it is highly recommended that you download and install
-a prebuilt binary distribution and jump straight to the configuration
-section below.
+<note> <para>For &windows; users: Unless you are planning on hacking
+the &slony1; code, it is highly recommended that you download and
+install a prebuilt binary distribution and jump straight to the
+configuration section below. There are likely to be links and/or
+binaries at <ulink url="http://pgfoundry.org/projects/slony1/">
+pgFoundry &slony1; site </ulink> for official releases, the first of
+which is expected to be &slony1; version 1.2.0. </para>
+
+<para> There are also RPM binaries available at that site for recent
+versions of &slony1; for recent versions of &postgres;. </para>
+</note>
+
+<warning><para> If you need &slony1; to do an upgrade from some
+elderly version of &postgres; to a newer version, or if you need a
+late-breaking CVS version, outside the context of a major release,
+then be prepared to need to build both &postgres; and &slony1; from
+source. The remainder of this section assumes this...</para>
+</warning>
 
 <para>You should have obtained the &slony1; source from the previous
 step. Unpack it.</para>
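
Editorial sketch, not part of the commit: the replication_test views added
to maintenance.sgml report an "age" interval, and the same file mentions
feeding replication tests to Nagios. One way the two might be combined is
shown below in Python. The function name classify_age and both thresholds
are hypothetical choices made for illustration; only the exit codes follow
the usual Nagios plugin convention. Fetching the row from the subscriber
node is left out, so the function takes the age as a timedelta (or None
when the view returned no row).

```python
from datetime import timedelta

# Nagios plugin exit codes (standard plugin convention).
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify_age(age, warn=timedelta(minutes=10), crit=timedelta(minutes=30)):
    """Map the `age` column of replication_test to a Nagios status.

    `age` is a timedelta, or None if the view returned no row.
    The default thresholds are illustrative assumptions, not values
    taken from the Slony-I documentation.
    """
    if age is None:
        return UNKNOWN, "UNKNOWN: replication_test returned no row"
    if age >= crit:
        return CRITICAL, f"CRITICAL: newest replicated row is {age} old"
    if age >= warn:
        return WARNING, f"WARNING: newest replicated row is {age} old"
    return OK, f"OK: newest replicated row is {age} old"
```

A wrapper script would query the view on a subscriber, pass the result to
classify_age, print the message, and exit with the returned code; the
thresholds should be tuned to how often the underlying table is updated.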