- Table of Contents
- 5.1. Slony-I Administration Scripts
- 5.1.1. altperl Scripts
- 5.1.2. mkslonconf.sh
- 5.1.3. start_slon.sh
- 5.1.4. launch_clusters.sh
- 5.1.5. slony1_extract_schema.sh
- 5.1.6. slony-cluster-analysis
- 5.1.7. Generating slonik scripts using configure-replication.sh
- 5.1.8. slon.in-profiles
- 5.1.9. duplicate-node.sh
- 5.1.10. slonikconfdump.sh
- 5.1.11. Parallel to Watchdog: generate_syncs.sh
- 5.2. Frequently Asked Questions
- 5.3. Release Checklist
- 5.4. More Slony-I Help
A number of tools have grown over the course of the history of Slony-I to help users manage their clusters. This section along with the ones on Section 4.3 discusses them.
There is a set of scripts to simplify administeration of set of Slony-I instances. The scripts support having arbitrary numbers of nodes. They may be installed as part of the installation process:
This will produce a number of scripts with the prefix slonik_. They eliminate tedium by always referring to a central configuration file for the details of your site configuration. A documented sample of this file is provided in altperl/slon_tools.conf-sample. Most also include some command line help with the "--help" option, making them easier to learn and use.
Most generate Slonik scripts that are printed to STDOUT. At one time, the commands were passed directly to slonik for execution. Unfortunately, this turned out to be a pretty large calibre "foot gun", as minor typos on the command line led, on a couple of occasions, to pretty calamitous actions. The savvy administrator should review the script before piping it to slonik.
The UNIX environment variable SLONYNODES is used to determine what Perl configuration file will be used to control the shape of the nodes in a Slony-I cluster. If it is not provided, a default slon_tools.conf location will be referenced.
What variables are set up.
$CLUSTER_NAME=orglogs; # What is the name of the replication cluster?
$LOGDIR='/opt/OXRS/log/LOGDBS'; # What is the base directory for logs?
$APACHE_ROTATOR="/opt/twcsds004/OXRS/apache/rotatelogs"; # If set, where to find Apache log rotator
foldCase # If set to 1, object names (including schema names) will be folded to lower case. By default, your object names will be left alone. Note that PostgreSQL itself folds object names to lower case; if you create a table via the command CREATE TABLE SOME_THING (Id INTEGER, STudlYName text);, the result will be that all of those components are forced to lower case, thus equivalent to create table some_thing (id integer, studlyname text);, and the name of table and, in this case, the fields will all, in fact, be lower case.
You then define the set of nodes that are to be replicated
using a set of calls to
add_node (host => '10.20.30.40', dbname => 'orglogs', port => 5437, user => 'postgres', node => 4, parent => 1);
The set of parameters for
add_node() are thus:
my %PARAMS = (host=> undef, # Host name dbname => 'template1', # database name port => 5432, # Port number user => 'postgres', # user to connect as node => undef, # node number password => undef, # password for user parent => 1, # which node is parent to this node noforward => undef # shall this node be set up to forward results? sslmode => undef # SSL mode argument - determine # priority of SSL usage # = disable,allow,prefer,require );
The UNIX environment variable SLONYSET is used to determine what Perl configuration file will be used to determine what objects will be contained in a particular replication set.
Unlike SLONYNODES, which is essential for all of the slonik-generating scripts, this only needs to be set when running create_set, as that is the only script used to control what tables will be in a particular replication set.
Queries a database, generating output hopefully suitable for slon_tools.conf consisting of:
a set of
add_node()calls to configure the cluster
The arrays @KEYEDTABLES, nvar>@SERIALTnvar>, and @SEQUENCES
This generates just the "preamble" that is required by all slonik scripts. In effect, this provides a "skeleton" slonik script that does not do anything.
This requires SLONYSET to be set as well as SLONYNODES; it is used to generate the slonik script to set up a replication set consisting of a set of tables and sequences that are to be replicated.
Generates Slonik script to drop a node from a Slony-I cluster.
Generates Slonik script to drop a replication set (e.g. - set of tables and sequences) from a Slony-I cluster.
This represents a pretty big potential "foot gun" as this eliminates a replication set all at once. A typo that points it to the wrong set could be rather damaging. Compare to Section 126.96.36.199 and Section 188.8.131.52; with both of those, attempting to drop a subscription or a node that is vital to your operations will be blocked (via a foreign key constraint violation) if there exists a downstream subscriber that would be adversely affected. In contrast, there will be no warnings or errors if you drop a set; the set will simply disappear from replication.
Generates Slonik script to drop a table from replication. Requires, as input, the ID number of the table (available from table sl_table) that is to be dropped.
Generates Slonik script to drop a sequence from replication. Requires, as input, the ID number of the table (available from table sl_table) that is to be dropped, and the ID of the set to which it is attached.
Generates Slonik script to push DDL changes to a replication set.
Generates Slonik script to request failover from a dead node to some new origin
Generates Slonik script to initialize a whole Slony-I cluster, including setting up the nodes, communications paths, and the listener routing.
Generates Slonik script to merge two replication sets together.
Generates Slonik script to move the origin of a particular set to a different node.
Script to test whether Slony-I is successfully replicating data.
Generates Slonik script to request the restart of a node. This was particularly useful pre-1.0.5 when nodes could get snarled up when slon daemons died.
Generates Slonik script to restart all nodes in the cluster. Not particularly useful.
Displays an overview of how the environment (e.g. - SLONYNODES) is set to configure things.
Kills slony watchdog and all slon daemons for the specified set. It only works if those processes are running on the local host, of course!
This starts a slon daemon for the specified cluster and node, and uses slon_watchdog to keep it running.
Used by slon_start.
This is a somewhat smarter watchdog; it monitors a particular Slony-I node, and restarts the slon process if it hasn't seen updates go in in 20 minutes or more.
This is helpful if there is an unreliable network connection such that the slon sometimes stops working without becoming aware of it.
Adds a node to an existing cluster.
Generates Slonik script to subscribe a particular node to a particular replication set.
This goes through and drops the Slony-I schema from each node; use this if you want to destroy replication throughout a cluster. As its effects are necessarily rather destructive, this has the potential to be pretty unsafe.
Generates Slonik script to unsubscribe a node from a replication set.
Generates Slonik script to tell all the nodes to update the Slony-I functions. This will typically be needed when you upgrade from one version of Slony-I to another.
This is a shell script designed to rummage through a Slony-I cluster and generate a set of slon.conf files that slon accesses via the slon -f slon.conf option.
With all of the configuration residing in a configuration file for each slon, they can be invoked with minimal muss and fuss, with no risk of forgetting the -a option and thereby breaking a log shipping node.
Running it requires the following environment configuration:
Firstly, the environment needs to be set up with suitable parameters for libpq to connect to one of the databases in the cluster. Thus, you need some suitable combination of the following environment variables set:
SLONYCLUSTER - the name of the Slony-I cluster to be "rummaged".
MKDESTINATION - a directory for configuration to reside in; the script will create MKDESTINATION/$SLONYCLUSTER/conf for the slon configuration files, and MKDESTINATION/$SLONYCLUSTER/pid for slon to store PID files in.
LOGHOME - a directory for log files to reside in; a directory of the form $LOGHOME/$SLONYCLUSTER/node[number] will be created for each node.
For any "new" nodes that it discovers, this script will create a new slon conf file.
It is fair to say that there are several conditions to beware of; none of these should be greatly surprising...
The file layout set up by mkslonconf.sh was specifically set up to allow managing slons across a multiplicity of clusters using the script in the following section...
This rc.d-style script was introduced in Slony-I version 2.0; it provides automatable ways of:
Starting the slon, via start_slon.sh start
Attempts to start the slon, checking first to verify that it is not already running, that configuration exists, and that the log file location is writable. Failure cases include:
No slon runtime configuration file exists,
A slon is found with the PID indicated via the runtime configuration,
The specified SLON_LOG location is not writable.
Stopping the slon, via start_slon.sh stop
This fails (doing nothing) if the PID (indicated via the runtime configuration file) does not exist;
Monitoring the status of the slon, via start_slon.sh status
This indicates whether or not the slon is running, and, if so, prints out the process ID.
The following environment variables are used to control slon configuration:
This indicates where the slon binary program is found.
This file is the location where slon log files are to be stored, if need be. There is an option slon_conf_syslog for slon to use syslog to manage logging; in that case, you may prefer to set SLON_LOG to /dev/null.
Note that these environment variables may either be set, in the script, or overridden by values passed in from the environment. The latter usage makes it easy to use this script in conjunction with the regression tests so that it is regularly tested.
This is another shell script which uses the configuration as set up by mkslonconf.sh and is intended to support an approach to running Slony-I involving regularly (e.g. via a cron process) checking to ensure that slon processes are running.
It uses the following environment variables:
PATH which needs to contain, preferably at the beginning, a path to the slon binaries that should be run.
SLHOME indicates the "home" directory for slon configuration files; they are expected to be arranged in subdirectories, one for each cluster, with filenames of the form node1.conf, node2.conf, and such
The script uses the command find $SLHOME/$cluster/conf -name "node[0-9]*.conf" to find slon configuration files.
If you remove some of these files, or rename them so their names do not conform to the find command, they won't be found; that is an easy way to drop nodes out of this system.
LOGHOME indicates the "home" directory for log storage.
This script does not assume the use of the Apache log rotator to manage logs; in that PostgreSQL version 8 does its own log rotation, it seems undesirable to retain a dependancy on specific log rotation "technology."
CLUSTERS is a list of Slony-I clusters under management.
In effect, you could run this every five minutes, and it would launch any missing slon processes.
You may find that you wish to create a new node some time well after creating a cluster. The script slony1_extract_schema.sh will help you with this.
A command line might look like the following:
PGPORT=5881 PGHOST=master.int.example.info ./slony1_extract_schema.sh payroll payroll temppayroll
It performs the following:
It dumps the origin node's schema, including the data in the Slony-I cluster schema.
Note that the extra environment variables PGPORT and PGHOST to indicate additional information about where the database resides.
This data is loaded into the freshly created temporary database, temppayroll
The table and sequence OIDs in Slony-I tables are corrected to point to the temporary database's configuration.
A slonik script is run to perform SLONIK UNINSTALL NODE on the temporary database. This eliminates all the special Slony-I tables, schema, and removes Slony-I triggers from replicated tables.
Finally, pg_dump is run against the temporary database, delivering a copy of the cleaned up schema to standard output.
If you are running a lot of replicated databases, where there are numerous Slony-I clusters, it can get painful to track and document this. The following tools may be of some assistance in this.
slony-cluster-analysis.sh is a shell script intended to provide some over-time analysis of the configuration of a Slony-I cluster. You pass in the usual libpq environment variables (PGHOST, PGPORT, PGDATABASE, and such) to connect to a member of a Slony-I cluster, and pass the name of the cluster as an argument.
The script then does the following:
Runs a series of queries against the Slony-I tables to get lists of nodes, paths, sets, and tables.
This is stowed in a temporary file in /tmp
A comparison is done between the present configuration and the configuration the last time the tool was run. If the configuration differs, an email of the difference (generated using diff) is sent to a configurable email address.
If the configuration has changed, the old configuration file is renamed to indicate when the script noticed the change.
Ultimately, the current configuration is stowed in LOGDIR in a filename like cluster.last
There is a sample "wrapper" script, slony-cluster-analysis-mass.sh, which sets things up to point to a whole bunch of Slony-I clusters.
This should make it easier for a group of DBAs to keep track of two things:
Documenting the current state of system configuration.
Noticing when configuration changes.
The tools script configure-replication.sh is intended to automate generating slonik scripts to configure replication.
This script uses a number (possibly large, if your configuration needs to be particularly complex) of environment variables to determine the shape of the configuration of a cluster. It uses default values extensively, and in many cases, relatively few environment values need to be set in order to get a viable configuration.
There are some values that will be used universally across a cluster:
Name of Slony-I cluster
Number of nodes to set up
name of PostgreSQL superuser controlling replication
default port number
default database name
a list of fully qualified table names (e.g. - complete with namespace, such as public.my_table)
a list of fully qualified sequence names (e.g. - complete with namespace, such as public.my_sequence)
Defaults are provided for all of these values, so that if you run configure-replication.sh without setting any environment variables, you will get a set of slonik scripts. They may not correspond, of course, to any database you actually want to use...
For each node, there are also four environment variables; for node 1:
database to connect to
superuser to connect as
It is quite likely that DB*, USER*, and PORT* should be drawn from the global PGDATABASE, PGUSER, and PGPORT values above; having the discipline of that sort of uniformity is usually a good thing.
In contrast, HOST* values should be set explicitly for HOST1, HOST2, ..., as you don't get much benefit from the redundancy replication provides if all your databases are on the same server!
slonik config files are generated in a temp directory under /tmp. The usage is thus:
preamble.slonik is a "preamble" containing connection info used by the other scripts.
Verify the info in this one closely; you may want to keep this permanently to use with future maintenance you may want to do on the cluster.
This is the first script to run; it sets up the requested nodes as being Slony-I nodes, adding in some Slony-I-specific config tables and such.
You can/should start slon processes any time after this step has run.
This is the second script to run; it indicates how the slons should intercommunicate. It assumes that all slons can talk to all nodes, which may not be a valid assumption in a complexly-firewalled environment. If that assumption is untrue, you will need to modify the script to fix the paths.
This sets up the replication set consisting of the whole bunch of tables and sequences that make up your application's database schema.
When you run this script, all that happens is that triggers are added on the origin node (node #1) that start collecting updates; replication won't start until #5...
There are two assumptions in this script that could be invalidated by circumstances:
That all of the tables and sequences have been included.
This becomes invalid if new tables get added to your schema and don't get added to the TABLES list.
That all tables have been defined with primary keys.
Best practice is to always have and use true primary keys. If you have tables that require choosing a candidate primary key or that require creating a surrogate key using SLONIK TABLE ADD KEY, you will have to modify this script by hand to accomodate that.
And 3, and 4, and 5, if you set the number of nodes higher...
This is the step that "fires up" replication.
The assumption that the script generator makes is that all the subscriber nodes will want to subscribe directly to the origin node. If you plan to have "sub-clusters," perhaps where there is something of a "master" location at each data centre, you may need to revise that.
The slon processes really ought to be running by the time you attempt running this step. To do otherwise would be rather foolish.
In the tools area, slon.in-profiles is a script that might be used to start up slon instances at the time of system startup. It is designed to interact with the FreeBSD Ports system.
In the tools area, duplicate-node.sh is a script that may be used to help create a new node that duplicates one of the ones in the cluster.
The script expects the following parameters:
New node number
Node being duplicated
For each of the nodes specified, the script offers flags to
libpq-style parameters for
PGDATABASE, and PGUSER; it is expected
that .pgpass will be used for storage of
passwords, as is generally considered best practice. Those values may
inherit from the
libpq environment variables, if
not set, which is useful when using this for testing. When
"used in anger," however, it is likely that nearly all of
the 14 available parameters should be used.
The script prepares files, normally in /tmp, and will report the name of the directory that it creates that contain SQL and slonik scripts to set up the new node.
This is drawn from the origin node, and contains the "pristine" database schema that must be applied first.
This "preamble" is used by the subsequent set of slonik scripts.
A slonik script to set up the new node.
A slonik script to set up path communications between the provider node and the new node.
A slonik script to request subscriptions for all replications sets.
For testing purposes, this is sufficient to get a new node working. The configuration may not necessarily reflect what is desired as a final state:
Additional communications paths may be desirable in order to have redundancy.
It is assumed, in the generated scripts, that the new node should support forwarding; that may not be true.
It may be desirable later, after the subscription process is complete, to revise subscriptions.
The tool tools/slonikconfdump.sh was created to help dump out a slonik script to duplicate the configuration of a functioning Slony-I cluster. It should be particularly useful when upgrading Slony-I to version 2.0; see Section 4.6.2 for more details.
It dumps out:
Node connection information
Note that it uses the first value it finds (e.g. - for the lowest numbered client node).
Note that the subscriptions are ordered topologically, using tsort
It may be run as follows:
chris@dba2:Slony-I/CMD/slony1-2.0/tools> SLONYCLUSTER=slony_regress1 PGDATABASE=slonyregress1 bash slonikconfdump.sh # building slonik config files for cluster slony_regress1 # generated by: slonikconfdump.sh # Generated on: Tue Jun 9 17:34:12 EDT 2009 cluster name=slony_regress1; include <admin-conninfos.slonik>; # Draw in ADMIN CONNINFO lines node 1 admin conninfo='dbname=slonyregress1 host=localhost user=chris port=7083'; node 2 admin conninfo='dbname=slonyregress2 host=localhost user=chris port=7083'; init cluster (id=1, comment='Regress test node'); store node (id=2, comment='node 2'); store path (server=1, client=2, conninfo='dbname=slonyregress1 host=localhost user=chris port=7083', connretry=10); store path (server=2, client=1, conninfo='dbname=slonyregress2 host=localhost user=chris port=7083', connretry=10); create set (id=1, origin=1, comment='All test1 tables'); set add table (id=1, set id=1, origin=1, fully qualified name='"public"."table1"', comment='accounts table, key='table1_pkey'); set add table (id=2, set id=1, origin=1, fully qualified name='"public"."table2"', comment='public.table2, key='table2_id_key'); set add table (id=4, set id=1, origin=1, fully qualified name='"public"."table4"', comment='a table of many types, key='table4_pkey'); set add table (id=5, set id=1, origin=1, fully qualified name='"public"."table5"', comment='a table with composite PK strewn across the table, key='table5_pkey'); subscribe set (id=1, provider=1, receiver=2, forward=YES); chris@dba2:Slony-I/CMD/slony1-2.0/tools>
The output should be reviewed before it is applied elsewhere. Particular attention should be paid to the ADMIN CONNINFO, as it picks the first value that it sees for each node; in a complex environment, where visibility of nodes may vary from subnet to subnet, it may not pick the right value.
A new script for Slony-I 1.1 is generate_syncs.sh, which addresses the following kind of situation.
Supposing you have some possibly-flakey server where the slon daemon that might not run all the time, you might return from a weekend away only to discover the following situation.
On Friday night, something went "bump" and while the database came back up, none of the slon daemons survived. Your online application then saw nearly three days worth of reasonably heavy transaction load.
When you restart slon on Monday, it hasn't done a SYNC on the master since Friday, so that the next "SYNC set" comprises all of the updates between Friday and Monday. Yuck.
If you run generate_syncs.sh as a cron job every 20 minutes, it will force in a periodic SYNC on the origin, which means that between Friday and Monday, the numerous updates are split into more than 100 syncs, which can be applied incrementally, making the cleanup a lot less unpleasant.
Note that if SYNCs are running regularly, this script won't bother doing anything.