Slony-I Upgrade

4.6. Slony-I Upgrade

Minor Slony-I versions can be upgraded using the slonik SLONIK UPDATE FUNCTIONS> command. This includes upgrades from 2.0.x to a newer version 2.0.y version.

When upgrading Slony-I, the installation on all nodes in a cluster must be upgraded at once, using the slonik command SLONIK UPDATE FUNCTIONS.

While this requires temporarily stopping replication, it does not forcibly require an outage for applications that submit updates.

The proper upgrade procedure is thus:

  • Stop the slon processes on all nodes. (e.g. - old version of slon)

  • Install the new version of slon software on all nodes.

  • Execute a slonik script containing the command update functions (id = [whatever]); for each node in the cluster.

    Note: Remember that your slonik upgrade script like all other slonik scripts must contain the proper preamble commands to function.

  • Start all slons.

The overall operation is relatively safe: If there is any mismatch between component versions, the slon will refuse to start up, which provides protection against corruption.

You need to be sure that the C library containing SPI trigger functions has been copied into place in the PostgreSQL build. There are multiple possible approaches to this:

The trickiest part of this is ensuring that the C library containing SPI functions is copied into place in the PostgreSQL build; the easiest and safest way to handle this is to have two separate PostgreSQL builds, one for each Slony-I version, where the postmaster is shut down and then restarted against the "new" build; that approach requires a brief database outage on each node.

While that approach has been found to be easier and safer, nothing prevents one from carefully copying Slony-I components for the new version into place to overwrite the old version as the "install" step. That might not work on Windows™ if it locks library files that are in use. It is also important to make sure that any connections to the database are restarted after the new binary is installed.

Run make install to install new Slony-I components on top of the old

If you build Slony-I on the same system on which it is to be deployed, and build from sources, overwriting the old with the new is as easy as make install. There is no need to restart a database backend; just to stop slon processes, run the UPDATE FUNCTIONS script, and start new slon processes.

Unfortunately, this approach requires having a build environment on the same host as the deployment. That may not be consistent with efforts to use common PostgreSQL and Slony-I binaries across a set of nodes.

Create a new PostgreSQL and Slony-I build

With this approach, the old PostgreSQL build with old Slony-I components persists after switching to a new PostgreSQL build with new Slony-I components. In order to switch to the new Slony-I build, you need to restart the PostgreSQL postmaster, therefore interrupting applications, in order to get it to be aware of the location of the new components.

4.6.1. Incompatibilities between 1.2 and 2.0

4.6.1.1. TABLE ADD KEY issue in Slony-I 2.0

The TABLE ADD KEY slonik command has been removed in version 2.0. This means that all tables must have a set of columns that form a unique key for the table. If you are upgrading from a previous Slony-I version and are using a Slony-I created primary key then you will need to modify your table to have its own primary key before installing Slony-I version 2.0

4.6.1.2. New Trigger Handling in Slony-I Version 2

One of the major changes to Slony-I is that enabling/disabling of triggers and rules now takes place as plain SQL, supported by PostgreSQL 8.3+, rather than via "hacking" on the system catalog.

As a result, Slony-I users should be aware of the PostgreSQL syntax for ALTER TABLE, as that is how they can accomplish what was formerly accomplished via SLONIK STORE TRIGGER and SLONIK DROP TRIGGER.

4.6.1.3. SUBSCRIBE SET goes to the origin

New in 2.0.5 (but not older versions of 2.0.x) is that SLONIK SUBSCRIBE SET commands are submitted by slonik to the set origin not the provider. This means that you only need to issue SLONIK WAIT FOR EVENT on the set origin to wait for the subscription process to complete.

4.6.1.4. WAIT FOR EVENT requires WAIT ON

With version 2.0 the WAIT FOR EVENT slonik command requires that the WAIT ON parameter be specified. Any slonik scripts that were assuming a default value will need to be modified

4.6.2. Upgrading to Slony-I version 2

The version 2 branch is substantially different from earlier releases, dropping support for versions of PostgreSQL prior to 8.3, as in version 8.3, support for a "session replication role" was added, thereby eliminating the need for system catalog hacks as well as the not-entirely-well-supported xxid data type.

As a result of the replacement of the xxid type with a (native-to-8.3) PostgreSQL transaction XID type, the slonik command SLONIK UPDATE FUNCTIONS is quite inadequate to the process of upgrading earlier versions of Slony-I to version 2.

In version 2.0.2, we have added a new option to SLONIK SUBSCRIBE SET, OMIT COPY, which allows taking an alternative approach to upgrade which amounts to:

  • Uninstall old version of Slony-I

    When Slony-I uninstalls itself, catalog corruptions are fixed back up.

  • Install Slony-I version 2

  • Resubscribe, with OMIT COPY

Warning

There is a large "foot gun" here: during part of the process, Slony-I is not installed in any form, and if an application updates one or another of the databases, the resubscription, omitting copying data, will be left with data out of sync.

The administrator must take care; Slony-I has no way to help ensure the integrity of the data during this process.

The following process is suggested to help make the upgrade process as safe as possible, given the above risks.

  • Use Section 5.1.10 to generate a slonik script to recreate the replication cluster.

    Be sure to verify the SLONIK ADMIN CONNINFO statements, as the values are pulled are drawn from the PATH configuration, which may not necessarily be suitable for running slonik.

    This step may be done before the application outage.

  • Determine what triggers have SLONIK STORE TRIGGER configuration on subscriber nodes.

    Trigger handling has fundamentally changed between Slony-I 1.2 and 2.0.

    Generally speaking, what needs to happen is to query sl_table on each node, and, for any triggers found in sl_table, it is likely to be appropriate to set up a script indicating either ENABLE REPLICA TRIGGER or ENABLE ALWAYS TRIGGER for these triggers.

    This step may be done before the application outage.

  • Begin an application outage during which updates should no longer be applied to the database.

  • To ensure that applications cease to make changes, it would be appropriate to lock them out via modifications to pg_hba.conf

  • Ensure replication is entirely caught up, via examination of the sl_status view, and any application data that may seem appropriate.

  • Shut down slon processes.

  • Uninstall the old version of Slony-I from the database.

    This involves running a slonik script that runs SLONIK UNINSTALL NODE against each node in the cluster.

  • Ensure new Slony-I binaries are in place.

    A convenient way to handle this is to have old and new in different directories alongside two PostgreSQL builds, stop the postmaster, repoint to the new directory, and restart the postmaster.

  • Run the script that reconfigures replication as generated earlier.

    This script should probably be split into two portions to be run separately:

    • Firstly, set up nodes, paths, sets, and such

    • At this point, start up slon processes

    • Then, run the portion which runs SLONIK SUBSCRIBE SET

    Splitting the Section 5.1.10 script as described above is left as an exercise for the reader.

  • If there were triggers that needed to be activated on subscriber nodes, this is the time to activate them.

  • At this point, the cluster should be back up and running, ready to be reconfigured so that applications may access it again.