23. Log Analysis

Here are some of the things that you may find in your Slony-I logs, and explanations of what they mean.

23.1. CONFIG notices

These entries are pretty straightforward. They are informative messages about your configuration.

Here are some typical entries that you will probably run into in your logs:

CONFIG main: local node id = 1
CONFIG main: loading current cluster configuration
CONFIG storeNode: no_id=3 no_comment='Node 3'
CONFIG storePath: pa_server=5 pa_client=1 pa_conninfo="host=127.0.0.1 dbname=foo user=postgres port=6132" pa_connretry=10
CONFIG storeListen: li_origin=3 li_receiver=1 li_provider=3
CONFIG storeSet: set_id=1 set_origin=1 set_comment='Set 1'
CONFIG main: configuration complete - starting threads
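
The storeNode, storePath, and storeListen entries above are generated as the node reads in configuration created by the corresponding slonik commands. As a minimal sketch (the cluster name and admin conninfo here are illustrative, not drawn from the log above), the following sort of script produces that configuration:

cluster name = foo;
node 1 admin conninfo = 'host=127.0.0.1 dbname=foo user=postgres port=6132';

store node (id = 3, comment = 'Node 3', event node = 1);
store path (server = 5, client = 1,
            conninfo = 'host=127.0.0.1 dbname=foo user=postgres port=6132',
            connretry = 10);
store listen (origin = 3, receiver = 1, provider = 3);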

23.2. DEBUG Notices

Debug notices are always prefaced by the name of the thread that the notice originates from. You will see messages from the following threads:

localListenThread

This is the local thread that listens for events on the local node.

remoteWorkerThread-X

The thread processing remote events. You can expect to see one of these for each node that this node communicates with.

remoteListenThread-X

Listens for events on a remote node database. You may expect to see one of these for each node in the cluster.

cleanupThread

Takes care of things like vacuuming, cleaning out the confirm and event tables, and deleting old data.

syncThread

Generates SYNC events.

23.3. How to read Slony-I logs

Note that as far as slon is concerned, there is no "master" or "slave." They are just nodes.

What you can expect, initially, is to see, on both nodes, some events propagating back and forth. Firstly, there should be some events published to indicate creation of the nodes and paths. If you don't see those, then the nodes aren't properly communicating with one another, and nothing else will happen...

After that, you'll mainly see two sorts of behaviour: SYNC events being generated on the origin node, and those SYNC events being propagated to and processed on the other nodes, as described in Section 23.4.6.

23.4. Log Messages and Implications

This section lists many of the log messages found in Slony-I, along with a brief explanation of their implications. It is a fairly comprehensive list, omitting mostly DEBUG4 messages that are generally uninteresting.

23.4.1. Log Messages Associated with Log Shipping

Most of these represent errors that come up if the Section 14 (log shipping) functionality breaks. You may expect things to break if the filesystem being used for log shipping fills up, or if the permissions on that directory are set incorrectly.

  • ERROR: remoteWorkerThread_%d: log archive failed %s - %s\n

    This indicates that an error was encountered trying to write a log shipping file. Normally the slon will retry, and hopefully succeed.

  • DEBUG2: remoteWorkerThread_%d: writing archive log...

    This indicates that a log shipping archive log is being written for a particular SYNC set.

  • INFO: remoteWorkerThread_%d: Run Archive Command %s

    If slon has been configured (-x aka command_on_logarchive) to run a command after generating each log shipping archive log, this reports when that process is spawned using system().

  • ERROR: remoteWorkerThread_%d: Could not open COPY SET archive file %s - %s

    Seems pretty self-explanatory...

  • ERROR: remoteWorkerThread_%d: Could not generate COPY SET archive header %s - %s

    Probably means that we just filled up a filesystem...

23.4.2. Log Messages - DDL scripts

The handling of DDL is somewhat fragile, as described in Section 15; here are both informational and error messages that may occur in the progress of an EXECUTE SCRIPT request.
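
For context, these messages arise while processing a request submitted via slonik roughly as follows (a minimal sketch; the cluster name, admin conninfo, and script filename are illustrative):

cluster name = foo;
node 1 admin conninfo = 'dbname=foo host=localhost user=postgres';

execute script (
    set id = 1,
    filename = '/tmp/ddl_change.sql',
    event node = 1
);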

  • ERROR: remoteWorkerThread_%d: DDL preparation failed - set %d - only on node %

    Something broke when applying a DDL script on one of the nodes. This quite likely indicates that the node's schema differed from that on the origin; you may need to apply a change manually to the node to allow the event to proceed. The scary, scary alternative might be to delete the offending event, assuming it can't possibly work...

  • SLON_CONFIG: remoteWorkerThread_%d: DDL request with %d statements

    This is informational, indicating how many SQL statements were processed.

  • SLON_ERROR: remoteWorkerThread_%d: DDL had invalid number of statements - %d

    Occurs if there were < 0 statements (which should be impossible) or > MAXSTATEMENTS statements. Probably the script was bad...

  • ERROR: remoteWorkerThread_%d: malloc() failure in DDL_SCRIPT - could not allocate %d bytes of memory

    This should only occur if you submit an extraordinarily large DDL script that makes a slon run out of memory.

  • CONFIG: remoteWorkerThread_%d: DDL Statement %d: [%s]

    This lists each DDL statement as it is submitted.

  • ERROR: DDL Statement failed - %s

    Oh, dear, one of the DDL statements that worked on the origin failed on this remote node...

  • CONFIG: DDL Statement success - %s

    All's well...

  • ERROR: remoteWorkerThread_%d: Could not generate DDL archive tracker %s - %s

    Apparently the DDL script couldn't be written to a log shipping file...

  • ERROR: remoteWorkerThread_%d: Could not submit DDL script %s - %s

    Couldn't write the script to a log shipping file.

  • ERROR: remoteWorkerThread_%d: Could not close DDL script %s - %s

    Couldn't close a log shipping file for a DDL script.

  • ERROR: Slony-I ddlScript_prepare(): set % not found

    Set wasn't found on this node; you probably gave the wrong ID number...

  • ERROR: Slony-I ddlScript_prepare_int(): set % not found

    Set wasn't found on this node; you probably gave the wrong ID number...

  • ERROR: Slony-I: alterTableForReplication(): Table with id % not found

    Apparently the table wasn't found; could the schema be messed up?

  • ERROR: Slony-I: alterTableForReplication(): Table % is already in altered state

    Curious... We're trying to set a table up for replication a second time?

  • ERROR: Slony-I: alterTableRestore(): Table with id % not found

    This runs when a table is being restored to "non-replicated" state; apparently the replicated table wasn't found.

  • ERROR: Slony-I: alterTableRestore(): Table % is not in altered state

    Hmm. The table isn't in altered replicated state. That shouldn't be the case if replication has been working properly...

  • NOTICE: Slony-I: alterTableForReplication(): multiple instances of trigger % on table %

    This normally happens if a table had a trigger attached to it that replication hid (because this is a subscriber node), and a trigger of the same name was then added back. When Slony-I tries to "fix up" the triggers, the two conflict.

    The DDL script will keep running and rerunning, or the UNINSTALL NODE will keep failing, until you drop the "visible" trigger by hand, much as you must have added it by hand earlier.

  • ERROR: Slony-I: Unable to disable triggers

    This is the error that follows the "multiple triggers" problem.

23.4.3. Threading Issues

There should not be any "user-serviceable" aspects to the Slony-I threading model; each slon creates a quite well-specified set of helper threads to manage the various database connections that it requires. The only way that anything should break on the threading side is if you have not compiled PostgreSQL libraries to "play well" with threading, in which case you will be unable to compile Slony-I in the first place.

  • FATAL: remoteWorkerThread_%d: pthread_create() - %s

    Couldn't create a new remote worker thread.

  • DEBUG1 remoteWorkerThread_%d: helper thread for provider %d created

    This normally happens when the slon starts: a thread is created for each node to which the local node should be listening for events.

  • DEBUG1: remoteWorkerThread_%d: helper thread for provider %d terminated

    If subscriptions are reshaped such that a node no longer provides a subscription, then the thread that works on that node can be dropped.

  • DEBUG1: remoteWorkerThread_%d: disconnecting from data provider %d

    A no-longer-used data provider may be dropped; if connection information is changed, the slon needs to disconnect and reconnect.

  • DEBUG2: remoteWorkerThread_%d: ignore new events due to shutdown

    If the slon is shutting down, it is futile to process more events.

  • DEBUG1: remoteWorkerThread_%d: node %d - no worker thread

    Curious: we can't wake up the worker thread; there probably should already be one...

23.4.4. Log Entries At Subscription Time

Subscription time is quite a special time in Slony-I. If you have a large amount of data to be copied to subscribers, this may take a considerable period of time. Slony-I logs a fairly considerable amount of information about its progress, which is sure to be useful to the gentle reader. In particular, it generates log output every time it starts and finishes copying data for a given table, as well as when it completes reindexing the table. That may not make a 28-hour subscription go any faster, but it at least helps you have some idea of how it is progressing.
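
For reference, the subscription whose progress is logged here is typically requested with a slonik script along the following lines (a sketch, assuming a cluster named foo with origin node 1 and subscriber node 2; adjust IDs and conninfo to suit):

cluster name = foo;
node 1 admin conninfo = 'dbname=foo host=originhost user=postgres';
node 2 admin conninfo = 'dbname=foo host=subscriberhost user=postgres';

subscribe set (id = 1, provider = 1, receiver = 2, forward = yes);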

  • DEBUG1: copy_set %d

    This indicates the beginning of copying data for a new subscription.

  • ERROR: remoteWorkerThread_%d: set %d not found in runtime configuration

    slon tried starting up a subscription; it couldn't find conninfo for the data source. Perhaps paths are not properly propagated?

  • ERROR: remoteWorkerThread_%d: node %d has no pa_conninfo

    Apparently the conninfo configuration was wrong...

  • ERROR: copy set %d cannot connect to provider DB node %d

    slon couldn't connect to the provider. Is the conninfo wrong? Or perhaps authentication is misconfigured? Or perhaps the database is down?

  • DEBUG1: remoteWorkerThread_%d: connected to provider DB

    Excellent: the copy set has a connection to its provider.

  • ERROR: Slony-I: sequenceSetValue(): sequence % not found

    Curious; the sequence object is missing. Could someone have dropped it from the schema by hand (e.g. - not using EXECUTE SCRIPT)?

  • ERROR: Slony-I: subscribeSet() must be called on provider

    This function should only get called on the provider node. slonik normally handles this right, unless one had wrong DSNs in a slonik script...

  • ERROR: Slony-I: subscribeSet(): set % not found

    Hmm. The provider node isn't aware of this set. Wrong parms to a slonik script?

  • ERROR: Slony-I: subscribeSet(): set origin and receiver cannot be identical

    Duh, an origin node can't subscribe to itself.

  • ERROR: Slony-I: subscribeSet(): set provider and receiver cannot be identical

    A receiver must subscribe to a different node...

  • Slony-I: subscribeSet(): provider % is not an active forwarding node for replication set %

    You can only use a live, active, forwarding provider as a data source.

  • Slony-I: subscribeSet_int(): set % is not active, cannot change provider

    You can't change the provider just yet...

  • Slony-I: subscribeSet_int(): set % not found

    This node isn't aware of the set... Perhaps you submitted wrong parms?

  • Slony-I: unsubscribeSet() must be called on receiver

    Seems obvious... This probably indicates a bad slonik admin DSN...

  • Slony-I: Cannot unsubscribe set % while being provider

    This should seem obvious; UNSUBSCRIBE SET will fail if a node has dependent subscribers for which it is the provider.

  • Slony-I: cleanupEvent(): Single node - deleting events < %

    If there's only one node, the cleanup event will delete old events so that you don't get "build-up of crud."

  • Slony-I: tableAddKey(): table % not found

    Perhaps you didn't copy the schema over properly?

  • Slony-I: tableDropKey(): table with ID % not found

    Seems curious; you were presumably replicating to this table, so for this to be gone seems rather odd...

  • Slony-I: determineIdxnameUnique(): table % not found

    Did you properly copy over the schema to a new node???

  • Slony-I: table % has no primary key

    This likely signifies a bad loading of schema...

  • Slony-I: table % has no unique index %

    This likely signifies a bad loading of schema...

  • WARN: remoteWorkerThread_%d: transactions earlier than XID %s are still in progress

    This indicates that some old transaction is in progress from before the earliest available SYNC on the provider. Slony-I cannot start replicating until that transaction completes. This message will repeat until the transaction completes...

  • DEBUG2: remoteWorkerThread_%d: prepare to copy table %s

    This indicates that slon is beginning preparations to set up subscription for a table.

  • DEBUG1: remoteWorkerThread_%d: table %s will require Slony-I serial key

    Evidently this is a table defined with TABLE ADD KEY where Slony-I has to add a surrogate primary key.

  • ERROR: remoteWorkerThread_%d: Could not lock table %s on subscriber

    For whatever reason, the table could not be locked, so the subscription needs to be restarted. If the problem was something like a deadlock, retrying may help. If the problem was otherwise, you may need to intervene...

  • DEBUG2: remoteWorkerThread_%d: all tables for set %d found on subscriber

    An informational message indicating that the first pass through the tables found no problems...

  • DEBUG2: remoteWorkerThread_%d: copy sequence %s

    Processing some sequence...

  • DEBUG2: remoteWorkerThread_%d: copy table %s

    slon is starting to copy a table...

  • DEBUG3: remoteWorkerThread_%d: table %s Slony-I serial key added local

    Just added a new column to the table to provide a surrogate primary key.

  • DEBUG3: remoteWorkerThread_%d: local table %s already has Slony-I serial key

    Did not need to add serial key; apparently it was already there.

  • DEBUG3: remoteWorkerThread_%d: table %s does not require Slony-I serial key

    Apparently this table didn't require a special serial key...

  • DEBUG2: remoteWorkerThread_%d: Begin COPY of table %s

    slon is about to start the COPY on both sides to copy a table...

  • ERROR: remoteWorkerThread_%d: Could not generate copy_set request for %s - %s

    This indicates that the delete/copy requests failed on the subscriber. The slon will repeat the COPY_SET attempt; it will probably continue to fail...

  • ERROR: remoteWorkerThread_%d: copy to stdout on provider - %s %s

    Evidently something about the COPY to stdout on the provider node broke... The event will be retried...

  • ERROR: remoteWorkerThread_%d: copy from stdin on local node - %s %s

    Evidently something about the COPY into the table on the subscriber node broke... The event will be retried...

  • DEBUG2: remoteWorkerThread_%d: %d bytes copied for table %s

    This message indicates that the COPY of the table has completed. This is followed by running ANALYZE and reindexing the table on the subscriber.

  • DEBUG2: remoteWorkerThread_%d: %.3f seconds to copy table %s

    After this message, copying and reindexing and analyzing the table on the subscriber is complete.

  • DEBUG2: remoteWorkerThread_%d: set last_value of sequence %s (%s) to %s

    As should be no surprise, this indicates that a sequence has been processed on the subscriber.

  • DEBUG2: remoteWorkerThread_%d: %.3f seconds to copy sequences

    Summarizing the time spent processing sequences in the COPY_SET event.

  • ERROR: remoteWorkerThread_%d: query %s did not return a result

    This indicates that the query, as part of final processing of COPY_SET, failed. The copy will restart...

  • DEBUG2: remoteWorkerThread_%d: copy_set no previous SYNC found, use enable event

    This takes place if no past SYNC event was found; the current event gets set to the event point of the ENABLE_SUBSCRIPTION event.

  • DEBUG2: remoteWorkerThread_%d: copy_set SYNC found, use event seqno %s

    This takes place if a SYNC event was found; the current event gets set as shown.

  • ERROR: remoteWorkerThread_%d: sl_setsync entry for set %d not found on provider

    SYNC synchronization information was expected to be drawn from an existing subscriber, but wasn't found. Something replication-breakingly-bad has probably happened...

  • DEBUG1: remoteWorkerThread_%d: could not insert to sl_setsync_offline

    Oh, dear. After setting up a subscriber, and getting pretty well everything ready, some writes to a log shipping file failed. Perhaps the disk filled up...

  • DEBUG1: remoteWorkerThread_%d: %.3f seconds to build initial setsync status

    Indicates the total time required to get the copy_set event finalized...

  • DEBUG1: remoteWorkerThread_%d: disconnected from provider DB

    At the end of a subscribe set event, the subscriber's slon will disconnect from the provider, clearing out connections...

  • DEBUG1: remoteWorkerThread_%d: copy_set %d done in %.3f seconds

    Indicates the total time required to complete copy_set... This indicates a successful subscription!

23.4.5. Log Entries Associated with MERGE SET

These various exceptions cause MERGE SET to be rejected; something ought to be corrected before submitting the request again.
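
A typical request looks like the following sketch, which merges set 2 into set 1, both originating on node 1 (the IDs are illustrative):

merge set (id = 1, add id = 2, origin = 1);

If the request is rejected because a subscription is still in progress (the last message below), waiting a short while and resubmitting is usually sufficient.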

  • ERROR: Slony-I: merged set ids cannot be identical

    It is illogical to try to merge a set with itself.

  • ERROR: Slony-I: set % not found

    A missing set cannot be merged.

  • ERROR: Slony-I: set % does not originate on local node

    The MERGE SET request must be submitted to the origin node for the sets that are to be merged.

  • ERROR: Slony-I: subscriber lists of set % and % are different

    Sets can only be merged if they have identical subscriber lists.

  • ERROR: Slony-I: set % has subscriptions in progress - cannot merge

    MERGE SET cannot proceed until all subscriptions have completed processing. If this message arises, it indicates that the subscriber lists are the same, but that one or more of the nodes has not yet finished setting up its subscription. It may be that waiting a short while will permit resubmitting the MERGE SET request.

23.4.6. Log Entries Associated With Normal SYNC activity

Some of these messages indicate exceptions, but the "normal" stuff represents what you should expect to see most of the time when replication is just plain working.

  • DEBUG2: remoteWorkerThread_%d: forward confirm %d,%s received by %d

    These events should occur frequently and routinely as nodes report confirmations of the events they receive.

  • DEBUG1: remoteWorkerThread_%d: SYNC %d processing

    This indicates the start of processing of a SYNC.

  • ERROR: remoteWorkerThread_%d: No pa_conninfo for data provider %d

    Oh dear, we have no connection information to connect to the data provider. That shouldn't be possible, normally...

  • ERROR: remoteWorkerThread_%d: cannot connect to data provider %d on 'dsn'

    Oh dear, we haven't got correct connection information to connect to the data provider.

  • DEBUG1: remoteWorkerThread_%d: connected to data provider %d on 'dsn'

    Excellent; the slon has connected to the provider.

  • WARN: remoteWorkerThread_%d: don't know what ev_seqno node %d confirmed for ev_origin %d

    There's no confirmation information available for this node's provider; need to abort the SYNC and wait a bit in hopes that that information will emerge soon...

  • DEBUG1: remoteWorkerThread_%d: data provider %d only confirmed up to ev_seqno %d for ev_origin %d

    The provider for this node is a subscriber, and apparently that subscriber is a bit behind. The slon will need to wait until the provider catches up and has new data.

  • DEBUG2: remoteWorkerThread_%d: data provider %d confirmed up to ev_seqno %s for ev_origin %d - OK

    All's well; the provider should have the data that the subscriber needs...

  • DEBUG2: remoteWorkerThread_%d: syncing set %d with %d table(s) from provider %d

    This is declaring the plans for a SYNC: we have a set with some tables to process.

  • DEBUG2: remoteWorkerThread_%d: ssy_action_list value: %s length: %d

    This portion of the query to collect log data to be applied has been known to "bloat up"; this shows how it has gotten compressed...

  • DEBUG2: remoteWorkerThread_%d: Didn't add OR to provider

    This indicates that there wasn't anything in a "provider" clause in the query to collect log data to be applied, which shouldn't happen. Things are quite likely to go bad at this point...

  • DEBUG2: remoteWorkerThread_%d: no sets need syncing for this event

    This will be the case for all SYNC events generated on nodes that are not originating replication sets. You can expect to see these messages reasonably frequently.

  • DEBUG3: remoteWorkerThread_%d: activate helper %d

    We're about to kick off a thread to help process SYNC data...

  • DEBUG4: remoteWorkerThread_%d: waiting for log data

    The thread is waiting to get data to consume (e.g. - apply to the replica).

  • ERROR: remoteWorkerThread_%d: %s %s - qualification was %s

    Apparently an application of replication data to the subscriber failed... This quite likely indicates some sort of serious corruption.

  • ERROR: remoteWorkerThread_%d: replication query did not affect one row (cmdTuples = %s) - query was: %s qualification was: %s

    If SLON_CHECK_CMDTUPLES is set, slon applies changes one tuple at a time, and verifies that each change affects exactly one tuple. Apparently that wasn't the case here, which suggests a corruption of replication. That's a rather bad thing...

  • ERROR: remoteWorkerThread_%d: SYNC aborted

    If any errors have been encountered that haven't already aborted the SYNC, this catches and aborts it.

  • DEBUG2: remoteWorkerThread_%d: new sl_rowid_seq value: %s

    This marks the progression of this internal Slony-I sequence.

  • DEBUG2: remoteWorkerThread_%d: SYNC %d done in %.3f seconds

    This indicates the successful completion of a SYNC. Hurray!

  • DEBUG1: remoteWorkerThread_%d_d:%.3f seconds delay for first row

    This indicates how long it took to get the first row from the LOG cursor that pulls in data from the sl_log tables.

  • ERROR: remoteWorkerThread_%d_d: large log_cmddata for actionseq %s not found

    slon could not find the data for one of the "very large" sl_log table tuples that are pulled individually. This shouldn't happen.

  • DEBUG2: remoteWorkerThread_%d_d:%.3f seconds until close cursor

    This indicates how long it took to complete reading data from the LOG cursor that pulls in data from the sl_log tables.

  • DEBUG2: remoteWorkerThread_%d_d: inserts=%d updates=%d deletes=%d

    This reports how much activity was recorded in the current SYNC set.

  • DEBUG3: remoteWorkerThread_%d: compress_actionseq(list,subquery) Action list: %s

    This indicates a portion of the LOG cursor query that is about to be compressed. (In some cases, this could grow to enormous size, blowing up the query parser...)

  • DEBUG3: remoteWorkerThread_%d: compressed actionseq subquery %s

    This indicates what that subquery compressed into.

23.4.7. Log Entries - Adding Objects to Sets

These entries will be seen on an origin node's logs at the time you are configuring a replication set; some of them will be seen on subscribers at subscription time.
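
These messages correspond to SET ADD TABLE and SET ADD SEQUENCE requests along the lines of this sketch (the set, IDs, and object names are illustrative):

set add table (set id = 1, origin = 1, id = 1,
               fully qualified name = 'public.accounts',
               comment = 'accounts table');
set add sequence (set id = 1, origin = 1, id = 2,
                  fully qualified name = 'public.accounts_id_seq');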

  • ERROR: Slony-I: setAddTable_int(): table % has no index %

    Apparently a PK index was specified that is absent on this node...

  • ERROR: Slony-I setAddTable_int(): table % not found

    Table wasn't found on this node; did you load the schema in properly?

  • ERROR: Slony-I setAddTable_int(): table % is not a regular table

    You tried to replicate something that isn't a table; you can't do that!

  • NOTICE: Slony-I setAddTable_int(): table % PK column % nullable

    You tried to replicate a table where one of the columns in the would-be primary key is allowed to be null. All PK columns must be NOT NULL. This request is about to fail.

  • ERROR: Slony-I setAddTable_int(): table % not replicable!

    This happens because of the NULLable PK column.

  • ERROR: Slony-I setAddTable_int(): table id % has already been assigned!

    The table ID value needs to be assigned uniquely in SET ADD TABLE; apparently you requested a value already in use.

  • ERROR: Slony-I setAddSequence(): set % not found

    Apparently the set you requested is not available...

  • ERROR: Slony-I setAddSequence(): set % has remote origin

    You may only add things at the origin node.

  • ERROR: Slony-I setAddSequence(): cannot add sequence to currently subscribed set %

    Apparently the set you requested has already been subscribed. You cannot add tables/sequences to an already-subscribed set. You will need to create a new set, add the objects to that new set, and set up subscriptions to that.

  • ERROR: Slony-I setAddSequence_int(): set % not found

    Apparently the set you requested is not available...

  • ERROR: Slony-I setAddSequence_int(): sequence % not found

    Apparently the sequence you requested is not available on this node. How did you set up the schemas on the subscribers???

  • ERROR: Slony-I setAddSequence_int(): % is not a sequence

    Seems pretty obvious :-).

  • ERROR: Slony-I setAddSequence_int(): sequence ID % has already been assigned

    Each sequence ID added must be unique; apparently you have reused an ID.

23.4.8. Logging When Moving Objects Between Sets
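
These messages correspond to SET MOVE TABLE and SET MOVE SEQUENCE requests, for instance (a sketch with illustrative IDs; both sets must originate on the same node and have identical subscriber lists):

set move table (origin = 1, id = 4, new set = 2);
set move sequence (origin = 1, id = 7, new set = 2);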

  • ERROR: Slony-I setMoveTable_int(): table % not found

    Table wasn't found on this node; you probably gave the wrong ID number...

  • ERROR: Slony-I setMoveTable_int(): set ids cannot be identical

    Does it make sense to move a table from a set into the very same set?

  • ERROR: Slony-I setMoveTable_int(): set % not found

    Set wasn't found on this node; you probably gave the wrong ID number...

  • ERROR: Slony-I setMoveTable_int(): set % does not originate on local node

    Set wasn't found to have origin on this node; you probably gave the wrong EVENT NODE...

  • ERROR: Slony-I setMoveTable_int(): subscriber lists of set % and % are different

    You can only move objects between sets that have identical subscriber lists.

  • ERROR: Slony-I setMoveSequence_int(): sequence % not found

    Sequence wasn't found on this node; you probably gave the wrong ID number...

  • ERROR: Slony-I setMoveSequence_int(): set ids cannot be identical

    Does it make sense to move a sequence from a set into the very same set?

  • ERROR: Slony-I setMoveSequence_int(): set % not found

    Set wasn't found on this node; you probably gave the wrong ID number...

  • ERROR: Slony-I setMoveSequence_int(): set % does not originate on local node

    Set wasn't found to have origin on this node; you probably gave the wrong EVENT NODE...

  • ERROR: Slony-I setMoveSequence_int(): subscriber lists of set % and % are different

    You can only move objects between sets that have identical subscriber lists.

23.4.9. Issues with Dropping Objects
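
These messages correspond to SET DROP TABLE and SET DROP SEQUENCE requests, for instance (a sketch with illustrative IDs):

set drop table (origin = 1, id = 4);
set drop sequence (origin = 1, id = 7);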

  • ERROR: Slony-I setDropTable(): table % not found

    Table wasn't found on this node; are you sure you had the ID right?

  • ERROR: Slony-I setDropTable(): set % not found

    The replication set wasn't found on this node; are you sure you had the ID right?

  • ERROR: Slony-I setDropTable(): set % has remote origin

    The replication set doesn't originate on this node; you probably need to specify an EVENT NODE in the SET DROP TABLE command.

  • ERROR: Slony-I setDropSequence_int(): sequence % not found

    Could this sequence be in another set?

  • ERROR: Slony-I setDropSequence_int(): set % not found

    Could you have gotten the set ID wrong?

  • ERROR: Slony-I setDropSequence_int(): set % has origin at another node - submit this to that node

    This message seems fairly self-explanatory...

23.4.10. Issues with MOVE SET, FAILOVER, DROP NODE

Many of these errors will occur if you submit a slonik script that describes a reconfiguration incompatible with your cluster's current configuration. Those will lead to the feeling: "Whew, I'm glad slonik caught that for me!"

Some of the others lead to a slon telling itself to fall over; all should be well when you restart it, as it will read in the revised, newly-correct configuration when it starts up.

Alas, a few indicate that "something bad happened," for which the resolution may not necessarily be easy. Nobody said that replication was easy, alas...
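
For orientation, here is a sketch of the requests involved (illustrative IDs, assuming set 1 originates on node 1 with node 2 as the backup; these are separate operations, not one script):

# Controlled switchover: lock the set on the old origin, then move it
lock set (id = 1, origin = 1);
move set (id = 1, old origin = 1, new origin = 2);

# Failover, when the old origin has failed and cannot be locked
failover (id = 1, backup node = 2);

# Once a failed node is no longer needed, drop it
drop node (id = 1, event node = 2);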

  • ERROR: Slony-I: DROP_NODE cannot initiate on the dropped node

    You need to have an EVENT NODE other than the node that is to be dropped...

  • ERROR: Slony-I: Node % is still configured as a data provider

    You cannot drop a node that is in use as a data provider; you need to reshape subscriptions so no nodes are dependent on it first.

  • ERROR: Slony-I: Node % is still origin of one or more sets

    You can't drop a node if it is the origin for a set! Use MOVE SET or FAILOVER first.

  • ERROR: Slony-I: cannot failover - node % has no path to the backup node

    You cannot failover to a node that isn't connected to all the subscribers, at least indirectly.

  • ERROR: Slony-I: cannot failover - node % is not subscribed to set %

    You can't failover to a node that doesn't subscribe to all the relevant sets.

  • ERROR: Slony-I: cannot failover - subscription for set % is not active

    If the subscription has been set up, but isn't yet active, that's still no good.

  • ERROR: Slony-I: cannot failover - node % is not a forwarder of set %

    You can only failover or move a set to a node that has forwarding turned on.

  • NOTICE: failedNode: set % has no other direct receivers - move now

    If the backup node is the only direct subscriber, then life is a bit simplified... No need to reshape any subscriptions!

  • NOTICE: failedNode set % has other direct receivers - change providers only

    In this case, all direct subscribers are pointed to the backup node, and the backup node is pointed to receive from another node so it can get caught up.

  • NOTICE: Slony-I: Please drop schema _@CLUSTERNAME@

    A node has been uninstalled; you may need to drop the schema...

23.4.11. Log Switching

These messages relate to the new-in-1.2 facility whereby Slony-I periodically switches back and forth between storing data in sl_log_1 and sl_log_2.

  • Slony-I: Logswitch to sl_log_2 initiated

    Indicates that slon is in the process of switching over to this log table.

  • Slony-I: Logswitch to sl_log_1 initiated

    Indicates that slon is in the process of switching over to this log table.

  • Previous logswitch still in progress

    An attempt was made to do a log switch while one was in progress...

  • ERROR: remoteWorkerThread_%d: cannot determine current log status

    The attempt to read from sl_log_status, which determines whether we're working on sl_log_1 or sl_log_2, got no results; that can't be a good thing, as there certainly should be data here... Replication is likely about to halt...

  • DEBUG2: remoteWorkerThread_%d: current local log_status is %d

    This indicates which of sl_log_1 and sl_log_2 is being used to store replication data.

23.4.12. Miscellanea

Perhaps these messages should be categorized further; that remains a task for the documentors.

  • ERROR: Slonik version: @MODULEVERSION@ != Slony-I version in PG build %

    This is raised in checkmoduleversion() if there is a mismatch between the version of Slony-I as reported by slonik versus what the PostgreSQL build has.

  • ERROR: Slony-I: registry key % is not an int4 value

    Raised in registry_get_int4(), this complains if a requested value turns out to be NULL.

  • ERROR: registry key % is not a text value

    Raised in registry_get_text(), this complains if a requested value turns out to be NULL.

  • ERROR: registry key % is not a timestamp value

    Raised in registry_get_timestamp(), this complains if a requested value turns out to be NULL.

  • NOTICE: Slony-I: cleanup stale sl_nodelock entry for pid=%

    This will occur when a slon starts up after another has crashed; this is routine cleanup.

  • ERROR: Slony-I: This node is already initialized

    This would typically happen if you submit STORE NODE against a node that has already been set up with the Slony-I schema.

  • ERROR: Slony-I: node % not found

    An attempt to mark a node not listed locally as enabled should fail...

  • ERROR: Slony-I: node % is already active

    An attempt to mark as active a node that is already marked active should fail...

  • DEBUG4: remoteWorkerThread_%d: added active set %d to provider %d

    Indicates that this set is being provided by this provider.

  • DEBUG2: remoteWorkerThread_%d: event %d ignored - unknown origin

    Probably happens if events arrive before the STORE_NODE event that tells that the new node now exists...

  • WARN: remoteWorkerThread_%d: event %d ignored - origin inactive

    This shouldn't occur now (2006) as we don't support the notion of deactivating a node...

  • DEBUG2: remoteWorkerThread_%d: event %d ignored - duplicate

    This might be expected to happen if the event notification comes in concurrently from two sources...

  • DEBUG2: remoteWorkerThread_%d: unknown node %d

    Happens if the slon is unaware of this node; probably a sign of STORE_NODE requests not propagating...