Tue Jun 2 13:17:19 PDT 2009
Jeff Frost wrote:
> Andrew Sullivan wrote:
>> On Tue, Jun 02, 2009 at 10:30:45AM -0500, Sean Staats wrote:
>>
>>> I created a new replication cluster. It turns out that starting the
>>> table IDs at id=1 and the sequence IDs at id=1001 didn't make any
>>> difference, as slony gave me the same error (sequence ID 1001 has
>>> already been assigned). Increasing the log verbosity to 4 doesn't
>>> produce any more useful debugging information. Time for another
>>> approach.
>>>
>>> Would it make sense to create 2 different sets - one to replicate
>>> the tables and one to replicate the sequences? Is there a downside
>>> to this kind of workaround?
>>
>> It'd be better to figure out what the duplication is caused by. Have
>> a look in the _slony tables and check to see what's in there. Where's
>> the collision?
>>
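To answer Andrew's question about where the collision is: the IDs that
are already registered can be listed straight out of the cluster
schema on the node that throws the error. A minimal sketch, using the
"_engage_cluster" schema name from my logs below (substitute your own
cluster name; column names are from the 1.2 schema):

    -- sequences Slony already knows about on this node
    SELECT seq_id, seq_set, seq_comment
      FROM "_engage_cluster".sl_sequence
     ORDER BY seq_id;

    -- and the same for tables
    SELECT tab_id, tab_set, tab_comment
      FROM "_engage_cluster".sl_table
     ORDER BY tab_id;

Any ID that shows up in there is one that a new set add sequence /
set add table can't reuse.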
> I've seen this issue recently when the initial sync fails. If you
> scroll further back in your logs, do you have a failure for the
> initial copy_set? When this happens to me, slony seems to leave the
> slave DB in a half-replicated state, then reattempts the initial
> sync, finds that the sequences are already in the
> _cluster.sl_sequence table, and errors out. Recovering requires
> dropping the node and starting over. This is with version 1.2.16.
> I recall previous versions being able to recover from a failed
> initial sync without intervention, but my memory could be mistaken.
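For anyone following along, the "dropping the node and starting over"
step is roughly this in slonik. A sketch only - the cluster name
matches my logs, but the node IDs and conninfo strings are
placeholders for a simple two-node setup:

    # Throw away the half-replicated subscriber (node 2) so the
    # set can be subscribed again from scratch.
    cluster name = engage_cluster;
    node 1 admin conninfo = 'dbname=engage host=provider user=slony';
    node 2 admin conninfo = 'dbname=engage host=subscriber user=slony';
    drop node (id = 2, event node = 1);

After that you re-run the store node / store path / subscribe set
steps for the rebuilt node.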
As for the failure itself, here's how it looks in my logs:
Jun 2 13:09:36 localhost slon[1867]: [274-1] 2009-06-02 13:09:36 PDT
ERROR remoteWorkerThread_1: "select
"_engage_cluster".tableHasSerialKey('"archive"."invitation"');"
Jun 2 13:09:36 localhost slon[1867]: [274-2] could not receive data
from server: Connection timed out
Jun 2 13:09:36 localhost slon[1867]: [275-1] 2009-06-02 13:09:36 PDT
WARN remoteWorkerThread_1: data copy for set 1 failed - sleep 30 seconds
Jun 2 13:09:36 localhost postgres[1880]: [26-1] NOTICE: there is no
transaction in progress
Jun 2 13:10:06 localhost slon[1867]: [276-1] 2009-06-02 13:10:06 PDT
DEBUG1 copy_set 1
Jun 2 13:10:06 localhost slon[1867]: [277-1] 2009-06-02 13:10:06 PDT
DEBUG1 remoteWorkerThread_1: connected to provider DB
Jun 2 13:10:09 localhost slon[1867]: [278-1] 2009-06-02 13:10:09 PDT
ERROR remoteWorkerThread_1: "select
"_engage_cluster".setAddSequence_int(1, 4,
Jun 2 13:10:09 localhost slon[1867]: [278-2]
'"public"."tracking_sequence"', 'public.tracking_sequence sequence')"
PGRES_FATAL_ERROR ERROR: Slony-I: setAddSequence_int():
Jun 2 13:10:09 localhost slon[1867]: [278-3] sequence ID 4 has already
been assigned
Jun 2 13:10:09 localhost slon[1867]: [279-1] 2009-06-02 13:10:09 PDT
WARN remoteWorkerThread_1: data copy for set 1 failed - sleep 60 seconds
The DB in question is 144GB and it's being replicated over a relatively
slow link. It seems to do about 1GB/hr, but never gets past 10GB. It
always dies at that same point.
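Since what actually kills it is "could not receive data from server:
Connection timed out", I'm starting to suspect that something between
the two sites is dropping the long-lived connection. One thing worth
trying (just a guess on my part, and the values below are arbitrary)
is enabling TCP keepalives in postgresql.conf on the provider:

    # Probe idle connections so a stateful firewall/NAT entry
    # doesn't silently expire mid copy_set.
    tcp_keepalives_idle = 60        # seconds of idle before first probe
    tcp_keepalives_interval = 10    # seconds between probes
    tcp_keepalives_count = 6        # unanswered probes before dropping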
--
Jeff Frost, Owner <jeff at frostconsultingllc.com>
Frost Consulting, LLC http://www.frostconsultingllc.com/
Phone: 916-647-6411 FAX: 916-405-4032