Fri Nov 30 06:39:34 PST 2012
- Previous message: [Slony1-general] data copy for set 1 failed 3 times - sleep 60 seconds
- Next message: [Slony1-general] data copy for set 1 failed 3 times - sleep 60 seconds
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On 12-11-29 11:23 PM, Tory M Blue wrote:
>
> Well this is frustrating. So I was successful in replicating a smaller
> data set without issue. Once I try to replicate large amounts of data ,
> it seems to fail and restart, at what it feels the end of the biggest
> table of each set.
>
> 2012-11-29 19:54:06 PST CONFIG remoteWorkerThread_1: 16341.883 seconds to copy table "tracking"."spotimpressions"
> 2012-11-29 19:54:06 PST CONFIG remoteWorkerThread_1: copy table "tracking"."impressions"
> 2012-11-29 19:54:06 PST CONFIG remoteWorkerThread_1: Begin COPY of table "tracking"."impressions"
> 2012-11-29 19:54:06 PST ERROR remoteWorkerThread_1: "select "_admissioncls".copyFields(19);"
> 2012-11-29 19:54:06 PST WARN remoteWorkerThread_1: data copy for set 2 failed 1 times - sleep 15 seconds
>
> This large table ran for 4+ hours and the minute it starts with the very
> next table, it "fails", Identical behavior when doing set 1 which has a
> large table
>
> 1235574-2012-11-29 12:22:12 PST CONFIG remoteWorkerThread_1: Begin COPY of table "cls"."customers"
> 1235665-2012-11-29 12:22:12 PST ERROR remoteWorkerThread_1: "select "_admissioncls".copyFields(8);"
> 1235759:2012-11-29 12:22:12 PST WARN remoteWorkerThread_1: data copy for set 1 failed 1 times - sleep 15 seconds
>
> Followed sometime later by this
>
> 2012-11-29 12:22:28 PST DEBUG2 remoteWorkerThread_2: forward confirm 3,5001168772 received by 4
> 2012-11-29 12:22:28 PST INFO copy_set 1 - omit=f - bool=0
> 2012-11-29 12:22:28 PST INFO omit is FALSE
>
> So what's going on, it appears to have made it through the heavy
> lifting, but it immediately goes to fail as it starts a much smaller
> table. Why does it wait to make it through the largest table in the set,
> before it says "bahh just kidding".
>
> AHHH interesting, yet again at the moment of "fail" a log switchover is
> starting, this is identical to each and every failure. Why is a log
> switch appearing right before every failure?!
>
> Can I disable this for a test? , disable logswitch

You can disable/alter the log switch by making the cleanup interval in the slon on the master very big, longer than your tests/subscriptions take to run (see cleanup_interval (interval) on http://www.slony.info/documentation/2.1/slon-config-interval.html). This is for testing purposes; I'm not recommending this as a solution. I also doubt this is the cause of your problem (but let us know if it does turn out to be that, because it means something is wrong, somewhere).

You never did send me the output of:

select "_admissioncls".copyFields(19);

or the equivalent from your master. You also never sent any information about the schema of the problem table.

You might want to turn query logging on for the origin/provider node (at least for the slony user). This will tell us exactly what SQL is being executed when the error occurs.

Possibilities include:

1) copyFields() is still returning something bad, i.e. ')' for this table, so the SQL that later gets executed is
   COPY ()) FROM "tracking"."impressions";
   or some other bad SQL in the copy.

2) The connection is actually aborting during the copy for connection-related reasons. In the past people have reported issues where their firewall resets connections after x minutes. We've also had issues in the past with openssl where some limit was reached and the connection was killed due to an openssl issue.

3) Something else

>
> 2012-11-29 19:54:11 PST admissionclsdb postgres [local] NOTICE: Slony-I: Logswitch to sl_log_2 initiated
> 2012-11-29 19:54:11 PST admissionclsdb postgres [local] CONTEXT: SQL statement "SELECT "_admissioncls".logswitch_start()"
> PL/pgSQL function "cleanupevent" line 96 at PERFORM
>
> 2012-11-29 12:22:13 PST admissionclsdb postgres [local] NOTICE: Slony-I: Logswitch to sl_log_2 initiated
> 2012-11-29 12:22:13 PST admissionclsdb postgres [local] CONTEXT: SQL statement "SELECT "_admissioncls".logswitch_start()"
> PL/pgSQL function "cleanupevent" line 96 at PERFORM
>
> "
>
> Thanks
> Tory
>
> _______________________________________________
> Slony1-general mailing list
> Slony1-general at lists.slony.info
> http://lists.slony.info/mailman/listinfo/slony1-general
>
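
For reference, the two diagnostics asked for in the reply (the copyFields() output and per-user query logging) could be gathered on the origin/provider node roughly as follows. This is only a sketch under assumptions: the role name slony is a placeholder for whatever role slon actually connects as, the statements need to be run as a superuser, and the table id 19 is taken from the error line in the log above.

```sql
-- Assumption: "slony" is the role slon uses to connect; substitute the real one.
-- Log every statement issued by that role. A per-role setting applies to
-- new sessions only, so slon has to reconnect before it takes effect.
ALTER ROLE slony SET log_statement = 'all';

-- Show the column list Slony-I builds for the failing table
-- (id 19 is the value from the ERROR line in the slon log).
SELECT "_admissioncls".copyFields(19);

-- Once the failing statement has been captured, remove the per-role override.
ALTER ROLE slony RESET log_statement;
```

With logging enabled, the statement that appears in the PostgreSQL log at the same timestamp as the slon ERROR line should show whether the generated COPY is malformed (possibility 1) or whether the session simply ends (possibility 2).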