Fri May 4 14:46:26 PDT 2012
- Previous message: [Slony1-general] Logship files printing incorrectly
- Next message: [Slony1-general] Logship files printing incorrectly
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Wed, May 2, 2012 at 2:39 PM, Steve Singer <ssinger at ca.afilias.info> wrote:

> Are any of the above possible:
>
> 1. You had multiple slon daemons writing to the same log archive directory
> (maybe for different clusters?)

We have several clusters writing to a directory, but there's a separate
directory for each cluster. For example:

/home/log_ship
/home/log_ship/cluster1/new_logfiles
/home/log_ship/cluster2/new_logfiles
/home/log_ship/cluster3/new_logfiles
...etc...

We don't have two daemons writing to the same directory.

> 2. The mechanism you used for copying the .sql files could have caused
> processes to try to write to the same file on the destination machine

I'm fairly certain this is not the case. The files that I sent you were
directly from the origin machine, not from the destination machine. Our
scheme is like this:

Node1 is origin
Node2 is subscriber, with -a mode, writing files to /home/log_ship/cluster1/new_logfiles
Cronjob moves files from /home/log_ship/cluster1/new_logfiles to /home/log_ship/cluster1/log_staging (we filter out the *.sql.tmp files so that we can let them finish writing before we move them)
RemoteNode makes rsync connection to Node2 and copies the files from Node2/home/log_ship/cluster1/log_staging to its local directory
Log files are replayed

> If the answer to both of those is no then maybe there is a bug in how
> archive file numbers are assigned in remote_worker.c:archive_open.
> We don't YET see any obvious faults with this logic but if this logic
> somehow assigned 2 slon worker threads the same id then you could get a
> file like you sent us.

As I look at the files you sent me, I only see differences between the
third (Node X, Event XXXXXX) and seventh
(archiveTracking_offline(xxx,'xxxx-xx-xx xx:xx:xx')) lines. I noticed that
the node number can vary per file, but only one daemon has the -a option enabled.
Not sure why the node number changes--shouldn't it always correspond to the
node number of the daemon with the -a option turned on?

Aside from that, I tried poking around the sl_* tables that I had dumped,
but didn't really find anything. One thing is certain, though--a given DML
statement shows up in sl_log_x only once, even though it shows up several
times in the various logship files that are generated. I can't seem to find
the corresponding sl_event row, so I'm not sure if there might be anything
in that direction, in terms of duplicated events.

--Richard
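
The staging step described in the scheme above (move finished files out of new_logfiles, leave *.sql.tmp files alone) could be sketched roughly as follows. This is a hypothetical illustration only, not the actual cron job: the function name stage_completed_logs is made up, and the rule that only files ending in .sql are moved is an assumption.

```python
import shutil
from pathlib import Path


def stage_completed_logs(src_dir, dst_dir):
    """Move finished .sql archive files from src_dir to dst_dir.

    Files ending in .sql.tmp are skipped, on the assumption that they
    are still being written. Returns the names of the files moved.
    """
    src = Path(src_dir)
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)

    moved = []
    # Sort by name so files are staged in a stable order.
    for path in sorted(src.iterdir()):
        if not path.is_file():
            continue
        if path.name.endswith(".sql.tmp"):
            continue  # still being written; pick it up on a later run
        if not path.name.endswith(".sql"):
            continue  # assumption: only .sql files are archive files
        shutil.move(str(path), str(dst / path.name))
        moved.append(path.name)
    return moved
```

With the directories from the example, a call would look like stage_completed_logs("/home/log_ship/cluster1/new_logfiles", "/home/log_ship/cluster1/log_staging"). Skipping on the .tmp suffix only helps if a file appears under its final .sql name once it is complete, which is what the filtering described above relies on.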