Tue May 25 12:09:26 PDT 2010
- Previous message: [Slony1-general] Table not replicating, but no errors reported by slony.
- Next message: [Slony1-general] Table not replicating, but no errors reported by slony.
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
I've run the slonik script that did the sync, and it finished with a result of 0. The number of rows remains unchanged, with the slave still holding fewer rows than the master. I've generated a list of rows that the master has that the slave does not, and we're investigating whether it was something that someone had done, or something else.

- Brian Fehrle

Brian Fehrle wrote:
> Steve Singer wrote:
>
>> Brian Fehrle wrote:
>>
>>> Hi all,
>>>
>> A few things I would look at
>>
>> Look at sl_event and sl_confirm. Are there events in sl_event that
>> are larger than what shows up as being confirmed in sl_confirm? When
>> were these events generated? If so look at the next unconfirmed
>> event in sl_event and see what type of event it is.
>>
> All entries between sl_event and sl_confirm match exactly (each
> con_seqno from sl_confirm matches an ev_seqno from sl_event)
>
>> You can look at sl_log_1 and sl_log_2, you should see your missing
>> rows, in particular the ev_snapshot from sl_event of the last
>> unconfirmed SYNC event should give you the range of rows (log_txid) of
>> some of the unreplicated rows. The set of all of the unconfirmed sync
>> events should give you all of the rows in sl_log_1 and sl_log_2 that
>> need to still be sent.
>>
> On both the master and the slave, there are zero entries in either
> sl_log_1 or sl_log_2.
>
>> You can also try a slonik script like
>>
>> sync(id=1);
>> wait for event(origin=1, confirmed=all, wait on=1);
>>
>> This generates a sync event and waits until it gets replicated. If
>> slonik exits on success and you're still missing those rows then
>> something strange is going on (I would start to wonder if you did
>> something like an execute script on your replica that deleted rows
>> just from the replica)
>>
> I was wondering the same thing, however doesn't the slave node refuse
> updates/inserts/deletes via a locking system? There are quite a few
> people who use the databases and I can't account for all actions by
> everyone. I will give this sync command a try in a bit, I need to wait
> on some things before I can give it a try.
>
> Another thing that has come to mind, when we first added this table to
> the replication set, we had a few problems with some of our scripts which
> resulted in a daemon attempting to start the slon daemons even if they
> were already running. Normally the daemons are smart enough to kill
> themselves, however since this was going on during the initial
> propagation of the data to the slave, it may have done something
> unintentional.
>
> - Brian
>
>> Steve
>>
>>> I'm having some trouble determining why replication isn't
>>> happening on a replication table. I have a two node slony cluster. I
>>> have a table in the slony replication set that has 72332 records on
>>> the master, however it has 71225 records on the slave. It's been this
>>> way for a few hours at least (could be more as that is when we first
>>> noticed it). This table was added to the replication set several
>>> weeks ago, so it's not stalled mid-publish. The slon daemons are
>>> running, and the logs for the daemons report no abnormalities. I've
>>> restarted the slon daemons to see if it would clear anything up, but
>>> it remains the same.
>>>
>>> Looking at sl_status, the lag events never go above 1, and the lag
>>> time never goes above a couple of minutes.
>>>
>>> Best reasons I can think of are, either something is causing the
>>> replication on this particular table to be on "hold" and not update
>>> the remaining rows on the slave, while not alerting me via the slon
>>> logs. Or something went screwy and replication for that table is out
>>> of sync and I need to drop the table from the set and add it back
>>> again, let it sync up (however this solution is not ideal.)
>>>
>>> Any tips of places I should look to see what may be going on?
>>>
>>> Thanks in advance.
>>>
>>> - Brian Fehrle
>>>
>>> Data that may be important:
>>>
>>> Commands that start the slon daemons:
>>> /usr/local/pgsql/bin/slon -p /usr/local/pgsql/log/slon.node1.pid -s
>>> 60000 -t 300000 SLONY "dbname=$MASTERDBNAME port=$MASTERPORT
>>> host=$MASTERHOST user=$REPUSER" >
>>> /usr/local/pgsql/log/slon.node1.log 2>&1 &
>>> /usr/local/pgsql/bin/slon -p /usr/local/pgsql/log/slon.node2.pid -s
>>> 60000 -a /usr/local/pgsql/slon_logs -t 300000 -x "log_parsing_script"
>>> SLONY "dbname=$SLAVEDBNAME port=$SLAVEPORT host=$SLAVEHOST
>>> user=$REPUSER" > /usr/local/pgsql/log/slon.node2.log 2>&1 &
>>>
>>> slony version 1.2.20
>>> master PostgreSQL version 8.4.1
>>> slave PostgreSQL version 8.4.2
>>>
>>> _______________________________________________
>>> Slony1-general mailing list
>>> Slony1-general at lists.slony.info
>>> http://lists.slony.info/mailman/listinfo/slony1-general
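
Editorial sketch: the top of this message mentions generating a list of rows that the master has and the slave does not. The thread does not say how that list was produced. One possible way to do the comparison, assuming the table has a single integer primary key and that the key values have already been fetched from each node (for example by selecting the key column on both sides), is a plain set difference. The function name `missing_on_slave` and the sample values below are hypothetical and are not taken from the thread.

```python
from typing import Iterable, List


def missing_on_slave(master_keys: Iterable[int],
                     slave_keys: Iterable[int]) -> List[int]:
    """Return the keys found on the master but not on the slave, sorted."""
    # A set gives constant-time membership checks for the slave side.
    slave_set = set(slave_keys)
    # The set difference drops every key the slave already has.
    return sorted(set(master_keys) - slave_set)


if __name__ == "__main__":
    # Hypothetical key values standing in for the real table contents.
    master = [1, 2, 3, 4, 5]
    slave = [1, 2, 4]
    print(missing_on_slave(master, slave))  # prints [3, 5]
```

Comparing the resulting keys against application logs or timestamps on the master may help tell rows that were never copied apart from rows that were deleted on the slave afterwards; that interpretation step is outside what this sketch covers.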