Steve Singer ssinger at ca.afilias.info
Mon Mar 22 19:28:36 PDT 2010
Hernan Saltiel wrote:
> Hi!
> I configured a slony cluster between two nodes: the master, srvdb01, and 
> a slave, srvdb02. The database is "dbprod".
> Both nodes are CentOS 64 bits, with these postgres packages installed:
> create set (id = 1, origin = 1,
> comment = 'Base Productiva');
> 
> (All the sets are here; there are more than 120...)
> 

You never mentioned where you added tables to your sets.  Could you have 
120 replication sets with 0 tables in each?
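For reference, tables are attached to a set with slonik's "set add table" command, issued before "subscribe set". Here is a minimal sketch that only prints the commands so you can review them before piping them to slonik. The table names are hypothetical placeholders, and I am assuming every table has a primary key and that the usual "cluster name" / "node ... admin conninfo" preamble is prepended:

```shell
#!/bin/sh
# Sketch: print "set add table" commands for one set.
# Usage: gen_add_tables <set id> <schema.table> ...
# The table names used below are hypothetical placeholders.
gen_add_tables() {
    set_id=$1; shift
    tab_id=1
    for tbl in "$@"; do
        echo "set add table (set id = $set_id, origin = 1, id = $tab_id, fully qualified name = '$tbl');"
        tab_id=$((tab_id + 1))
    done
}

gen_add_tables 1 public.customers public.orders
```

As far as I know, table ids need to be unique across the whole cluster and not just within one set, so with 120 sets the counter should not restart at 1 for each set.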

Does
SELECT * FROM _dbprod_cluster.sl_table;

show you anything interesting? (is it empty, meaning your sets seem to 
have no tables?)
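With more than 120 sets it may be easier to count tables per set than to read through sl_table. A sketch of such a query, assuming the cluster schema is _dbprod_cluster (from your cluster name) and the standard sl_set and sl_table columns; the function just prints the SQL, and the psql invocation in the comment is a guess at your connection settings:

```shell
#!/bin/sh
# Sketch: print a query that counts tables per replication set.
# CLUSTER comes from the original post; adjust if yours differs.
CLUSTER=dbprod_cluster

tables_per_set_sql() {
    cat <<_EOF_
SELECT s.set_id, s.set_comment, count(t.tab_id) AS num_tables
  FROM _$CLUSTER.sl_set s
  LEFT JOIN _$CLUSTER.sl_table t ON t.tab_set = s.set_id
 GROUP BY s.set_id, s.set_comment
 ORDER BY s.set_id;
_EOF_
}

tables_per_set_sql
# To run it (assumed connection settings):
#   tables_per_set_sql | psql -U postgres dbprod
```

Any set that comes back with num_tables = 0 has nothing to replicate.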

Did you also issue 120 subscribe set requests, or did you only subscribe 
the first one? If you tried subscribing all 120 at once, you might want 
to tear down the slony cluster and try again, subscribing only the first 
set and waiting for it to finish before moving on. It is possible that 
subscribing to multiple sets concurrently triggers race conditions.
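One way to serialize the subscriptions is to follow each "subscribe set" with a "wait for event". A rough sketch that prints the slonik commands for a range of set ids (node ids 1 and 2 are taken from your script; the range 1 to 3 is only an example, and the usual preamble still has to be prepended):

```shell
#!/bin/sh
# Sketch: print slonik commands that subscribe sets one at a time,
# waiting for node 2 to confirm each event before the next is issued.
# Usage: gen_subscribe <first set id> <last set id>
gen_subscribe() {
    id=$1
    last=$2
    while [ "$id" -le "$last" ]; do
        echo "subscribe set (id = $id, provider = 1, receiver = 2, forward = yes);"
        echo "wait for event (origin = 1, confirmed = 2, wait on = 1);"
        id=$((id + 1))
    done
}

gen_subscribe 1 3
```

Note that the wait only covers confirmation of the event itself. It may not cover the initial data copy, so I would still watch the slon log on node 2 and make sure the copy for one set has completed before starting the next.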

You should also check to see if there are any locks being held on slony 
tables.
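A query against pg_locks, restricted to the Slony schema, is one way to look for such locks. This is a sketch assuming the schema name _dbprod_cluster; run the printed SQL while connected to dbprod on each node:

```shell
#!/bin/sh
# Sketch: print a query listing locks on tables in the Slony schema.
# CLUSTER comes from the original post; adjust if yours differs.
CLUSTER=dbprod_cluster

slony_locks_sql() {
    cat <<_EOF_
SELECT c.relname, l.mode, l.granted, l.pid
  FROM pg_locks l
  JOIN pg_class c ON c.oid = l.relation
  JOIN pg_namespace n ON n.oid = c.relnamespace
 WHERE n.nspname = '_$CLUSTER'
   AND l.database = (SELECT oid FROM pg_database
                      WHERE datname = current_database())
 ORDER BY c.relname, l.granted;
_EOF_
}

slony_locks_sql
```

Rows with granted = f are sessions waiting on a lock; the pid column can be looked up in pg_stat_activity to see who is involved.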

> store node (id = 2, comment = 'Node 2');
> store path (server = 1, client = 2,
> conninfo = 'dbname=$DB1 host=$H1 user=$U password=$P');
> store path (server = 2, client = 1,
> conninfo = 'dbname=$DB2 host=$H2 user=$U password=$P');
> store listen (origin = 1, provider = 1, receiver = 2);
> store listen (origin = 2, provider = 2, receiver = 1);
> 
> Then I executed the script.
> 
> On the master and slave nodes, I ran:
> nohup slon dbprod_cluster "dbname=dbprod user=postgres" &
> 
> After that, I created the subscribe.sh script on the slave node:
> 
> #!/bin/sh
> 
> CLUSTER=dbprod_cluster
> DB1=dbprod
> DB2=dbprod
> H1=srvdb01
> H2=srvdb02
> U=postgres
> P=Secreta01
> 
> slonik <<_EOF_
> 
> cluster name = $CLUSTER;
> 
> node 1 admin conninfo = 'dbname=$DB1 host=$H1 user=$U password=$P';
> node 2 admin conninfo = 'dbname=$DB2 host=$H2 user=$U password=$P';
> 
> subscribe set (id = 1, provider = 1, receiver = 2, forward = yes);
> 
> _EOF_
> 
> I ran that script, and saw several SYNC, LISTEN and UNLISTEN messages in 
> the nohup.out log file of the slon process.
> After two days of seeing those messages without a single row being 
> replicated, I'm concerned. Is this normal because Slony needs to do 
> something before it starts replicating, or is there some way to tell 
> whether something is going wrong?
> 
> Here are some rows of the master nohup.out file:
> 
> DEBUG2 remoteWorkerThread_2: SYNC 30755 processing
> DEBUG2 remoteWorkerThread_2: no sets need syncing for this event
> DEBUG2 syncThread: new sl_action_seq 11392 - SYNC 16232
> DEBUG2 remoteListenThread_2: queue event 2,30756 SYNC
> DEBUG2 remoteListenThread_2: queue event 2,30757 SYNC
> DEBUG2 remoteWorkerThread_2: Received event 2,30756 SYNC
> DEBUG2 calc sync size - last time: 1 last length: 8611 ideal: 6 proposed size: 3
> DEBUG2 remoteWorkerThread_2: SYNC 30757 processing
> DEBUG2 remoteWorkerThread_2: no sets need syncing for this event
> DEBUG2 localListenThread: Received event 1,16232 SYNC
> DEBUG2 syncThread: new sl_action_seq 11392 - SYNC 16233
> DEBUG2 remoteListenThread_2: queue event 2,30758 SYNC
> DEBUG2 remoteWorkerThread_2: Received event 2,30758 SYNC
> DEBUG2 calc sync size - last time: 2 last length: 8525 ideal: 14 proposed size: 5
> DEBUG2 remoteWorkerThread_2: SYNC 30758 processing
> DEBUG2 remoteWorkerThread_2: no sets need syncing for this event
> DEBUG2 remoteListenThread_2: queue event 2,30759 SYNC
> DEBUG2 remoteWorkerThread_2: Received event 2,30759 SYNC
> DEBUG2 calc sync size - last time: 1 last length: 2389 ideal: 25 proposed size: 3
> DEBUG2 remoteWorkerThread_2: SYNC 30759 processing
> DEBUG2 remoteWorkerThread_2: no sets need syncing for this event
> DEBUG2 localListenThread: Received event 1,16233 SYNC
> DEBUG2 syncThread: new sl_action_seq 11392 - SYNC 16234
> DEBUG2 localListenThread: Received event 1,16234 SYNC
> DEBUG2 remoteListenThread_2: queue event 2,30760 SYNC
> DEBUG2 remoteListenThread_2: queue event 2,30761 SYNC
> DEBUG2 remoteWorkerThread_2: Received event 2,30760 SYNC
> DEBUG2 calc sync size - last time: 1 last length: 8570 ideal: 7 proposed size: 3
> DEBUG2 remoteWorkerThread_2: SYNC 30761 processing
> DEBUG2 remoteWorkerThread_2: no sets need syncing for this event
> DEBUG2 syncThread: new sl_action_seq 11392 - SYNC 16235
> DEBUG2 remoteListenThread_2: queue event 2,30762 SYNC
> DEBUG2 remoteWorkerThread_2: Received event 2,30762 SYNC
> DEBUG2 calc sync size - last time: 2 last length: 8519 ideal: 14 proposed size: 5
> DEBUG2 remoteWorkerThread_2: SYNC 30762 processing
> DEBUG2 remoteWorkerThread_2: no sets need syncing for this event
> DEBUG2 remoteListenThread_2: queue event 2,30763 SYNC
> DEBUG2 remoteWorkerThread_2: Received event 2,30763 SYNC
> DEBUG2 calc sync size - last time: 1 last length: 2350 ideal: 25 proposed size: 3
> DEBUG2 remoteWorkerThread_2: SYNC 30763 processing
> DEBUG2 remoteWorkerThread_2: no sets need syncing for this event
> DEBUG2 localListenThread: Received event 1,16235 SYNC
> 
> 
> ...and here are some from the slave:
> 
> DEBUG2 localListenThread: Received event 2,30773 SYNC
> DEBUG2 remoteListenThread_1: LISTEN
> DEBUG2 syncThread: new sl_action_seq 1 - SYNC 30774
> DEBUG2 localListenThread: Received event 2,30774 SYNC
> DEBUG2 remoteListenThread_1: queue event 1,16241 SYNC
> DEBUG2 remoteListenThread_1: UNLISTEN
> DEBUG2 syncThread: new sl_action_seq 1 - SYNC 30775
> DEBUG2 localListenThread: Received event 2,30775 SYNC
> DEBUG2 remoteListenThread_1: LISTEN
> DEBUG2 syncThread: new sl_action_seq 1 - SYNC 30776
> DEBUG2 localListenThread: Received event 2,30776 SYNC
> DEBUG2 syncThread: new sl_action_seq 1 - SYNC 30777
> DEBUG2 remoteListenThread_1: queue event 1,16242 SYNC
> DEBUG2 remoteListenThread_1: UNLISTEN
> DEBUG2 localListenThread: Received event 2,30777 SYNC
> DEBUG2 remoteListenThread_1: LISTEN
> DEBUG2 syncThread: new sl_action_seq 1 - SYNC 30778
> DEBUG2 localListenThread: Received event 2,30778 SYNC
> DEBUG2 remoteListenThread_1: queue event 1,16243 SYNC
> DEBUG2 remoteListenThread_1: UNLISTEN
> DEBUG2 syncThread: new sl_action_seq 1 - SYNC 30779
> DEBUG2 localListenThread: Received event 2,30779 SYNC
> DEBUG2 remoteListenThread_1: LISTEN
> DEBUG2 syncThread: new sl_action_seq 1 - SYNC 30780
> DEBUG2 localListenThread: Received event 2,30780 SYNC
> DEBUG2 syncThread: new sl_action_seq 1 - SYNC 30781
> DEBUG2 remoteListenThread_1: queue event 1,16244 SYNC
> DEBUG2 remoteListenThread_1: UNLISTEN
> DEBUG2 localListenThread: Received event 2,30781 SYNC
> DEBUG2 remoteListenThread_1: LISTEN
> DEBUG2 syncThread: new sl_action_seq 1 - SYNC 30782
> DEBUG2 localListenThread: Received event 2,30782 SYNC
> DEBUG2 remoteListenThread_1: queue event 1,16245 SYNC
> DEBUG2 remoteListenThread_1: UNLISTEN
> DEBUG2 syncThread: new sl_action_seq 1 - SYNC 30783
> DEBUG2 localListenThread: Received event 2,30783 SYNC
> 
> I ran some queries against the _dbprod_cluster schema, following tips I 
> found on blogs, but I don't really know whether the results indicate 
> that things are working normally or not.
> Here are some of them:
> 
> select count(*) from _dbprod_cluster.sl_log_1;
> 
>  count
> -------
>  11392
> (1 row)
> 
> select count(*) from _dbprod_cluster.sl_log_2;
> 
>  count
> -------
>      0
> (1 row)
> 
> select st_lag_num_events from _dbprod_cluster.sl_status;
> 
>  st_lag_num_events
> -------------------
>              16130
> (1 row)
> 
> Could anybody help me understand what these numbers are telling me?
> Thanks a lot in advance for your help!!!!
> Best regards,
> 
> -- 
> HeCSa
> 
> 


-- 
Steve Singer
Afilias Canada
Data Services Developer
416-673-1142

