Fri Oct 15 01:45:47 PDT 2004
Thank you! That is very helpful. I had seen the MOVE SET command, but
hadn't connected it with what we need to do. We still need slonik
FAILOVER for catastrophes, but MOVE SET seems perfect for some of the
maintenance use cases we have to support. Thanks again!

- DAP

>-----Original Message-----
>From: Christopher Browne [mailto:cbbrowne@ca.afilias.info]
>Sent: Thursday, October 14, 2004 6:18 PM
>To: David Parker
>Cc: Slony Mailing List
>Subject: Re: [Slony1-general] recovery
>
>David Parker wrote:
>| Given 2 nodes A and B in a Slony replication cluster, with A as the
>| provider for a single set, if at some point I need to fail over to
>| B, what are my options for eventually returning A to provider
>| status? Once we've failed over to B, we eventually want to get A
>| back to being the main guy.
>
>Are you thinking of "FAILOVER", or just moving the set's "master"
>location?
>
>The Slonik FAILOVER command essentially discards the failed node, at
>which point the only way to add it back involves rebuilding it from
>scratch.
>
>If you can instead use MOVE SET to move the "master" provider to the
>new node, that is a much more benign event. The "former master" can
>continue to participate in replication, and that is very much
>preferable.
>
>| It looks like the first step is to set up a fresh schema in A, and
>| subscribe it to B. I thought at first I would want to populate it
>| first with a pg_dump from B, but I see that the ENABLE_SUBSCRIPTION
>| logic truncates the target table before copying to it.
>
>Slony-I does this itself so that it can be certain that nothing gets
>in edgewise to corrupt the data.
>
>| Will simply subscribing the new A to B result in all existing B
>| data getting copied over? If so, how does this compare
>| performance-wise with using pg_dump? (I realize this comparison is
>| apples vs. oranges, and really a comparison between table copy and
>| pg_dump, hence not Slony-specific, but if anybody has any
>| experience with this I'd appreciate hearing about it.)
>
>The "delete from table; copy table from stdin;" approach is
>definitely not nearly as fast as using pg_dump. It is conspicuously
>slower because it populates the indices in parallel, as entries are
>added to the table, which is _way_ slower than creating them at the
>end, in one fell swoop.
>
>| Assuming I'm able to fully load A with data from B, is it possible
>| to switch them such that A becomes the provider and B goes back to
>| receive mode? This is complicated if updates are continuing to B,
>| of course, but I wonder if it is possible at all.
>
>Part of the point of the "complications" in Slony-I is to allow you
>to do exactly what you are talking about.
>
>The Slonik command MOVE SET is intended to do exactly what you are
>asking about. The nodes need to be locked for a little while, in
>order to allow A to catch up with B; once A has caught up, it can
>take over as "master", and B will become the "slave."
>
>We have done this on one of our production systems; we found that
>there was a problem with a "master" server, and therefore used MOVE
>SET to switch over to a secondary server, giving us the opportunity
>to take the former "main" node out of service for maintenance. We
>haven't switched back, but we could have, using another MOVE SET
>operation.
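[To make the above concrete, here is a minimal Slonik sketch of the
lock / catch-up / move sequence described in this thread. The cluster
name, node numbers, database names, and conninfo strings are
hypothetical placeholders, not taken from the original messages.]

    # Sketch only: hypothetical two-node cluster, node 1 currently
    # the origin of set 1, node 2 taking over.
    cluster name = mycluster;

    node 1 admin conninfo = 'dbname=appdb host=nodeA user=slony';
    node 2 admin conninfo = 'dbname=appdb host=nodeB user=slony';

    # Block updates to set 1 on its current origin (node 1).
    lock set (id = 1, origin = 1);

    # Wait until node 2 has confirmed every event node 1 has generated.
    wait for event (origin = 1, confirmed = 2);

    # Hand the origin ("master") role for set 1 to node 2; node 1
    # remains in the cluster as an ordinary subscriber.
    move set (id = 1, old origin = 1, new origin = 2);

[Switching back later is the same script with the two node roles
swapped. FAILOVER, by contrast, is a single command, roughly
"failover (id = 1, backup node = 2);", but it abandons node 1, which
must then be rebuilt from scratch.]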
>| Finally, what query would I execute against 2 nodes to know if
>| their data is in sync (at that moment)?
>
>Some of our systems are sufficiently busy that the result of such a
>query wouldn't be accurate long enough for someone to be able to
>read it :-).
>
>I have set up some replication tests to see how close nodes are;
>they are essentially based on looking at an application table for
>the latest activity from the _application's_ perspective. Since my
>application differs from your application, my queries would be
>useless to you :-(.
>
>There's a new view in 1.0.3 called sl_status that may be of some
>value...
>
>-- ----------------------------------------------------------------------
>-- VIEW sl_status
>--
>--    This view shows the local node's last event sequence number
>--    and how far all remote nodes have processed events.
>-- ----------------------------------------------------------------------
>create or replace view @NAMESPACE@.sl_status as select
>    E.ev_origin as st_origin,
>    C.con_received as st_received,
>    E.ev_seqno as st_last_event,
>    E.ev_timestamp as st_last_event_ts,
>    C.con_seqno as st_last_received,
>    C.con_timestamp as st_last_received_ts,
>    CE.ev_timestamp as st_last_received_event_ts,
>    E.ev_seqno - C.con_seqno as st_lag_num_events,
>    current_timestamp - CE.ev_timestamp as st_lag_time
>    from @NAMESPACE@.sl_event E, @NAMESPACE@.sl_confirm C,
>         @NAMESPACE@.sl_event CE
>    where E.ev_origin = C.con_origin
>      and CE.ev_origin = E.ev_origin
>      and CE.ev_seqno = C.con_seqno
>      and (E.ev_origin, E.ev_seqno) in
>          (select ev_origin, max(ev_seqno)
>             from @NAMESPACE@.sl_event
>             where ev_origin = @NAMESPACE@.getLocalNodeId('_@CLUSTERNAME@')
>             group by 1
>          )
>      and (C.con_origin, C.con_received, C.con_seqno) in
>          (select con_origin, con_received, max(con_seqno)
>             from @NAMESPACE@.sl_confirm
>             where con_origin = @NAMESPACE@.getLocalNodeId('_@CLUSTERNAME@')
>             group by 1, 2
>          );
>
>comment on view @NAMESPACE@.sl_status is 'View showing how far
>behind remote nodes are.
>';
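[A hedged usage sketch for the view above: with a cluster named, say,
"mycluster", the Slony schema would be "_mycluster", and you could
watch subscriber lag from the set's origin node along these lines.
All names here are illustrative, not from the thread.]

    -- Run on the origin: one row per remote node, showing how many
    -- events it is behind and how old its last confirmation is.
    select st_received as remote_node,
           st_lag_num_events,
           st_lag_time
      from _mycluster.sl_status
     order by st_received;

[When st_lag_num_events reaches 0, the remote node has confirmed
everything the local node has generated, which is about as close to
"in sync at that moment" as the Slony-I catalogs can report.]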