Jeff Frost jeff at frostconsultingllc.com
Tue Sep 18 13:49:28 PDT 2007
Hi guys, I've got an interesting situation on a three-node Slony-I 1.2.10 
cluster.  Both slaves get their data directly from the master.  Everything 
had been running well until a few days ago.  Now every time we try to add a 
new table to the cluster, we end up with the following error:

2007-09-18 15:56:06 EDT ERROR  remoteWorkerThread_1: "select 
"_nerdcluster".ddlScript_prepare_int(1, -1); " PGRES_FATAL_ERROR ERROR: 
Slony-I: alterTableRestore(): Table "public"."carts" is not in altered state
CONTEXT:  SQL statement "SELECT  "_nerdcluster".alterTableRestore( $1 )"
PL/pgSQL function "ddlscript_prepare_int" line 46 at perform

It looks like the problem is being caused by a deadlock:

15:27:54 sql1 slon[12252]: [39-1] 2007-09-18 15:27:54 EDT ERROR 
remoteWorkerThread_1: "select "_nerdcluster".ddlScript_complete_int(1, -1); " 
PGRES_FATAL_ERROR
Sep 18 15:27:54 sql1 slon[12252]: [39-2]  ERROR:  deadlock detected
Sep 18 15:27:54 sql1 slon[12252]: [39-3] DETAIL:  Process 12263 waits for 
AccessExclusiveLock on relation 121589880 of database 121589046; blocked by 
process 12096.
Sep 18 15:27:54 sql1 slon[12252]: [39-4] Process 12096 waits for 
AccessShareLock on relation 121589817 of database 121589046; blocked by 
process 12263.
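
To make the wait cycle in the DETAIL lines explicit, here is a minimal Python 
sketch that parses them and follows the waiter-to-blocker edges.  It is not 
part of Slony-I or PostgreSQL; the names parse_waits and find_cycle are 
hypothetical helpers, and the log text is the excerpt above with the wrapped 
lines rejoined.

```python
import re

# DETAIL text from the deadlock report above, wrapped lines rejoined.
LOG = (
    "DETAIL:  Process 12263 waits for AccessExclusiveLock on relation "
    "121589880 of database 121589046; blocked by process 12096.\n"
    "Process 12096 waits for AccessShareLock on relation 121589817 of "
    "database 121589046; blocked by process 12263.\n"
)

# One match per "Process N waits for <mode> on relation R ...; blocked by process M."
WAIT_RE = re.compile(
    r"Process (\d+) waits for (\w+) on relation (\d+) of database (\d+); "
    r"blocked by process (\d+)\."
)


def parse_waits(text):
    """Return one dict per lock wait found in the log text."""
    waits = []
    for m in WAIT_RE.finditer(text):
        waiter, mode, relation, database, blocker = m.groups()
        waits.append({
            "waiter": int(waiter),
            "mode": mode,
            "relation": int(relation),
            "database": int(database),
            "blocker": int(blocker),
        })
    return waits


def find_cycle(waits):
    """Follow waiter -> blocker edges; return the pids forming a cycle, or []."""
    edges = {w["waiter"]: w["blocker"] for w in waits}
    for start in edges:
        path = [start]
        node = edges.get(start)
        while node is not None and node not in path:
            path.append(node)
            node = edges.get(node)
        if node == start:
            return path
    return []


waits = parse_waits(LOG)
for w in waits:
    print("pid %(waiter)d wants %(mode)s on relation %(relation)d, "
          "blocked by pid %(blocker)d" % w)
print("cycle:", find_cycle(waits))
```

The two relation numbers are OIDs, so a lookup against pg_class in the 
affected database (for example, select relname from pg_class where oid in 
(121589880, 121589817);) should show which tables the two processes are 
fighting over.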

So, my theory is that the execute script restores the tables to their 
normal state, fails to get all the locks it wants, and bails out without 
putting them back into their previously altered state, thus breaking 
replication.

So, is there a reasonable way to fix this without dropping/resubscribing the 
node?

---
Jeff Frost, Owner 	 <jeff at frostconsultingllc.com>
Frost Consulting, LLC 	http://www.frostconsultingllc.com/
Phone: 650-780-7908	FAX: 650-649-1954

