Wed Sep 26 15:41:52 PDT 2007
- Previous message: [Slony1-bugs] Slony 1.2.10: Deadlock on slave during execute script
- Next message: [Slony1-bugs] Slony 1.2.10: Deadlock on slave during execute script
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Wed, 26 Sep 2007, Christopher Browne wrote: > Jeff Frost wrote: >> On Wed, 26 Sep 2007, Christopher Browne wrote: >> >>> Jeff Frost <jeff at frostconsultingllc.com> writes: >>>> For extra background on this bug, see the thread here: >>>> http://lists.slony.info/pipermail/slony1-general/2007-September/006687.html >>>> >>>> This is on Slony1-1.2.10, PostgreSQL-8.1.9: >>>> >>>> I've run into a situation on a client cluster composed of a master and >>>> 2 slave nodes where a deadlock on one of the slaves happens quite >>>> regularly during any EXECUTE SCRIT commands. It seems if slony loses >>>> the deadlock, some of the tables are left in a not altered for >>>> replication state and this breaks replication. >>> >>> I'm going to set up (heh) a test case, let's call it >>> "testdeadlockddl", which will try to "tickle" this problem. >>> >>> The approach: >>> >>> - I'll start by creating a set that is replicated, and where some data >>> is flowing thru. >>> >>> - I'll set up some queries against the replica in order to try to >>> encourage deadlocks. >>> >>> - Then an "EXECUTE SCRIPT" will try adding a new table. As a side effect, >>> it will implicitly require locks on all the replicated tables. >>> >>> I don't see that there are other particular details to this, right? >> >> You seem to be right on the money with my experience, Christopher. Perhaps >> if your slave is inside a VM making it a little less performant would be >> more likley to cause the deadlock and show the problem? I mention this >> because our slave is not as performant as the master and so it deadlocks >> more often during execute scripts than the master did when the master >> shoulders the entire load. >> > I have a test added in; it hasn't yet tickled any deadlocks, so I think I'm > not yet seeing the pattern of queries that you are using. > > I'll fight further with this tomorrow; if it's at all possible for you to > show a pattern of queries (not involving Slony-I necessarily at all) that > causes the deadlock to emerge, that would be helpful. > > This is NOT about system load - deadlocks can emerge under pretty simple > conditions, and the simpler the condition we can come up with, the better. A > good test case for this won't involve variations of load - it should be able > to remain pretty simple. The queries that we're seeing trigger it are SELECT queries with lots of INTERSECTs like this ugly one: SELECT m.name as manf , p.mfgno_stripped as mfgno , p.onhand , p.price , p.sdescr , pi.img , pm.categoryid INTO TEMP TABLE searchc8f9d43421c4d72570b9bb9288ff3c10 FROM products p JOIN man m ON (p.man_id = m.id) LEFT JOIN product_images pi ON ( p.mfgno_stripped = pi.mfgno_stripped AND p.man_id = pi.man_id ) LEFT JOIN category.product_map pm ON ( p.mfgno_stripped = pm.mfgno_stripped AND p.man_id = pm.manid ) WHERE ( (upper(p.mfgno_stripped) LIKE '%ACER%') OR (upper(m.name) LIKE '%ACER%') OR (upper(p.sdescr) LIKE '%ACER%') ) INTERSECT SELECT m.name as manf , p.mfgno_stripped as mfgno , p.onhand , p.price , p.sdescr , pi.img , pm.categoryid FROM products p JOIN man m ON (p.man_id = m.id) LEFT JOIN product_images pi ON ( p.mfgno_stripped = pi.mfgno_stripped AND p.man_id = pi.man_id ) LEFT JOIN category.product_map pm ON ( p.mfgno_stripped = pm.mfgno_stripped AND p.man_id = pm.manid ) WHERE ( (upper(p.mfgno_stripped) LIKE '%NOTEBOOK%') OR (upper(m.name) LIKE '%NOTEBOOK%') OR (upper(p.sdescr) LIKE '%NOTEBOOK%') ) INTERSECT SELECT m.name as manf , p.mfgno_stripped as mfgno , p.onhand , p.price , p.sdescr , pi.img , pm.categoryid FROM products p JOIN man m ON (p.man_id = m.id) LEFT JOIN product_images pi ON ( p.mfgno_stripped = pi.mfgno_stripped AND p.man_id = pi.man_id ) LEFT JOIN category.product_map pm ON ( p.mfgno_stripped = pm.mfgno_stripped AND p.man_id = pm.manid ) WHERE ( (upper(p.mfgno_stripped) LIKE '%1GB%') OR (upper(m.name) LIKE '%1GB%') OR (upper(p.sdescr) LIKE '%1GB%') ) INTERSECT SELECT m.name as manf , p.mfgno_stripped as mfgno , p.onhand , p.price , p.sdescr , pi.img , pm.categoryid FROM products p JOIN man m ON (p.man_id = m.id) LEFT JOIN product_images pi ON ( p.mfgno_stripped = pi.mfgno_stripped AND p.man_id = pi.man_id ) LEFT JOIN category.product_map pm ON ( p.mfgno_stripped = pm.mfgno_stripped AND p.man_id = pm.manid ) WHERE ( (upper(p.mfgno_stripped) LIKE '%120GB%') OR (upper(m.name) LIKE '%120GB%') OR (upper(p.sdescr) LIKE '%120GB%') ); Note that these queries are old and will be replaced with some tsearch2 functionality as soon as it makes its way through testing. I think the deadlocks aren't load related but speed related. That is, if the acquiring of all the locks by the execute script takes longer on a slower machine, the window of opportunity for one of these selects to cause a deadlock seems greater, no? They do seem to happen on the slower machine more regularly than the faster one. -- Jeff Frost, Owner <jeff at frostconsultingllc.com> Frost Consulting, LLC http://www.frostconsultingllc.com/ Phone: 650-780-7908 FAX: 650-649-1954
- Previous message: [Slony1-bugs] Slony 1.2.10: Deadlock on slave during execute script
- Next message: [Slony1-bugs] Slony 1.2.10: Deadlock on slave during execute script
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the Slony1-bugs mailing list