Wed Sep 26 09:09:23 PDT 2007
- Previous message: [Slony1-hackers] Re: XID in PG core/contrib
- Next message: [Slony1-hackers] EXECUTE SCRIPT and deadlocks
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hello all, I've been encountering problems when trying to run a simple schema change through the EXECUTE SCRIPT command, because it resulted in a deadlock avoided by PostgreSQL. This wasn't unexpected since the documentation clearly states that EXECUTE SCRIPT involves locking all the tables of the specified set, and that such problems may arise. We know how to work around this, either by temporarily shutting down the application connecting to the database, or by removing the table from the replication set, running the changes on it, creating a new set, adding the said table to this set, subscribe it to slaves, and merge it back in the main set. Both ways are however impractical for us, and we're trying to find a better solution, that doesn't involve splitting the sets either. So far, we have considered a few options that I'm going to list below. Solution 1) Adding the possibility to provide another locking order, that is more likely to succeed than the one given by the object id (tab_id) in sl_table. This can be done in two different ways. 1.a) Adding a new column, say, tab_prio (NOT NULL DEFAULT tab_id) to the sl_table table, that specifies a priority to use for locking in EXECUTE SCRIPT. In SQL terms, this means changing an "ORDER BY tab_id" into "ORDER BY tab_prio, tab_id", when getting the list of the tables to lock / deal with. 1.b) Add a new option to the EXECUTE SCRIPT slonik command, allowing to pass a list of table IDs for specifying the locking order. Pros: * Both ways of implementing this option are fairly simple and don't involve a lot of changes in the source code as far as I can see, although option 1.b requires modifying slonik while option 1.a doesn't. Cons: * While I can't give an example, I believe that specifying an alternative locking order is not sufficient to solve the problem in all cases, plus it may be too complex to determine for databases containing lots of tables. Solution 2) Expanding on the previous idea, we could add an optional parameter to the EXECUTE SCRIPT command, for instance "ONLY FOR = <IDs list>". This subsumes option 1 because it allows to reorder the locking (tables would be locked in the order their IDs are passed), but also allows to avoid locking and calling of the alterTableForReplication() and alterTableRestore() functions, for those tables where we know for sure that it isn't required. Pros: * This is more generic than option 1, and I believe it should allow people to solve situations where determining a correct locking order is very complex, or even impossible (can that happen?). Cons: * This is an unsafe option, that could potentially lead to breakage of the replication setup if used incorrectly. I reckon, however, that it certainly wouldn't be the only slonik command where this is possible, and slonik is a fairly low-level interface anyways. It should however be clearly documented as such. Solution 3) I believe that in many "simple" cases (think adding a column on a table without even a DEFAULT), all the locking done by Slony-1 isn't really required. So we could imagine having a simplified EXECUTE SCRIPT command, that would only allow "simple" SQL queries, that don't require the whole locking dance. Pros: * If this is doable, it would probably be very practical for many users who mostly need to propagate such simple schema changes. Cons: * It may prove very hard to implement correctly. A variation on solution 3 would be to implement this simplified behaviour of EXECUTE SCRIPT by letting the user decide whether it is reasonable to do it this way or not. Needless to say, this makes the implementation of it much easier, but it also makes the command more dangerous. I am of course interested in hearing about the opinions of the Slony-1 developers and users, in order to determine if these suggestions are reasonable or not. If there's a super easy way to do what I want that I missed, I am interested too :-). I am willing to work on this stuff on behalf of my company, if the Slony-1 developers agree that implementing one of these options is a good idea. Cheers, Maxime
- Previous message: [Slony1-hackers] Re: XID in PG core/contrib
- Next message: [Slony1-hackers] EXECUTE SCRIPT and deadlocks
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the Slony1-hackers mailing list