Wed Nov 18 08:57:25 PST 2009
Update of /home/cvsd/slony1/slony1-engine/src/slonik
In directory main.slony.info:/tmp/cvs-serv30577/slonik
Modified Files:
Tag: REL_2_0_STABLE
slonik.c
Log Message:
Bug #102
http://www.slony.info/bugzilla/show_bug.cgi?id=102
If you have a configuration with 1 master and 2 or more slaves, and you have
all paths defined, slonik can hang during a failover.
In src/slonik/slonik.c, the slonik_failed_node function queries the sl_nodelock
table on each node to find the listener process responsible for that node and
stores it in nodeinfo.
Later in the function it loops through all the nodes, checking to see if the
listener responsible for that node has exited so it knows that slon has
restarted on that node.
Unfortunately, the query it uses selects every backend pid in sl_nodelock that
is not the original pid. The code expects exactly one row in the result
set (i.e., the replacement listener). If there are other listeners for other
nodes running on that node (as is the case when a second slave has a path
defined, for example), then that query may never return exactly 1 row.
The fix is to add the node id to the query, so that it matches only processes
that are assigned to that node and are not the old listener. When exactly one
such process exists, the slon has restarted.
Per Michael Lee Squires @ whitepages.com
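The difference between the two predicates can be illustrated with a minimal sketch that models sl_nodelock as an in-memory array. The NodeLockRow struct, both function names, and the pids are hypothetical and exist only for this illustration; the real code builds a SQL query with slon_mkquery and runs it against the node.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one sl_nodelock row; only the two columns
 * used by the query are modelled. */
typedef struct {
    int nl_nodeid;      /* node the lock belongs to */
    int nl_backendpid;  /* pid of the backend holding the lock */
} NodeLockRow;

/* Old predicate: nl_backendpid <> old_pid.
 * Matches locks held for any node, not just the local one. */
size_t count_not_old_pid(const NodeLockRow *rows, size_t nrows, int old_pid)
{
    size_t matches = 0;

    assert(rows != NULL || nrows == 0);
    for (size_t i = 0; i < nrows; i++) {
        if (rows[i].nl_backendpid != old_pid)
            matches++;
    }
    return matches;
}

/* New predicate: nl_backendpid <> old_pid AND nl_nodeid = local node id.
 * Matches only a replacement listener for the local node. */
size_t count_not_old_pid_on_node(const NodeLockRow *rows, size_t nrows,
                                 int old_pid, int local_node_id)
{
    size_t matches = 0;

    assert(rows != NULL || nrows == 0);
    for (size_t i = 0; i < nrows; i++) {
        if (rows[i].nl_backendpid != old_pid &&
            rows[i].nl_nodeid == local_node_id)
            matches++;
    }
    return matches;
}
```

Assume node 2 holds one lock for its own restarted listener and one for a listener belonging to node 3. The old predicate then matches two rows and the "exactly 1" condition is never met, while the new predicate matches only the replacement listener.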
Index: slonik.c
===================================================================
RCS file: /home/cvsd/slony1/slony1-engine/src/slonik/slonik.c,v
retrieving revision 1.91.2.6
retrieving revision 1.91.2.7
diff -C2 -d -r1.91.2.6 -r1.91.2.7
*** slonik.c 23 Sep 2009 16:22:30 -0000 1.91.2.6
--- slonik.c 18 Nov 2009 16:57:23 -0000 1.91.2.7
***************
*** 2619,2625 ****
slon_mkquery(&query,
"select nl_backendpid from \"_%s\".sl_nodelock "
! " where nl_backendpid <> %d; ",
stmt->hdr.script->clustername,
! nodeinfo[i].slon_pid);
res1 = db_exec_select((SlonikStmt *) stmt, nodeinfo[i].adminfo, &query);
if (res1 == NULL)
--- 2619,2629 ----
slon_mkquery(&query,
"select nl_backendpid from \"_%s\".sl_nodelock "
! " where nl_backendpid <> %d "
! " and nl_nodeid = \"_%s\".getLocalNodeId('_%s');",
stmt->hdr.script->clustername,
! nodeinfo[i].slon_pid,
! stmt->hdr.script->clustername,
! stmt->hdr.script->clustername
! );
res1 = db_exec_select((SlonikStmt *) stmt, nodeinfo[i].adminfo, &query);
if (res1 == NULL)