LINHART Felix Felix.LINHART at frequentis.com
Wed Jul 16 12:17:33 PDT 2008
Hi all,

I have found out something new regarding my sl_nodelock messages. The
following error occurred, and the slon PIDs for some of my nodes were
different after this error message.

2008-07-16 20:15:07 CEST DEBUG2 remoteListenThread_2: LISTEN
2008-07-16 20:15:07 CEST DEBUG2 remoteWorkerThread_4: forward confirm 4,7884
received by 6
2008-07-16 20:15:07 CEST DEBUG2 remoteWorkerThread_4: forward confirm 4,7884
received by 7
2008-07-16 20:15:07 CEST DEBUG2 remoteWorkerThread_4: forward confirm 5,7881
received by 9
2008-07-16 20:15:07 CEST DEBUG2 remoteWorkerThread_2: forward confirm 5,7881
received by 3
2008-07-16 20:15:07 CEST DEBUG2 remoteWorkerThread_2: forward confirm 4,7884
received by 5
2008-07-16 20:15:07 CEST DEBUG2 remoteWorkerThread_2: forward confirm 4,7884
received by 9
2008-07-16 20:15:07 CEST DEBUG2 remoteWorkerThread_3: forward confirm 4,7884
received by 3
2008-07-16 20:15:07 CEST DEBUG2 remoteWorkerThread_3: forward confirm 4,7884
received by 8
2008-07-16 20:15:07 CEST DEBUG2 remoteWorkerThread_3: forward confirm 1,15904
received by 3
2008-07-16 20:15:07 CEST DEBUG2 remoteListenThread_2: queue event 7,7890 SYNC
2008-07-16 20:15:07 CEST DEBUG2 remoteListenThread_2: UNLISTEN
2008-07-16 20:15:07 CEST DEBUG2 remoteWorkerThread_7: Received event 7,7890
SYNC
2008-07-16 20:15:07 CEST DEBUG2 calc sync size - last time: 1 last length:
292 ideal: 205 proposed size: 3
2008-07-16 20:15:07 CEST DEBUG2 remoteWorkerThread_7: SYNC 7890 processing
2008-07-16 20:15:07 CEST DEBUG2 remoteWorkerThread_7: no sets need syncing
for this event
2008-07-16 20:15:07 CEST DEBUG2 remoteWorkerThread_2: forward confirm 7,7890
received by 3
2008-07-16 20:15:07 CEST DEBUG2 remoteWorkerThread_2: forward confirm 7,7890
received by 2
2008-07-16 20:15:07 CEST DEBUG2 remoteWorkerThread_2: forward confirm 7,7890
received by 4
2008-07-16 20:15:07 CEST DEBUG2 remoteWorkerThread_2: forward confirm 7,7890
received by 8
2008-07-16 20:15:13 CEST ERROR  remoteListenThread_8: "select ev_origin,
ev_seqno, ev_timestamp,        ev_minxid, ev_maxxid, ev_xip,        ev_type,
ev_data1, ev_data2,        ev_data3, ev_data4,        ev_data5, ev_data6,
ev_data7, ev_data8 from "_slony1".sl_event e where (e.ev_origin = '7' and
e.ev_seqno > '7890') or (e.ev_origin = '8' and e.ev_seqno > '7885') or
(e.ev_origin = '2' and e.ev_seqno > '7884') or (e.ev_origin = '3' and
e.ev_seqno > '7880') or (e.ev_origin = '4' and e.ev_seqno > '7884') or
(e.ev_origin = '5' and e.ev_seqno > '7881') or (e.ev_origin = '6' and
e.ev_seqno > '7885') or (e.ev_origin = '9' and e.ev_seqno > '7895') order by
e.ev_origin, e.ev_seqno" - server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
2008-07-16 20:15:22 CEST DEBUG2 localListenThread: Received event 1,15904
SYNC
2008-07-16 20:15:22 CEST DEBUG2 syncThread: new sl_action_seq 1561499 - SYNC
15905
2008-07-16 20:15:22 CEST DEBUG2 remoteListenThread_7: queue event 8,7886 SYNC
2008-07-16 20:15:22 CEST DEBUG2 remoteListenThread_7: queue event 9,7896 SYNC
2008-07-16 20:15:22 CEST DEBUG2 remoteWorkerThread_8: Received event 8,7886
SYNC
2008-07-16 20:15:22 CEST DEBUG2 remoteWorkerThread_9: Received event 9,7896
SYNC
2008-07-16 20:15:22 CEST DEBUG2 calc sync size - last time: 1 last length:
14536 ideal: 4 proposed size: 3
2008-07-16 20:15:22 CEST DEBUG2 calc sync size - last time: 1 last length:
14536 ideal: 4 proposed size: 3
2008-07-16 20:15:22 CEST DEBUG2 remoteWorkerThread_8: SYNC 7886 processing
2008-07-16 20:15:22 CEST DEBUG2 remoteWorkerThread_8: no sets need syncing
for this event
2008-07-16 20:15:22 CEST DEBUG2 remoteWorkerThread_9: SYNC 7896 processing
2008-07-16 20:15:22 CEST DEBUG2 remoteWorkerThread_9: no sets need syncing
for this event
2008-07-16 20:15:22 CEST DEBUG2 remoteWorkerThread_7: forward confirm 9,7896
received by 8
2008-07-16 20:15:22 CEST DEBUG2 remoteWorkerThread_7: forward confirm 9,7896
received by 3
2008-07-16 20:15:22 CEST DEBUG2 remoteWorkerThread_7: forward confirm 9,7896
received by 6
2008-07-16 20:15:22 CEST DEBUG2 remoteWorkerThread_7: forward confirm 7,7890
received by 6
2008-07-16 20:15:22 CEST DEBUG2 remoteWorkerThread_7: forward confirm 9,7896
received by 7
2008-07-16 20:15:22 CEST DEBUG2 remoteWorkerThread_7: forward confirm 9,7896
received by 5
2008-07-16 20:15:22 CEST DEBUG2 remoteWorkerThread_7: forward confirm 7,7890
received by 9
2008-07-16 20:15:22 CEST DEBUG2 remoteWorkerThread_7: forward confirm 7,7890
received by 5
2008-07-16 20:15:22 CEST DEBUG2 remoteWorkerThread_7: forward confirm 9,7896
received by 4
2008-07-16 20:15:22 CEST DEBUG2 remoteWorkerThread_7: forward confirm 1,15904
received by 6
2008-07-16 20:15:22 CEST DEBUG2 remoteWorkerThread_7: forward confirm 8,7886
received by 7
2008-07-16 20:15:22 CEST DEBUG2 remoteWorkerThread_7: forward confirm 1,15904
received by 8
2008-07-16 20:15:22 CEST DEBUG2 remoteWorkerThread_7: forward confirm 1,15904
received by 7
2008-07-16 20:15:22 CEST DEBUG2 remoteWorkerThread_7: forward confirm 1,15904
received by 4
2008-07-16 20:15:22 CEST DEBUG2 remoteListenThread_2: queue event 2,7885 SYNC
2008-07-16 20:15:22 CEST DEBUG2 remoteListenThread_2: queue event 3,7881 SYNC

The above messages were logged on my Slony master node.

db=# SELECT * from _slony1.sl_set;
 set_id | set_origin | set_locked |   set_comment
--------+------------+------------+------------------
      1 |          1 |            | Set 1 for slony1
      2 |          1 |            | Set 2 for slony1
      3 |          1 |            | Set 3 for slony1
      4 |          1 |            | Set 4 for slony1
(4 rows)

db=# SELECT * from _slony1.sl_status;
 st_origin | st_received | st_last_event |     st_last_event_ts      | st_last_received |    st_last_received_ts     | st_last_received_event_ts  | st_lag_num_events |   st_lag_time
-----------+-------------+---------------+---------------------------+------------------+----------------------------+----------------------------+-------------------+-----------------
         1 |           7 |         16050 | 2008-07-16 20:51:40.34712 |            16049 | 2008-07-16 20:51:47.979432 | 2008-07-16 20:51:25.327307 |                 1 | 00:00:17.634575
         1 |           5 |         16050 | 2008-07-16 20:51:40.34712 |            16048 | 2008-07-16 20:51:39.676025 | 2008-07-16 20:51:10.299876 |                 2 | 00:00:32.662006
         1 |           9 |         16050 | 2008-07-16 20:51:40.34712 |            16049 | 2008-07-16 20:51:42.70644  | 2008-07-16 20:51:25.327307 |                 1 | 00:00:17.634575
         1 |           6 |         16050 | 2008-07-16 20:51:40.34712 |            16049 | 2008-07-16 20:51:37.104608 | 2008-07-16 20:51:25.327307 |                 1 | 00:00:17.634575
         1 |           3 |         16050 | 2008-07-16 20:51:40.34712 |            16048 | 2008-07-16 20:51:36.692369 | 2008-07-16 20:51:10.299876 |                 2 | 00:00:32.662006
         1 |           8 |         16050 | 2008-07-16 20:51:40.34712 |            16049 | 2008-07-16 20:51:47.674594 | 2008-07-16 20:51:25.327307 |                 1 | 00:00:17.634575
         1 |           4 |         16050 | 2008-07-16 20:51:40.34712 |            16049 | 2008-07-16 20:51:37.567119 | 2008-07-16 20:51:25.327307 |                 1 | 00:00:17.634575
         1 |           2 |         16050 | 2008-07-16 20:51:40.34712 |            16048 | 2008-07-16 20:51:30.982579 | 2008-07-16 20:51:10.299876 |                 2 | 00:00:32.662006
(8 rows)

db=#


Can someone please explain the error message above to me?

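In case it helps, this is the check I was planning to run on node 8 to see
whether its postmaster actually restarted around 20:15 when the connection
was dropped. It is only a sketch (pg_postmaster_start_time() should exist on
our 8.x servers, and the sl_nodelock column names here are from memory):

-- Did the postmaster on node 8 restart around the time of the error?
SELECT pg_postmaster_start_time();

-- Which backend PIDs does the cluster currently record as holding node locks?
SELECT nl_nodeid, nl_conncnt, nl_backendpid
  FROM "_slony1".sl_nodelock
 ORDER BY nl_nodeid;
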
Thx
Br
Felix


-----Original Message-----
From: slony1-general-bounces at lists.slony.info
[mailto:slony1-general-bounces at lists.slony.info] On Behalf Of LINHART Felix
Sent: Wednesday, 16 July 2008 19:03
To: chris
Cc: slony1-general at lists.slony.info
Subject: RE: [Slony1-general] sl_nodelock messages

I have two DB clusters on one server and two other HA clusters (also with two
DB clusters) spread over the network.
Since these sl_nodelock messages also appear in both of my local Postgres
logs, I don't think this is caused by a network problem.

Can I somehow enable more logging to analyse this in more detail?

Thx
felix

-----Original Message-----
From: chris [mailto:cbbrowne at ca.afilias.info]
Sent: Wednesday, 16 July 2008 18:52
To: LINHART Felix
Cc: slony1-general at lists.slony.info
Subject: Re: [Slony1-general] sl_nodelock messages

"LINHART Felix" <Felix.LINHART at frequentis.com> writes:
> I have just noticed that my postgres logfile continuously receives
> messages from slony like the ones listed below:
>
> 2008-07-16 12:37:50 CEST NOTICE:  Slony-I: cleanup stale sl_nodelock
> entry for pid=5810
>
> 2008-07-16 12:37:50 CEST CONTEXT:  SQL statement "SELECT
> "_slony1".cleanupNodelock()"
>
>         PL/pgSQL function "cleanupevent" line 77 at PERFORM
>
> 2008-07-16 12:51:35 CEST LOG:  could not receive data from client:
> Connection timed out
>
> 2008-07-16 12:51:35 CEST LOG:  unexpected EOF on client connection

Since these are "NOTICE" log records, they do not indicate any error as far
as Slony-I is concerned.

When the slon falls over, and another starts up, the new one has to clean up
after the old one.  It's not a real big deal, though it suggests that you
have some sort of network problem or something that is causing slon processes
to fall over so they need to be restarted.

That falls outside what Slony-I is, itself, doing.
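
If you want to double-check whether a given sl_nodelock entry really is
stale, something along these lines should show it (untested, off the top of
my head; it assumes your "_slony1" schema and the 8.x pg_stat_activity
column procpid -- on newer releases that column is just pid):

-- A node lock whose backend PID no longer appears in pg_stat_activity
-- was left behind by a slon that has gone away.
SELECT nl.nl_nodeid, nl.nl_conncnt, nl.nl_backendpid,
       (a.procpid IS NULL) AS stale
  FROM "_slony1".sl_nodelock nl
  LEFT JOIN pg_stat_activity a ON a.procpid = nl.nl_backendpid;
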
--
select 'cbbrowne' || '@' || 'linuxfinances.info';
http://cbbrowne.com/info/lsf.html
Rules of the Evil Overlord #145. "My dungeon cell decor will not feature
exposed pipes. While they add to the gloomy atmosphere, they are good
conductors of vibrations and a lot of prisoners know Morse code."
<http://www.eviloverlord.com/>.
_______________________________________________
Slony1-general mailing list
Slony1-general at lists.slony.info
http://lists.slony.info/mailman/listinfo/slony1-general

