Christopher Browne cbbrowne
Thu Jul 28 21:37:23 PDT 2005
Jan Wieck wrote:

> On 7/21/2005 9:56 AM, Hannu Krosing wrote:
>
>> Hi
>>
>> I have started to have locking issues on pg_listener (i.e. LISTEN/NOTIFY)
>> and so I have again started to think about my old request of running
>> slony without listen/notify.
>
>
> The slon process only ever updates the local database. All remote
> slons do SELECTs only. So whatever a slon wants to communicate to
> others waiting for events, it will insert something into sl_event,
> then NOTIFY. All the remote nodes are listening, see the notification,
> and know that they now have to select from sl_event. When a node is
> done processing a remote event, it records that in sl_confirm and
> notifies on that. Now all other nodes that have a connection to here
> will pick up the new confirm status by selecting from sl_confirm.
>
> The idea for reducing the number of notifications actually done via
> NOTIFY is to keep track of when the last ones were created and received.
> After issuing a NOTIFY locally, subsequent ones will be suppressed
> for some amount of time (let's say 600 seconds). When a remote slon
> receives a notification, it will enter a polling state for that time
> plus some more (let's say 660 seconds). During that polling time, it
> will just check every n milliseconds whether there are new events,
> with n, as well as the time to poll, being new config parameters in
> sl_listen.
>
> On a constantly busy database with only one subscriber and -s1000, we
> currently have one event and one confirm per second. That makes 7200+
> (the subscriber does a few events as well) NOTIFY calls per hour. With
> the above we'd reduce this to about 12 NOTIFY calls per node per hour.
>
> What I just don't have is the time to actually implement it.

This still strikes me as something of an "ugly hack," particularly as I
take a look at where NOTIFY/LISTEN are used.

One direction that strikes me as being a shade better would be to
eliminate the confirmation events, and allow those to be "drawn back"
via the periodic SYNCs that each node emits.

Here's a change to the notification code (in remote_worker.c) which
eliminates around 90% of the confirmation notifications:

    slon_appendquery(dsp,
             "notify \"_%s_Event\"; ", rtcfg_cluster_name);
    if ((rand() % 10) == 0) {
        slon_appendquery(dsp,
                 "notify \"_%s_Confirm\"; ", rtcfg_cluster_name);
    }
    slon_appendquery(dsp,
             "insert into %s.sl_event "
             "    (ev_origin, ev_seqno, ev_timestamp, "
             "     ev_minxid, ev_maxxid, ev_xip, ev_type ",
             rtcfg_namespace);

That used to be one slon_appendquery() call that *always* did a Confirm
notification.

All seems to be well, at least: good old "test_1_pgbench" runs to
completion and the databases compare fine.

By eliminating 90% of confirmation notifications, that should cut down
on pg_listener "traffic" by about 45%.  If that helps, that represents
an easy, material improvement (even if it's only cutting traffic by ~1/2).

That's pretty simple, with the result that it shouldn't be risky, and
has a material effect.  I'm not sure it's even reasonably considered a
"dirty" hack.

If I fold the call to remoteListen_forward_confirm(node, conn) in with
remoteListen_receive_events(), so that it *always* tries to read
confirmations upon receiving *any* event (or, equivalently, pretend
that forward_confirm is always true), that should address the issue
of the confirmations falling somewhat behind.

I tried the latter; it definitely works, and I don't see a material
downside.  Actually, it seems like a good idea.  We run through a bunch
of events, and then run through confirmations.  That seems pretty
sensible to me, and THAT would eliminate the need for the Confirm events
altogether.

>> Could somebody explain to me what the function of LISTEN/NOTIFY in
>> slony is?
>>
>> I can't imagine them being used for anything really important (as they
>> are not guaranteed to arrive at the client at all), and thus I assume
>> that they are used just as a pacekeeping device in order to not overload
>> the database with slony requests.
>>
>> Is my assumption right ?
>
That's backwards.

They aren't for "pacekeeping;" they are there to tell other nodes that
there is work to do.

It's sort of an inverse to "pacekeeping": the busier the system is, the
more frequently you want to generate events in order to ensure that
SYNCs remain relatively bite-sized rather than growing into "4GB
memory-busters."

>> In this case, I think I could just remove all "notify XXX" commands from
>> slony queries and replace the function waiting for notifies with a dummy
>> (or maybe a 50ms wait), as I run a database which is under heavy OLTP
>> load and always has something new for slony by the time it finishes
>> its processing.
>>
>>
>> Otoh, if the events got by LISTENing are used for something essential,
>> could I just have a single separate thread that sends all possible
>> notifies in a tight loop and keep the listener function intact ?
>
>
> As a dirty hack you can switch to a constant polling scheme and just
> live with once every n msec SELECT on a completely idle database.
> Since you are using dedicated DB servers, it wouldn't hurt you at all.

If the system is really busy, it could well make sense to revert to polling.

Hmm...

-> If you have received more than X events in Y milliseconds, then jump
into polling mode

Polling Mode:
  UNLISTEN on the relevant events
  begin loop
    Z = now()
    Draw new events, process them
    LEN = now() - Z
    Sleep for (Y - LEN) ms
    If no events were processed in the last few loops, then return
      to the NOTIFICATION loop
  end loop

That's essentially an alternative to the present
remoteListenThread_main() work loop.

I'm NOT inclined to implement that immediately, but perhaps after OSCON,
if the idea sticks to the wall and anything thrown at it fails to
dislodge it, that doesn't look too hard, and I daresay it's not THAT
much of a hack... :-).




More information about the Slony1-general mailing list