Christopher Browne cbbrowne at ca.afilias.info
Fri Nov 26 16:24:48 PST 2010
Jan Wieck <JanWieck at Yahoo.com> writes:
> On 11/26/2010 3:30 PM, Steve Singer wrote:
>> On 10-11-26 03:16 PM, Jan Wieck wrote:
>>>  On 11/17/2010 5:06 PM, Christopher Browne wrote:
>>>>  A thing that several of us have been ruminating over for a while is the
>>>>  problem that people get confused about how, when you submit Slonik
>>>>  scripts, some actions may require waits.
>>>
>>>  One major problem with automatic waiting for events is that it is
>>>  extremely context sensitive to wait at all.
>>>
>>>  One cannot wait inside of a TRY block. The events aren't committed yet,
>>>  so they cannot propagate.
>>>
>>>  One needs to be careful not to wait if the current path configuration is
>>>  incomplete. You should for example never wait for the FIRST store path
>>>  ... you'd wait forever.
>>>
>>>  This all basically exposes that slonik has insufficient knowledge about
>>>  the overall cluster configuration and healthiness. It basically fires
>>>  off "commands" blindly. I've long been thinking that slonik itself needs
>>>  a major overhaul. My recent experiences with the Mozilla Rhino
>>>  JavaScript engine (Steve and I developed a cluster test framework using
>>>  it) make me think that actually creating a completely new slonik from
>>>  scratch won't be too bad of an idea.
>>
>> What would you want to change in slonik (from the user's point of view)
>> if you were doing that?
>>
>> When I was developing tests for the framework I spent countless hours
>> debugging tests where the problem ended up being a missing or an extra
>> wait for (or one against the wrong node). Expecting the average DBA to
>> figure this stuff out isn't nice.   Slonik should be able to figure out
>> if paths exist to the required nodes and other dependencies on the
>> configuration.  I think slonik should check as many things as we can
>> make and give the user useful error messages instead of 'waiting for ever'
>>
>> I agree with Vick that a Java dependency for slony would be a bad idea.
>
> Would it really be such a bad idea?

I think it's premature to say anything about implementation language,
when we don't yet have a systematic description of the replacement. (I
note that you start such here... [1])

> The system where slonik (or an alternative to it) is running does NOT 
> have to be any of the DB servers or systems, where even slon is running. 
> It can be the sysadmin's laptop for all that matters.
>
> Anyhow, I don't think the language is that important at this point in 
> the discussion.  Way more important is the question how much we would 
> like slonik to actually "know".

Indeed.

[1]
> If I had to write slonik again it would on startup connect to each node, 
> that it has an admin-conninfo for, and read the whole cluster 
> configuration from its sl_* tables. It would update that information 
> before executing any command.
>
> This way it would "know" which sets exist, which nodes are subscribed to 
> them (and up to what level, like in-progress). It would know that a 
> certain node isn't the origin of a certain set "yet".

This starts a "systematic description" of what differs from slonik as
it is today, which is a nice start to have.

I don't think it'll be enough to just say, "oh, some commands will have
some changed semantics."  Rather, I expect that we'll have some new
abstractions in Slonik.

As a start...

1. Defining admin connections explicitly
 
   Today, we have a preamble consisting of
      CLUSTER NAME...
      ADMIN CONN INFO ... (for each node, which is never stored in DB)

   Sounds like this changes to
      CLUSTER NAME...
      ADMIN CONN INFO [specify ONE node to talk to]

   ... and we need to have a new table + command to store the ADMIN
       CONN INFO ...

2. Capturing current cluster state

   There's a new semantic here; state needs to be loaded in, sometimes.

   I'm not yet sure why or when, or what the basis would be to re-load
   state.  (I'm not challenging you - just saying that for the system to
   be predictable, there has to be some predictable basis for
   loading and reloading state.  I don't know what to call it yet.)

   There needs to be some basis for using cluster state.  That needs to
   be described, and we need a name for this, so it's not "oh, we need
   to do magic..."
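
To make "cluster state" a little more concrete, here is a minimal sketch of
the idea, not a proposal for how slonik should be built. Python is used only
for brevity and says nothing about the implementation language. Every name
in it (ClusterState, check_wait, run_command, load_state) is hypothetical,
and the reachability rule is a deliberate simplification: it assumes a node
can receive events from another node if a chain of stored paths leads there.
A real version would fill the snapshot from the sl_* tables rather than from
literals.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterState:
    """Hypothetical in-memory snapshot of the cluster configuration."""
    nodes: set = field(default_factory=set)
    # (server, client) pairs: client has a stored path for reaching server.
    paths: set = field(default_factory=set)
    # set id -> node id of the set's origin
    set_origin: dict = field(default_factory=dict)

    def is_origin(self, node, set_id):
        """True if node is currently the origin of set_id."""
        return self.set_origin.get(set_id) == node

    def can_reach(self, client, server):
        """True if a chain of stored paths leads from client to server."""
        seen, todo = {client}, [client]
        while todo:
            current = todo.pop()
            if current == server:
                return True
            for srv, cli in self.paths:
                if cli == current and srv not in seen:
                    seen.add(srv)
                    todo.append(srv)
        return False


def check_wait(state, origin, wait_on, in_try_block=False):
    """Return None if a wait looks safe, else a message saying why not."""
    if in_try_block:
        return "cannot wait inside a TRY block: events are not committed yet"
    for node in (origin, wait_on):
        if node not in state.nodes:
            return f"node {node} is not part of the cluster"
    if origin != wait_on and not state.can_reach(wait_on, origin):
        return (f"no path from node {wait_on} to node {origin}: "
                "the wait would never finish")
    return None


def run_command(load_state, command):
    """Reload the snapshot, then run the command against the fresh copy."""
    state = load_state()  # hypothetical loader; assumed to read the sl_* tables
    return command(state)


# Node 2 can reach node 1, and node 3 can reach node 2; nothing else is stored.
state = ClusterState(nodes={1, 2, 3}, paths={(1, 2), (2, 3)}, set_origin={1: 1})
print(check_wait(state, origin=1, wait_on=3))  # safe: 3 reaches 1 via 2
print(check_wait(state, origin=3, wait_on=1))  # refused: node 1 has no paths
```

The point of check_wait returning a message is that the user gets an
explanation up front instead of a wait that never returns. run_command shows
one possible answer to the "when do we reload?" question, namely "before
every command", which is only one of several policies that could be named
and documented.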

There are likely more abstract bits that I'm not thinking of yet.  I
need to think about this more.

And if we don't have descriptive names for the new things, then they're
not well enough defined, yet.  Or maybe we need a diagram...

> Of course, this sort of safety could also be achieved by just extending 
> the current slonik code.

I imagine so.  None of what you have mentioned thus far intrinsically
points to functionality that couldn't be captured by (scan.l + parser.y
+ slonik.c + dbutil.c).

> In any case, I do think we need to make slonik smarter. I just don't 
> know if C is still the right language to do it in.

I don't think we need to think about that at all until we determine with
considerably greater exactitude what "smarter" means.
-- 
output = ("cbbrowne" "@" "ca.afilias.info")
Christopher Browne
"Bother,"  said Pooh,  "Eeyore, ready  two photon  torpedoes  and lock
phasers on the Heffalump, Piglet, meet me in transporter room three"


More information about the Slony1-general mailing list