[Slony1-general] Adding an include("filename") feature to slonik

Thu Mar 3 15:50:37 PST 2005

murphy pope <pope_murphy at hotmail.com> writes:
> (apologies in advance if this has already been discussed and dismissed...)
> Since every slonik script must start with a "cluster" command and a series of node definitions, I'd like to store
> those definitions in a single file and then talk slonik into "including" that file when I run another script.
> I can think of three ways to do that:
> 1) Add an include( 'filename' ) mechanism to the slonik grammar
> 2) Add a command-line option that tells slonik to read a named file before executing the main script(s)
> 3) Make slonik read a ~/.slonikrc file
> Solution 1 seems the best choice to me.
> Would such a patch be welcome?

I don't think any of these would be, based on my experience with a
failed attempt to implement 'named nodes.'  After a digression on why
that didn't work, I'll specifically address the notion of
include('foo.slonik')...

The Named Node "Debacle"
---------------------------

I wanted to be able to attach a "node name" attribute to each node,
and then be able to replace:

STORE LISTEN ( ORIGIN = 1, RECEIVER = 2, PROVIDER = 3 );

with, let's say:

STORE LISTEN ( ORIGIN NODE = 'org master', RECEIVER NODE = 'data warehouse', PROVIDER NODE = 'whois server' );

I'm sure you'll agree that the latter is much more mnemonic than
something involving perhaps-cryptic numbers :-).

I got started on it, and, at first, it was looking promising.  It
involved adding a number of tokens to the tokenizer, and affected the
parsing of quite a lot of Slonik options, but, with a bit of
struggling with Yacc, it wasn't particularly difficult to get the
syntax working.

THEN I ran into the _REAL_ problem, namely that the change led to a
need for remarkably massive changes in the semantics of the Slonik
statements.  If I referenced a node name, then I'd have to look up in
a database what node that refers to.

But which database?  If there are 5 nodes, there are 5 choices!

And if I'm setting up initial configuration, then the data may not
have yet propagated everywhere, so that only one database (which
one???) would be a good choice.

Worse, if configuration broke somehow, and I am trying to repair it,
then there might be inconsistent information out there, and if I draw
node IDs from the wrong place, I'll either do no good, or perhaps
worsen the configuration.

Basically, once I started the "backend" work on how Slonik should do
the name-to-id mapping, I discovered that the problem was, if not
mathematically intractable, practically so.  The amount of work
required to get it to provide consistent results wildly exceeded my
desire to have the feature.

Giving Up - Macro Rewriting as alternative
--------------------------------------------------

The next thought was that what I _truly_ desired was to have some sort
of macro rewriting system.  I looked into using M4 as a "helper" to
generate slonik scripts.  That works, to be sure, although there seem
to be some out there that do not consider M4 to be their "favorite
language."

Some notes on the experiment may be found here:

  http://linuxdatabases.info/info/slonikm4.html

I just added some further notes to this in CVS which aren't published
yet...  Jan and I had a chat about this where the thought of using
CPP came up, since, like M4, it is ubiquitous, but unlike M4, people
do not feel "unclean" when they use it due to the association with
Sendmail :-).

The fundamental problem with using CPP is that it generates C-style
comments that slonik won't "eat" happily.  Jan was going to take a
peek at this "in his copious spare time," but that evidently hasn't
happened :-).  If you were interested in doing so, this would be no
bad thing.
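
To make the macro-rewriting idea concrete, here is a minimal sketch of
the same symbol substitution done in plain sh rather than M4 or CPP.
The variable names and the function name are made up for illustration;
the point is only that the name-to-id table lives in the script itself,
so no database ever has to be consulted to resolve a name.

```shell
#!/bin/sh
# Hypothetical name-to-id table, kept in the script rather than in any
# node's database.
ORG_MASTER=1
DATA_WAREHOUSE=2
WHOIS_SERVER=3

# Emit a slonik command with the names already replaced by node ids.
named_listen() {
    cat <<EOF
STORE LISTEN ( ORIGIN = ${ORG_MASTER}, RECEIVER = ${DATA_WAREHOUSE}, PROVIDER = ${WHOIS_SERVER} );
EOF
}

# Print the expanded command; slonik itself only ever sees numbers.
named_listen
```

Since the expansion happens before slonik sees anything, the question
of "which database do I ask?" never arises.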

Onwards to discussing inclusions
--------------------------------------

Note that I haven't yet said anything direct about why I'd object to
your 1, 2, or 3 items.  Let me get direct now :-).  I certainly don't
mean to rake you over the coals for having a "horrible, terrible, dumb
bad idea;" your idea is, at root, a good one.  But just as named nodes
turned out to be way more trouble than they were worth, implementing
inclusions in Slonik seems likely to me to be much the same, more
trouble than it's worth.

The fundamental problem is that 'inclusions' would turn Slonik into a
two-pass language (like C) rather than a one-pass language (like
Pascal), which would substantially complicate the processor.  Slonik
is a very simple language, a "little language"
<http://c2.com/cgi/wiki?LittleLanguage>, and adding features that
aren't _directly_ about configuring Slony-I instances breaks that.

Use cpp/m4/whatever to do the inclusions more or less out of band and
that issue disappears.  Slonik remains a "little language," small and
simple to parse, and you can extend things outside of Slonik as much
as you like.
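
As a rough sketch of what "out of band" can look like in its simplest
form, the following sh fragment prepends a shared preamble to whatever
script body arrives on stdin.  The cluster name, the conninfo strings,
and the function names are placeholders I made up; m4's include() or
cpp's #include would do the same job with a separate preamble file.

```shell
#!/bin/sh
# Shared preamble; cluster name and conninfo strings are placeholders.
preamble() {
    cat <<'EOF'
CLUSTER NAME = example_cluster;
NODE 1 ADMIN CONNINFO = 'dbname=example host=node1.example.com';
NODE 2 ADMIN CONNINFO = 'dbname=example host=node2.example.com';
EOF
}

# Print the preamble, then copy the script body from stdin unchanged.
with_preamble() {
    preamble
    cat
}

# Demonstration: a one-line body gets the preamble in front of it.
echo "STORE LISTEN ( ORIGIN = 1, RECEIVER = 2, PROVIDER = 1 );" | with_preamble
```

In real use the output would be piped on to slonik, along the lines of
"with_preamble < my_script.slonik | slonik", so slonik still parses one
flat script in a single pass.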

The following URL discusses some of this in a somewhat less
opinionated manner :-).
<http://linuxdatabases.info/info/usingslonik.html>

There are, in effect, three strategies proposed there to address these
limitations of Slonik:

 1.  Use some form of preprocessor (M4 is discussed; CPP isn't) to
     handle inclusions and symbol substitution

     <http://linuxdatabases.info/info/slonikm4.html>

     This buys you inclusions and symbol substitution; not much more,
     unless you want to court madness :-).

 2.  Embed the Slonik generation inside a shell script

     This is the approach taken for the test bed code in the
     "src/ducttape" directory.

     <http://linuxdatabases.info/info/slonikshell.html>

     This buys you looping (in addition to inclusions/symbol
     substitution).  Using complex data structures in sh courts
     madness ;-).

 3.  Write scripts that generate slonik scripts

     This is what the 'altperl' scripts do.

     <http://linuxdatabases.info/info/altperl.html>

     By having a full-scale language wrapped in, with sophisticated
     data structures and such, you can get as fancy as you like!
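
To illustrate what strategy 2 buys you over a pure preprocessor, here
is a small sketch of looping in sh.  It is not taken from the ducttape
scripts; the node list and function name are invented, and it assumes
every node can talk directly to every other node, so each origin acts
as its own provider.

```shell
#!/bin/sh
# Hypothetical list of node ids in the cluster.
NODES="1 2 3"

# Emit one STORE LISTEN per ordered (origin, receiver) pair, skipping
# the case where a node would listen to itself.
gen_listens() {
    for origin in $NODES; do
        for receiver in $NODES; do
            if [ "$origin" != "$receiver" ]; then
                echo "STORE LISTEN ( ORIGIN = $origin, RECEIVER = $receiver, PROVIDER = $origin );"
            fi
        done
    done
}

# Print the generated commands; three nodes yield six lines.
gen_listens
```

Adding a fourth node means changing one line of sh rather than typing
six more STORE LISTEN statements by hand.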

Wow, that was pretty long-winded.

At any rate, that is what I have learned after hacking on slonik.
-- 
let name="cbbrowne" and tld="cbbrowne.com" in String.concat "@" [name;tld];;
http://linuxdatabases.org/info/slony.html
A VAX is virtually a computer, but not quite.