Mikko Partio mpartio at gmail.com
Sun Aug 5 23:39:40 PDT 2007
On 7/29/07, Jan Wieck <JanWieck at yahoo.com> wrote:
>
>
>
>
> This one played quite a bit hide and seek with me. But I think I found
> and fixed the bug.
>
> The problem occurred when some action (like STORE_SET) caused the
> subscriber slon to perform an internal restart. During that, the current
> setsync tracking in the pset structure is reinitialized. However, since
> the sl_setsync table is not updated on events other than SYNC, this
> would lead to the wrong (too low) old sync expected if the last event(s)
> processed from that node where no SYNC events.
>
> Thanks for the detailed example.



My experiments show me that the problem persists. The test-case is the same
as in the previous example (started from scratch):


>
> > 1) Create databases and some tables
> >
> > $ createdb -D data -E utf8 test3
> > $ psql test3
> >
> > test3=3D# create table a (id int primary key);
> > test3=3D# create table b (id int primary key);
> > test3=3D# create table c (id int primary key);
> >
> > $ createdb -T test3 test3_lo
> > $ createdb -T test3 test3_log
> >
> > 2) Create the replication cluster
> >
> > $ cat test3_preamble.slonik
> > cluster name =3D test3;
> >
> > node 1 admin conninfo =3D 'host=3Dlocalhost dbname=3Dtest3 user=3Dpostg=
res
> > password=3Dxxx';
> > node 2 admin conninfo =3D 'host=3Dlocalhost dbname=3Dtest3_lo user=3Dpo=
stgres
> > password=3Dxxx';
> >
> > $ cat test3_init.slonik
> > #!/bin/sh
> >
> > slonik <<_EOF_
> >
> >         include <test3_preamble.slonik>;
> >
> >         init cluster (id=3D1, comment=3D'original database');
> >
> >         store node (id=3D2, comment =3D 'dummy loopback database');
> >         store path (server=3D1, client=3D2, conninfo=3D'dbname=3Dtest3
> > host=3Dlocalhost user=3Dpostgres password=3Dxxx');
> >         store path (server=3D2, client=3D1, conninfo=3D'dbname=3Dtest3_=
lo
> > host=3Dlocalhost user=3Dpostgres password=3Dxxx');
> >         store listen (origin=3D1, provider=3D1, receiver=3D2);
> >         store listen (origin=3D2, provider=3D2, receiver=3D1);
> >
> > _EOF_
> >
> > $ ./test3_init.slonik
> >
> > $ slon test3 "dbname=3Dtest3 host=3Dlocalhost user=3Dpostgres password=
=3Dxxx"
> > $ slon -a /tmp/logshipping/test3 test3 "dbname=3Dtest3_lo host=3Dlocalh=
ost
> > user=3Dpostgres password=3Dxxx"
> >
> > $ cat test3_create_set.slonik
> > #!/bin/sh
> >
> > if [ $# -ne 3 ]; then
> >         echo "Usage: $0 SETID TABLEID TABLENAME"
> >         echo "Example: $0 3 17 public.x"
> >         exit 1
> > fi
> >
> > slonik <<_EOF_
> >
> >         include <test3_preamble.slonik>;
> >
> >         create set (id=3D$1, origin=3D1, comment=3D'Set #$1');
> >         set add table (set id =3D $1, origin =3D 1, id =3D $2, fully qu=
alified
> > name =3D '$3');
> >
> > _EOF_
> >
> > $ ./test3_create_set.slonik 1 1 public.a
> > $ cat test3_subscribe.slonik
> > #!/bin/sh
> >
> > if [ -z $1 ]; then
> >         echo "Usage: $0 SETID"
> >         exit 1;
> > fi
> >
> > slonik <<_EOF_
> >
> >         include <test3_preamble.slonik>;
> >
> >         subscribe set (id=3D$1, provider=3D1, receiver=3D2, forward=3Dn=
o);
> >
> > _EOF_
> >
> > $ ./test3_subscribe.slonik 1
> > $ psql test3_lo
> > test3_lo=3D# \d a
> >        Table " public.a"
> >  Column |  Type   | Modifiers
> > --------+---------+-----------
> >  id     | integer | not null
> > Indexes:
> >     "a_pkey" PRIMARY KEY, btree (id)
> > Triggers:
> >     _test3_denyaccess_1 BEFORE INSERT OR DELETE OR UPDATE ON a FOR EACH
> > ROW EXECUTE PROCEDURE _test3.denyaccess('_test3')
> >
> > $ psql test3
> > test3=3D# insert into a select * from generate_series(1,100);
> >
> > 3) Dump the state of the cluster to logshipping database
> >
> > $ ./slony1_dump.sh test3_lo test3 | psql test3_log
> > $ psql test3_log
> > test3_log=3D# select count(*) from a;
> >  count
> > -------
> >    100
> > (1 row)
> >
> > 4) Insert some data to origin db and see if it gets to logshipping db
> >
> > test3=3D# insert into a select * from generate_series(101,200);
> >
> > $ psql -d test3_log -f
> > /tmp/logshipping/test3/slony1_log_1_00000000000000000060.sql
> > $ psql test3_log
> > test3_log=3D# select count(*) from a;
> >  count
> > -------
> >    200
> > (1 row)
> >
> > 5) Create another replication set
> >
> > $ ./test3_create_set.slonik 10 2 public.b
> > $ ./test3_subscribe.slonik 10
> >
> > 6) Start applying logfiles to logshipping db
> >


Here's the new output:


$ psql -d test3_log -f slony1_log_1_00000000000000000053.sql
START TRANSACTION
psql:slony1_log_1_00000000000000000053.sql:6: NOTICE:  Slony-I: Process set
1 sync 53 time 2007-08-06 09:30:03.823825+03
 setsynctracking_offline
-------------------------
                      53
(1 row)

COMMIT
VACUUM

$ psql -d test3_log -f slony1_log_1_00000000000000000054.sql
START TRANSACTION
psql:slony1_log_1_00000000000000000054.sql:6: NOTICE:  Slony-I: Process set
1 sync 54 time 2007-08-06 09:30:03.823825+03
 setsynctracking_offline
-------------------------
                      54
(1 row)

DELETE 0
 finishtableaftercopy
----------------------
                    1
(1 row)

ANALYZE
INSERT 0 1
COMMIT
VACUUM


$ psql -d test3_log -f slony1_log_1_00000000000000000055.sql
START TRANSACTION
psql:slony1_log_1_00000000000000000055.sql:5: NOTICE:  Slony-I: Process set
1 sync 55 time 2007-08-06 09:30:10.843406+03
 setsynctracking_offline
-------------------------
                      55
(1 row)

psql:slony1_log_1_00000000000000000055.sql:11: ERROR:  Slony-I: set 10 is on
sync 52, this archive log expects 54


Here's the contents of the sync-files 54 and 55:

$ cat slony1_log_1_00000000000000000054.sql
-- Slony-I log shipping archive
-- Node 1, Event 54
start transaction;
-- ENABLE_SUBSCRIPTION

select "_test3".setsyncTracking_offline(1, '53', '54', '2007-08-06 09:30:
03.823825');
-- end of log archiving header
------------------------------------------------------------------
-- start of Slony-I data
------------------------------------------------------------------
delete from "public"."b";
copy "public"."b" ("id") from stdin;
\.
select "_test3".finishTableAfterCopy(2); analyze "public"."b";
insert into "_test3".sl_setsync_offline (ssy_setid, ssy_seqno) values ('10',
'52');

------------------------------------------------------------------
-- End Of Archive Log
------------------------------------------------------------------
commit;
vacuum analyze "_test3".sl_setsync_offline;



$ cat slony1_log_1_00000000000000000055.sql
-- Slony-I log shipping archive
-- Node 1, Event 55
start transaction;

select "_test3".setsyncTracking_offline(1, '54', '55', '2007-08-06 09:30:
10.843406');
-- end of log archiving header
------------------------------------------------------------------
-- start of Slony-I data
------------------------------------------------------------------

select "_test3".setsyncTracking_offline(10, '54', '55', '2007-08-06 09:30:
10.843406');
-- end of log archiving header
------------------------------------------------------------------
-- start of Slony-I data
------------------------------------------------------------------

------------------------------------------------------------------
-- End Of Archive Log
------------------------------------------------------------------
commit;
vacuum analyze "_test3".sl_setsync_offline;



Here's the contents of the setsync-table:


test3_log=3D# select * from _test3.sl_setsync_offline ;
 ssy_setid | ssy_seqno |         ssy_synctime
-----------+-----------+-------------------------------
         1 |        54 | 2007-08-06 09:30:03.823825+03
        10 |        52 |
(2 rows)



$ slon -v
slon version 1.2.11

Any more info I can provide?

Regards

Mikko
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.slony.info/pipermail/slony1-general/attachments/20070806/=
d4623d69/attachment.htm


More information about the Slony1-general mailing list