Christopher Browne cbbrowne
Wed Jun 7 11:48:51 PDT 2006
Andreas Pflug wrote:
> Christopher Browne wrote:
>
>>
>> I don't see there being a material benefit to be found in switching over
>> to a BINARY format. 
>
> This might be true if the binary data is only small; in case of large
> data and small indexes (something like 100GB TOAST, 20MB index size)
> this is certainly not the case.

>
>  Certainly not when it would lead to a loss of
>> portability.
>
> I agree, though for my use-case (only i86) it is acceptable, I used a
> hard-coded slon version. I'd prefer a better COPY format anyway.
>
> Regards,
> Andreas

On an otherwise-idle system, I took that same table and dumped/loaded in
both formats.

org=# select now(); copy trans_log to '/opt/rg/data_org/etl-text';
select now(); copy trans_log to '/opt/rg/data_org/etl-binary' with
binary; select now();
             now            
-----------------------------
 2006-06-07 17:17:02.0727+00
(1 row)

COPY
             now             
------------------------------
 2006-06-07 17:32:40.91574+00
(1 row)

COPY
              now             
-------------------------------
 2006-06-07 17:37:34.133093+00
(1 row)

org=# select now(); copy etl2 from '/opt/rg/data_org/etl-text'; select
now(); copy etl2 from '/opt/rg/data_org/etl-binary' with binary; select
now();
              now             
-------------------------------
 2006-06-07 18:15:10.604745+00
(1 row)

COPY
             now             
------------------------------
 2006-06-07 18:23:07.64762+00
(1 row)

COPY
              now             
-------------------------------
 2006-06-07 18:30:21.711226+00
(1 row)


pgorg at TOR-570-DB9011:/opt/home/pgorg $ ls -l /opt/rg/data_org/etl*
-rw-r--r--   1 pgorg    postgres 2840059926 Jun  7 17:37
/opt/rg/data_org/etl-binary
-rw-r--r--   1 pgorg    postgres 2482854457 Jun  7 17:32
/opt/rg/data_org/etl-text

Note that the binary form is about 14% larger than the text form; that
would add to network costs...

Summarizing the timings observed:

            Text           Binary
          --------------------------
Dump time   15:38          04:53
Load time   07:57          07:14

It is interesting to note that the binary dump was over 3x faster than
the text dump.  Mind you, since this is, in Slony-I, streaming from
source to destination, that savings is entirely illusory.  The actual
savings would be more likely to be the 9.9% savings in loading time.  To
that we must add indexing time; in the case of this particular table, we
know it takes about 44 minutes, which means that the savings to be had,
here, is 43 seconds over a cost of ~52 minutes, which amounts to a
savings of about 1.3%.

I don't have anything handy that makes interesting use of TOAST tables. 

I'd want to see a benchmark showing a *compelling* benefit before
considering it.  The above certainly isn't compelling.





More information about the Slony1-general mailing list