Jeff Frost jeff at pgexperts.com
Sun Feb 16 09:46:51 PST 2014
On Feb 15, 2014, at 11:25 PM, Tory M Blue <tmblue at gmail.com> wrote:

> 
> On Sat, Feb 15, 2014 at 10:48 PM, Jeff Frost <jeff at pgexperts.com> wrote:
> It's probably a firewall timing out your PostgreSQL connection while the indexes are being built on the replica.
> 
> Look into tcp keep alive settings.
> 
> 
> Yes, this is what I thought it was when I first started on this, but I didn't make any progress. Keepalive is set to 7200 seconds (2 hours) by default, and this is failing within an hour, so I'll have to look at the firewalls between us. But since I'm connected to these boxes the entire time, from the same network that originates the slon configuration, I doubt the firewalls are reaping the connections.
> 
> Looking at the TCP keepalive settings, I don't think there is any tuning there that can help:
> 
> net.ipv4.tcp_keepalive_time = 7200
> net.ipv4.tcp_keepalive_probes = 9
> net.ipv4.tcp_keepalive_intvl = 75
> 
> Well, maybe I can "reduce this" just to make some interesting traffic happen within that hour+ that the indexes are being created.


Yeah, so a 2-hour keepalive time means that if your firewall times out idle connections after 10 minutes, it's going to kill that idle PostgreSQL connection on you long before the first keepalive probe is ever sent.
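
To put rough numbers on that, here is a small sketch in Python. The 600-second firewall timeout is only the hypothetical 10 minutes from the example, not a measured value, and the function names are made up for illustration. The detection-time formula is the usual approximation for Linux keepalive behaviour.

```python
# Sketch: does the first keepalive probe go out before a firewall drops the
# idle flow, and how long until the kernel gives up on a dead peer?
# FIREWALL_IDLE_TIMEOUT is an assumed value (the 10-minute example), not measured.

FIREWALL_IDLE_TIMEOUT = 600  # seconds, hypothetical


def survives_idle(keepalive_time, firewall_timeout=FIREWALL_IDLE_TIMEOUT):
    """True if the first keepalive probe is sent before the firewall times out."""
    return keepalive_time < firewall_timeout


def dead_peer_detection_time(keepalive_time, probes, interval):
    """Approximate seconds of silence before the connection is declared dead."""
    return keepalive_time + probes * interval


if __name__ == "__main__":
    # Linux defaults quoted above: tcp_keepalive_time=7200, probes=9, intvl=75
    print(survives_idle(7200))                    # False
    print(dead_peer_detection_time(7200, 9, 75))  # 7875
```

With the OS defaults, the connection sits silent for two hours, so anything in the path with a shorter idle timeout wins the race.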

This is common in AWS, and here are the settings I use in Slony 2.2 to fix it:

# TCP keep alive configurations
# Enable sending of TCP keep alive between slon and the PostgreSQL backends
tcp_keepalive = true

# The number of seconds after which a TCP keep alive is sent across an idle
# connection. tcp_keepalive must be enabled for this to take effect. Default
# value of 0 means use operating system default
tcp_keepalive_idle = 5

# The number of keep alive requests to the server that can be lost before
# the connection is declared dead. tcp_keepalive must be on. Default value
# of 0 means use operating system default
tcp_keepalive_count = 10

# The number of seconds in between TCP keep alive requests. tcp_keepalive
# must be enabled. Default value of 0 means use operating system default
tcp_keepalive_interval = 30

That's probably more aggressive than you need, but it should do the trick.
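
For reference, this is roughly what those settings amount to at the socket level. It's a sketch in Python, assuming Linux (the TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT option names are not available everywhere). slon itself is written in C and this is not meant to show how it is implemented; `enable_keepalive` is just an illustrative name.

```python
import socket


def enable_keepalive(sock, idle=5, interval=30, count=10):
    """Turn on TCP keepalive for sock, using the values from the config above."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket overrides of the system-wide defaults. These option names are
    # assumed to exist (they do on Linux); they are skipped where they don't.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)
    return sock


if __name__ == "__main__":
    s = enable_keepalive(socket.socket(socket.AF_INET, socket.SOCK_STREAM))
    # Non-zero means keepalive is enabled on this socket.
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE))
    s.close()
```

The point is that these are per-connection overrides, so you don't have to touch the net.ipv4.tcp_keepalive_* sysctls for the whole box.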
