Discussion:
[pfSense] Connection problems
Oliver Schad
2013-08-24 08:17:56 UTC
Hi all,

I have some connection problems with a new pfSense pair: I use a
monitoring system which uses a SQL database to store all monitored
data. Since yesterday this traffic has gone through the pfSense pair.

Before that, everything worked fine for more than a year with a simple
router in between.

Since migrating to pfSense I get an occasional connection warning
from the monitoring system, but it's hard to track down because it
happens only about every 2 hours.

I thought it could be a timeout on some TCP connection in the
connection tracking of the pfSense.

At the moment everything is allowed on the pfSense, so it shouldn't
block anything.

Do you have some hints for me to tune the connection tracking timeouts
or maybe to debug this?

Best Regards
Oli
Oliver Schad
2013-08-24 08:26:13 UTC
On Sat, 24 Aug 2013 10:17:56 +0200
Post by Oliver Schad
Do you have some hints for me to tune the connection tracking timeouts
or maybe to debug this?
Additional version information:

2.1-RC1 (amd64)
built on Thu Aug 22 23:49:10 EDT 2013
FreeBSD 8.3-RELEASE-p10

Hardware information:

Intel Xeon CPU E3-1220 v3 @ 3.10GHz
8 GB RAM
Intel Dual Port GBit-NIC, 82575EB
120 GB SSD

Traffic information:

Current traffic is roughly 1-1.5 Mbit/s continuously, without any big
peaks.

Connections:

Roughly 3000 connections are open continuously, with no big variation.

Best Regards
Oli
Oliver Schad
2013-08-24 15:17:41 UTC
Some more information:

$ pfctl -st
tcp.first 120s
tcp.opening 30s
tcp.established 86400s
tcp.closing 900s
tcp.finwait 45s
tcp.closed 90s
tcp.tsdiff 30s
udp.first 60s
udp.single 30s
udp.multiple 60s
icmp.first 20s
icmp.error 10s
other.first 60s
other.single 30s
other.multiple 60s
frag 30s
interval 10s
adaptive.start 0 states
adaptive.end 0 states
src.track 0s
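
A minimal Python sketch of how the timeout list above could be checked against the roughly 2-hour interval between warnings. The sample text is a subset of the output above; the helper names and the 2-hour idle period are assumptions made for illustration only.

```python
# Subset of the `pfctl -st` output shown above.
PFCTL_ST = """\
tcp.first 120s
tcp.opening 30s
tcp.established 86400s
tcp.closing 900s
tcp.finwait 45s
tcp.closed 90s
"""


def parse_timeouts(text):
    """Return {name: seconds} for lines of the form 'name <N>s'."""
    timeouts = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].endswith("s") and parts[1][:-1].isdigit():
            timeouts[parts[0]] = int(parts[1][:-1])
    return timeouts


def can_expire_within(timeouts, idle_seconds):
    """Names of the timeouts that are shorter than the given idle period."""
    return sorted(name for name, secs in timeouts.items() if secs < idle_seconds)


timeouts = parse_timeouts(PFCTL_ST)
# Hypothetical idle period: 2 hours, the rough interval between warnings.
print(timeouts["tcp.established"])
print(can_expire_within(timeouts, 2 * 3600))
```

With these values tcp.established (24 hours) is far longer than 2 hours, so an established but idle connection would not be expected to expire after that time. The shorter timeouts apply to connections that are still opening or already closing.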

From the PostgreSQL log:

LOG: could not receive data from client: Connection timed out
LOG: unexpected EOF on client connection

These lines appear many times.

Best Regards
Oli
Oliver Schad
2013-08-24 16:47:42 UTC
State table

$ pfctl -sm
states hard limit 896000
src-nodes hard limit 896000
frags hard limit 5000
tables hard limit 3000
table-entries hard limit 200000

$ pfctl -si
Status: Enabled for 0 days 19:18:06           Debug: Urgent

Interface Stats for igb1_vlan30       IPv4             IPv6
  Bytes In                          112457                0
  Bytes Out                         536908               96
  Packets In
    Passed                             728                0
    Blocked                              0                0
  Packets Out
    Passed                             533                1
    Blocked                              0                0

State Table                          Total             Rate
  current entries                     3953
  searches                        56452552          812.4/s
  inserts                          1859296           26.8/s
  removals                         1855343           26.7/s
Counters
  match                            1879562           27.0/s
  bad-offset                             0            0.0/s
  fragment                               0            0.0/s
  short                                  0            0.0/s
  normalize                              0            0.0/s
  memory                                 0            0.0/s
  bad-timestamp                          0            0.0/s
  congestion                             0            0.0/s
  ip-option                           1110            0.0/s
  proto-cksum                            7            0.0/s
  state-mismatch                      1177            0.0/s
  state-insert                           0            0.0/s
  state-limit                            0            0.0/s
  src-limit                              0            0.0/s
  synproxy                               0            0.0/s
  divert                                 0            0.0/s
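
A minimal Python sketch of how the counters above could be scanned for anything other than zero, and how full the state table is. The sample is a subset of the output above; the helper name is hypothetical. As far as I know, state-mismatch counts packets that matched an existing state entry but failed the state checks, so treat that interpretation as an assumption to verify against the pf documentation.

```python
# Subset of the "Counters" section of the `pfctl -si` output shown above.
PFCTL_SI_COUNTERS = """\
match 1879562 27.0/s
memory 0 0.0/s
congestion 0 0.0/s
ip-option 1110 0.0/s
proto-cksum 7 0.0/s
state-mismatch 1177 0.0/s
state-limit 0 0.0/s
"""


def parse_counters(text):
    """Return {counter: total} for lines of the form 'name <total> <rate>/s'."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[1].isdigit() and parts[2].endswith("/s"):
            counters[parts[0]] = int(parts[1])
    return counters


counters = parse_counters(PFCTL_SI_COUNTERS)
# "match" is ordinary rule matching, so it is left out of the report.
nonzero = {name: n for name, n in counters.items() if n > 0 and name != "match"}
print(nonzero)

# State table usage: "current entries" versus the "states hard limit"
# reported by `pfctl -sm`.
current_entries, hard_limit = 3953, 896000
usage_percent = 100 * current_entries / hard_limit
print(f"{usage_percent:.2f}% of the state limit in use")
```

With these numbers the state table is nowhere near its limit, and the memory, congestion and state-limit counters are zero, so resource exhaustion looks unlikely. The nonzero state-mismatch counter is the one that stands out.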
Oliver Schad
2013-08-24 19:54:15 UTC
I've switched over to the HA spare device - same behaviour.

Best Regards
Oli
Chris Buechler
2013-08-25 04:14:36 UTC
On Sat, Aug 24, 2013 at 3:17 AM, Oliver Schad
Post by Oliver Schad
Hi all,
I have some connection problems with a new pfSense pair: I use a
monitoring system which uses a SQL database to store all monitored
data. Since yesterday this traffic has gone through the pfSense pair.
Before that, everything worked fine for more than a year with a simple
router in between.
Since migrating to pfSense I get an occasional connection warning
from the monitoring system, but it's hard to track down because it
happens only about every 2 hours.
Unlikely it's a TCP timeout given the timing of it, likely you have
asymmetric routing somewhere, which is fine with a plain router, but
not fine with a stateful firewall. It can be worked around with sloppy
state rules, but how and where depends on the network setup in
general, where that's happening, and if it's definitely the case.
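
A toy Python model of the point above: a stateful firewall that only sees one direction of a TCP connection cannot follow the peer's ACKs, so its idea of the allowed sequence range goes stale. This is a deliberately simplified illustration, not pf's actual algorithm; the class, function and parameter names are all hypothetical.

```python
class ToyStateTracker:
    """Very simplified TCP window tracking in a stateful firewall.

    NOT pf's real algorithm; it only shows why missing one direction of
    a connection can lead to dropped packets.
    """

    def __init__(self, initial_window, sloppy=False):
        self.sloppy = sloppy
        # Highest sequence number the sender may use, as far as the
        # firewall can tell from the peer's ACKs and advertised window.
        self.send_limit = initial_window
        self.dropped = 0

    def see_ack_from_peer(self, ack, window):
        """A return-path packet was observed: the allowed range advances."""
        self.send_limit = max(self.send_limit, ack + window)

    def see_data_from_sender(self, seq, length):
        """Return True if the segment passes, False if it is dropped."""
        if self.sloppy or seq + length <= self.send_limit:
            return True
        self.dropped += 1
        return False


def run(symmetric, sloppy=False, segments=10, size=1000, window=4000):
    """Send `segments` data segments and return how many were dropped."""
    fw = ToyStateTracker(initial_window=window, sloppy=sloppy)
    seq = 0
    for _ in range(segments):
        passed = fw.see_data_from_sender(seq, size)
        seq += size
        if symmetric and passed:
            # The peer's ACK returns through the same firewall.
            fw.see_ack_from_peer(ack=seq, window=window)
    return fw.dropped


print(run(symmetric=True))                # both directions seen
print(run(symmetric=False))               # return path bypasses the firewall
print(run(symmetric=False, sloppy=True))  # sequence checks skipped
```

In this model the symmetric case drops nothing, the asymmetric case starts dropping once the sender moves past the initial window, and the sloppy variant passes everything because it skips the sequence check altogether. A plain router keeps no such state, which is why it would not be affected.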
Oliver Schad
2013-08-25 15:36:45 UTC
On Sat, 24 Aug 2013 23:14:36 -0500
Post by Chris Buechler
Unlikely it's a TCP timeout given the timing of it, likely you have
asymmetric routing somewhere, which is fine with a plain router, but
not fine with a stateful firewall. It can be worked around with sloppy
state rules, but how and where depends on the network setup in
general, where that's happening, and if it's definitely the case.
There is only one router in between, so it is symmetric. But I have an
overlapping subnet configuration with more specific routes.
Maybe this is the point where it breaks.
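
To illustrate how overlapping subnets with more specific routes behave, here is a minimal Python sketch of longest-prefix matching using the standard ipaddress module. The networks and next hops are hypothetical and not taken from this setup.

```python
import ipaddress

# Hypothetical routing table: a covering subnet plus a more specific
# route inside it. None of these addresses come from the thread.
ROUTES = [
    (ipaddress.ip_network("10.0.0.0/16"), "directly connected"),
    (ipaddress.ip_network("10.0.30.0/24"), "via 10.0.0.254"),
]


def lookup(destination):
    """Longest-prefix match: the most specific containing route wins."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in ROUTES if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


print(lookup("10.0.30.5"))  # falls inside both networks
print(lookup("10.0.1.5"))   # only inside the /16
```

If one device only knows the broader prefix and treats the destination as directly reachable, while another follows the more specific route, the two directions of a connection can take different paths. That would be one possible source of asymmetry even with a single router in between.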

But I'm wondering why this breaks only every 1 or 2 hours with so many
connections.

I will review my setup and give you feedback. Thank you for your hint.

Best Regards
Oli
Oliver Schad
2013-08-26 12:07:05 UTC
On Sun, 25 Aug 2013 17:36:45 +0200
Post by Oliver Schad
On Sat, 24 Aug 2013 23:14:36 -0500
Post by Chris Buechler
Unlikely it's a TCP timeout given the timing of it, likely you have
asymmetric routing somewhere, which is fine with a plain router, but
not fine with a stateful firewall. It can be worked around with
sloppy state rules, but how and where depends on the network setup
in general, where that's happening, and if it's definitely the case.
There is only one router in between, so it is symmetric. But I have an
overlapping subnet configuration with more specific routes.
Maybe this is the point where it breaks.
But I'm wondering why this breaks only every 1 or 2 hours with so many
connections.
So what I can say is that without any filtering (Advanced ->
Firewall/NAT -> disable firewall) it works.

I will migrate the destination network with the DB for testing and
report again. After that, both networks will be managed by the pfSense
directly.

I don't see a mistake in the routing setup and I don't understand why
routing should fail once every one or two hours.

Best Regards
Oli
Oliver Schad
2013-08-27 16:33:46 UTC
On Mon, 26 Aug 2013 14:07:05 +0200
Post by Oliver Schad
So what I can say is that without any filtering (Advanced ->
Firewall/NAT -> disable firewall) it works.
I will migrate the destination network with the DB for testing and
report again. After that, both networks will be managed by the pfSense
directly.
I don't see a mistake in the routing setup and I don't understand why
routing should fail once every one or two hours.
After the migration (all related networks are now routed by pfSense
itself) everything works fine. I still don't understand this issue,
which gives me a bad feeling. :-/

@chris: thank you for your help.

Best Regards
Oli
