Discussion:
[pfSense] terrible performance on NFS & CIFS
Adam Thompson
2014-11-06 00:47:21 UTC
Permalink
Problem: really, really bad performance (<10Mbps) on both NFS (both tcp
and udp) and CIFS through pfSense.

Proximate cause: running a packet capture on the Client shows one
smoking gun - the TCP window size on packets sent from the client is
always ~1444 bytes. Packets arriving from the server show a TCP window
size of ~32k.
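
For reference, a capture check along these lines shows the advertised window
on each segment; the interface name, capture file name and field names below
are placeholder assumptions, not details taken from this setup:

   # live: tcpdump prints the advertised window as "win NNNN" per segment
   tcpdump -ni em0 'tcp port 2049 or tcp port 445'

   # offline: dump source address and advertised window for each packet
   tshark -r client-side.pcap -T fields -e ip.src -e tcp.window_size

Capturing the same connection on both sides of the firewall makes it easy to
see whether the window (or any other TCP field) changes in transit.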


The Network:
           +------+
           |Router|
           +--+---+
              |
      ---+----+----+---
         |         |
     +---+--+ +---+---+
     |Client| |pfSense|
     +------+ +---+---+
                  |
               ---+---
                  |
              +---+--+
              |Server|
              +------+

- Client and pfSense both have Router as default gateway.
- pfSense has custom outbound NAT rules preventing NAT between the
Server subnet and the Client subnet, but NAT'ing all other outbound
connections.
- Router has a static route pointing to the Server subnet via pfSense
(sketched below).
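
A rough sketch of that arrangement in pf/route terms, using made-up addresses
(192.0.2.0/24 for the Client subnet, 198.51.100.0/24 for the Server subnet,
192.0.2.1 for pfSense's WAN address) purely for illustration; on pfSense 2.1.x
these rules are generated by the GUI rather than written by hand:

   # on pfSense: no NAT between the two internal subnets, NAT everything
   # else leaving the WAN interface (old-style pf NAT syntax)
   no nat on em0 from 198.51.100.0/24 to 192.0.2.0/24
   nat on em0 from 198.51.100.0/24 to any -> (em0)

   # on Router (OpenBSD): reach the Server subnet via pfSense
   route add -net 198.51.100.0/24 192.0.2.1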

Hardware:
Router is an OpenBSD system (a CARP cluster, actually) running on
silly-overpowered hardware.
Client is actually multiple systems, ranging from laptops to
high-end servers.
Server is a Xeon E3-1230v3 running Linux, exporting a filesystem
via both NFS (v2, v3 & v4) and CIFS (Samba).
pfSense is v2.1.5 (i386) on a dual P-III 1.1GHz; CPU usage
typically peaks at around 5%.


Performance on the local Server subnet (i.e. from a same-subnet client) is
very good on all protocols, nearly saturating the gigabit link.
Traffic outbound from the server subnet to the internet (via Router)
moves at a decent pace; this firewall can typically handle ~400Mbps
without any trouble, and IIRC synthetic benchmarks previously showed it
peaking at over 800Mbps.

Based on the FUBAR TCP window sizes I've observed, I assume pfSense is
doing something to my TCP connections... but why are only the non-NAT'd
connections affected? I know there's an option to disable pf scrub, but
that's only supposed to affect NFSv3 (AFAIK), and this also affects
NFSv4-over-TCP and CIFS.
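
For what it's worth, the scrub behaviour is easy to inspect before deciding
whether to disable it; the file path is the usual pfSense 2.x location for
the generated ruleset, and the example scrub line is only typical output, not
taken from this box:

   # on the pfSense shell: see what scrub rules the GUI generated
   grep scrub /tmp/rules.debug
   #   e.g.  scrub in on $WAN all fragment reassemble

There is also a system-wide checkbox (something like "Disable Firewall Scrub"
under System > Advanced) that removes scrub entirely, which makes for a clean
A/B test.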
--
-Adam Thompson
***@athompso.net
Sean
2014-11-06 22:58:35 UTC
Permalink
Not a TCP expert, but the MTU is nearly always 1500 (or just under), hence
your limit. Sending packets greater than the MTU will lead to
fragmentation. Fragmentation leads to re-transmissions (depending on the
do-not-fragment bit?) and performance problems. Performance problems lead
to frustration and anger. Anger leads to the dark side of the force.

You can increase the MTU to something like 9000 if you enable jumbo
frames, but you'd need to support it across the board (pfSense, routers,
switches?, servers, etc.). It's a hassle, probably not worth the effort in
terms of gains. Some people do it as a means to increase iSCSI traffic
performance, but others say the throughput gain is dubious at best. I would
make sure some doofus didn't enable jumbo frames on your NFS server; if
so, turn it off, and check the MTU setting in the network stack on the
NFS server as well.
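
A quick way to sanity-check the MTU story on the endpoints, assuming Linux
hosts; the interface name and hostname are placeholders:

   # configured MTU on the interface
   ip link show dev eth0 | grep -o 'mtu [0-9]*'

   # 1500 - 20 (IP) - 8 (ICMP) = 1472 bytes of payload; -M do sets the
   # don't-fragment bit, so this fails loudly if anything in the path
   # fragments or has a smaller MTU
   ping -c 3 -M do -s 1472 server.example.net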

I may not know what the hell I'm talking about, though, so someone else can
feel free to jump in and tell me what an idiot I am.
Adam Thompson
2014-11-06 23:09:19 UTC
Permalink
Well, that would definitely cause a problem if it were the case, but...
1) TCP window size != MTU (see the handshake check below),
2) all switches and Router (but not pfSense) can handle 9000-byte frames anyway,
3) MTUs on server and client are both standard, at 1514,
4) I can confirm no fragmentation is occurring.
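
Since the advertised window is negotiated at connection setup, the handshake
is the place to look; a sketch of that check, with placeholder interface and
port, assuming Linux endpoints:

   # is window scaling enabled on the endpoints?  (1 = yes)
   sysctl net.ipv4.tcp_window_scaling

   # capture only SYN/SYN-ACK segments of a test connection and compare the
   # "mss", "wscale" and "win" values seen on each side of the firewall
   tcpdump -ni eth0 'tcp[tcpflags] & tcp-syn != 0 and port 445'

If the options on the SYN differ between the two capture points, whatever
sits in between is rewriting them.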

Still don't know why performance is so bad, though.

-Adam
Espen Johansen
2014-11-06 23:14:52 UTC
Permalink
Just a guess, but... any chance you have BCM NICs?
Sean
2014-11-06 23:12:59 UTC
Permalink
I strongly recommend not tinkering with your MTU setting and instead
correcting the setting on the server side...

I think you should start reading here:
http://nfs.sourceforge.net/nfs-howto/ar01s05.html
5.3. Overflow of Fragmented Packets
Using an *rsize* or *wsize* larger than your network's MTU (often set to
1500, in many networks) will cause IP packet fragmentation when using NFS
over UDP. IP packet fragmentation and reassembly require a significant
amount of CPU resource at both ends of a network connection. In addition,
packet fragmentation also exposes your network traffic to greater
unreliability, since a complete RPC request must be retransmitted if a UDP
packet fragment is dropped for any reason. Any increase of RPC
retransmissions, along with the possibility of increased timeouts, are the
single worst impediment to performance for NFS over UDP.
Packets may be dropped for many reasons. If your network topography is
complex, fragment routes may differ, and may not all arrive at the Server
for reassembly. NFS Server capacity may also be an issue, since the kernel
has a limit of how many fragments it can buffer before it starts throwing
away packets. With kernels that support the /proc filesystem, you can
monitor the files /proc/sys/net/ipv4/ipfrag_high_thresh and
/proc/sys/net/ipv4/ipfrag_low_thresh. Once the number of unprocessed,
fragmented packets reaches the number specified by *ipfrag_high_thresh* (in
bytes), the kernel will simply start throwing away fragmented packets until
the number of incomplete packets reaches the number specified by
*ipfrag_low_thresh*.
Another counter to monitor is *IP: ReasmFails* in the file /proc/net/snmp;
this is the number of fragment reassembly failures. If it goes up too
quickly during heavy file activity, you may have a problem.
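
The thresholds and counters mentioned in that excerpt are quick to check on a
Linux NFS server, and rsize/wsize can be pinned on the client; the hostname,
export path and 8192-byte sizes below are illustrative examples only:

   # reassembly thresholds (bytes) and the Ip: counter lines (ReasmFails)
   cat /proc/sys/net/ipv4/ipfrag_high_thresh /proc/sys/net/ipv4/ipfrag_low_thresh
   grep '^Ip:' /proc/net/snmp

   # client-side RPC statistics, including retransmission counts
   nfsstat -rc

   # example mount with explicit transfer sizes
   mount -t nfs -o vers=3,rsize=8192,wsize=8192 server.example.net:/export /mnt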
Since this is not an NFS support list I suggest you let this die here lest
you incur the spite of the moderators. ;-)
Adam Thompson
2014-11-06 23:23:07 UTC
Permalink
Ok, recap again...
- this affects multiple protocols, not just NFS. I've now confirmed it affects SSH as well.
- this only occurs when the server is behind pfSense and the client is on the "outside" of the firewall.
- this problem does not occur in the other direction through pfSense (LAN->WAN).
- to repeat myself, NFS works fine at ~1Gbps between the same client and server without pfSense in the middle.

Ergo, I conclude it's something pfSense-related. Haven't had a chance to turn off scrub yet.
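
A protocol-independent way to pin down that directional behaviour is a plain
TCP throughput test in both directions; this assumes iperf3 is available on
both hosts, and the hostnames are placeholders:

   # connection initiated from the outside toward the Server behind pfSense
   # (the case reported slow here); -R reverses the data direction
   [Server]  iperf3 -s
   [Client]  iperf3 -c server.example.net
   [Client]  iperf3 -c server.example.net -R

   # connection initiated from the inside out (reported fine)
   [Client]  iperf3 -s
   [Server]  iperf3 -c client.example.net

If the slowdown follows the direction of the TCP session rather than the
protocol, that rules out NFS/CIFS specifics entirely.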
-Adam
Sean
2014-11-07 03:53:40 UTC
Permalink
Ah, my bad... I kind of glazed over the CIFS bit. ;-)
Have you compared a packet capture of client traffic while it's on the LAN
performing at ~1Gbps to a capture taken through pfSense?
The TCP window size could be a red herring...?
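
One crude but effective way to compare those two captures, assuming a
reasonably recent tshark and two saved pcap files with placeholder names:

   # per-second throughput summary of each capture
   tshark -r same-subnet.pcap     -q -z io,stat,1
   tshark -r through-pfsense.pcap -q -z io,stat,1

   # count retransmissions and zero-window events flagged by the dissector
   tshark -r through-pfsense.pcap -Y tcp.analysis.retransmission | wc -l
   tshark -r through-pfsense.pcap -Y tcp.analysis.zero_window    | wc -l

A big jump in retransmissions or stalls in only one of the captures narrows
things down quickly, whether or not the window size itself is the culprit.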
Paul Heinlein
2014-11-07 22:58:20 UTC
Permalink
I know you said that the CPU runs at ca. 5% load, but personally I'd
be unsure of a P-III-class machine at LAN speeds. What bus connection
do the NICs use? PCI? EISA? A 32-bit PCI bus operating at 33 MHz has a
theoretical maximum bandwidth of 133 Mb/s, and the 64-bit expansion
did little to improve that in any practical way. Plus, pre-MSI PCI
devices notoriously shared interrupts, slowing down device-to-device
transfers. (And just to be cranky, I'll ask if any of the NICs are in
shared PCI/ISA slots, which would squeeze performance even further.)

Have you tested that hardware in a routing capacity with non-pfSense
software?

Does the pfSense box have good DNS service?

Is the cabling flaky?

Is the pfSense box routing between subnets or just bridging? If the
former, what's there when pfSense is not "in the middle"? Another
router? Just a switch?
--
Paul Heinlein
***@madboa.com
45°38' N, 122°6' W
Adam Thompson
2014-11-07 23:25:08 UTC
Permalink
Post by Paul Heinlein
I know you said that the CPU runs at ca. 5% load, but personally I'd
be unsure of a P-III-class machine at LAN speeds. What bus connection
do the NICs use? PCI? EISA? A 32-bit PCI bus operating at 33 MHz has a
theoretical maximum bandwidth of 133 Mb/s, and the 64-bit expansion
did little to improve that in any practical way. Plus, pre-MSI PCI
devices notoriously shared interrupts, slowing down device-to-device
transfers. (And just to be cranky, I'll ask if any of the NICs are in
shared PCI/ISA slots, which would squeeze performance even further.)
Dual P-III 1.1GHz is adequate. The 32-bit PCI bus has a theoretical max
of 133 MBytes/sec, not 133 Mbits/sec, which is substantially faster than
gigabit. The PCI-X standard extended it to 66MHz @ 64bits, quadrupling
the theoretical max to ~533MBytes/sec, more than adequate for the
dual-port, MSI-capable PCI-X ethernet card in there right now.
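
Spelling the bus arithmetic out, using the standard nominal clock rates
rather than anything measured on this box:

   32-bit PCI   @ 33.33 MHz:  4 bytes x 33.33 MHz ~ 133 MByte/s ~ 1.07 Gbit/s
   64-bit PCI-X @ 66.67 MHz:  8 bytes x 66.67 MHz ~ 533 MByte/s ~ 4.27 Gbit/s

So even classic 32-bit/33 MHz PCI nominally has more headroom than a single
gigabit link, which is the point being made above.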
Post by Paul Heinlein
Have you tested that hardware in a routing capacity with non-pfSense
software?
I've tested that machine with that pfSense software - the performance
hit only occurs in one direction.
Post by Paul Heinlein
Does the pfSense box have good DNS service?
Yes. Redundant resolvers are directly attached to pfSense's WAN subnet.
Post by Paul Heinlein
Is the cabling flaky?
No. As I've said several times, the performance hit only occurs in this
specific configuration. Performance is perfectly fine for NAT'd SSH and
HTTP sessions initiated from the LAN side.

It's not a NIC or cabling issue, for an additional reason: every routing
interface on the pfSense box is a VLAN on an LACP trunk. If it were a
cabling or NIC issue, *all* traffic would by definition be affected,
including downloads initiated from the LAN side.
Post by Paul Heinlein
Is the pfSense box routing between subnets or just bridging? If the
former, what's there when pfSense is not "in the middle"? Another
router? Just a switch?
Routing, since it does NAT. When pfSense is not in-circuit (as
described), I'm doing one of two things: moving the client (and/or
server) to another VLAN off the primary router, and/or moving the client
and server together onto the same subnet.

My own testing has demonstrated quite clearly that the massive
performance hit only occurs on TCP sessions going *inbound* from the WAN
to the LAN (relative to pfSense's view of the world).

For now, I've simply moved the server semi-permanently; this was an
unusual and temporary configuration to begin with.
--
-Adam Thompson
***@athompso.net
David Burgess
2014-11-06 23:05:34 UTC
Permalink
Post by Adam Thompson
Problem: really, really bad performance (<10Mbps) on both NFS (both tcp
and udp) and CIFS through pfSense.
In my experience, latency is the big buzzkill for CIFS. It seems like any
latency will slow things down, and the more you have, the worse it gets. I
assumed this has something to do with TCP window size, but I don't know.
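
That relationship can be made concrete: a single TCP connection cannot move
data faster than (advertised window) / (round-trip time). Plugging in the
window sizes reported earlier in the thread, and assuming a purely
illustrative 1 ms RTT through the firewall:

    1444 bytes / 1 ms  =  1,444,000 bytes/s  ~ 11.5 Mbit/s
   32768 bytes / 1 ms  = 32,768,000 bytes/s  ~  262 Mbit/s

A ~1444-byte window at even LAN-scale latency is therefore in the right
ballpark to explain the observed <10Mbps.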


Post by Adam Thompson
I know there's an option to disable pf scrub, but that's only supposed to
affect NFSv3 (AFAIK), and this also affects NFSv4-over-TCP and CIFS.
If it were my setup I would try disabling scrub just to see what the effect
is on transfer speed. If you see a difference then it gives you a place to
start.

db