1.3.14: Numerous CLOSE_WAIT connections may cause out-of-con
Posted: Mon Jun 13, 2011 9:07 am
I'm using Gargoyle 1.3.14 with the following setup:
1. ISP
2. ADSL modem
3. Gateway - DIR-655, WAN=PPPoE, LAN=192.168.48.1 (DHCP server enabled)
4. Gateway - Gargoyle 1.3.14 on WRT-54GL v1.1, WAN=192.168.48.2 (static)
Symptoms:
1. The TCP NAT table within the DIR-655 contains numerous connections in CLOSE_WAIT state belonging to the Gargoyle (i.e. 192.168.48.2:*).
2. The DIR-655 SPI firewall frequently complains about invalid FIN:ACKs and RST:ACKs (i.e. they do not refer to valid connections).
Analysis:
1. DST IP+Port within FIN:ACKs and RST:ACKs on the DIR-655 refer to connections in CLOSE_WAIT initiated from the Gargoyle router.
2. The SRC IP within FIN:ACKs and RST:ACKs received by the DIR-655 from the Gargoyle router is NOT the WAN IP of the Gargoyle; instead it is the IP of an arbitrary client inside the Gargoyle's NAT (i.e. 192.168.1.*).
3. Googling for gargoyle close_wait reveals links to Gargoyle sources (e.g. http://www.gargoyle-router.com/gargoyle ... sh/uip.cpp) which contain the following comment: "CLOSED and LISTEN are not handled here. CLOSE_WAIT is not implemented, since we force the application to close when the peer sends a FIN (hence the application goes directly from ESTABLISHED to LAST_ACK)"
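To make the check in item 2 concrete, here is a minimal Python sketch of how I sort the source addresses logged by the DIR-655. The function name classify_source and the /24 netmask for the Gargoyle LAN are my own assumptions for illustration; only the addresses 192.168.48.2 and 192.168.1.* come from the setup above.

```python
import ipaddress

# Address taken from the setup described above.
GARGOYLE_WAN = ipaddress.ip_address("192.168.48.2")
# Assumption: the Gargoyle LAN (192.168.1.*) uses a /24 netmask.
GARGOYLE_LAN = ipaddress.ip_network("192.168.1.0/24")


def classify_source(src_ip):
    """Classify the SRC IP of a packet seen on the DIR-655 LAN side."""
    addr = ipaddress.ip_address(src_ip)
    if addr == GARGOYLE_WAN:
        # Source was rewritten by the Gargoyle NAT, as expected.
        return "translated"
    if addr in GARGOYLE_LAN:
        # Source is still an address from inside the Gargoyle NAT.
        return "leaked"
    return "other"
```

Every FIN:ACK and RST:ACK that the DIR-655 flags as invalid would fall into the "leaked" class under this scheme.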
Deductions:
1. So far, it appears that Gargoyle does not fully implement the RFC793 state transitions.
2. When the remote peer closes a connection, Gargoyle prematurely purges the connection (instead of marking it CLOSE_WAIT).
3. At the time the NAT client sends its FIN:ACK, Gargoyle no longer has any NAT mapping for the connection and chooses to forward the FIN:ACK unchanged (i.e. IP + ports from INSIDE its NAT).
4. The peer receiving the invalid forwarded FIN:ACK is unable to identify the connection the FIN:ACK applies to. Thus the connection remains in CLOSE_WAIT at the peer. In this setup the peer is the upstream DIR-655, which logs the invalid FIN:ACK and applies its normal TCP in-session timeout (i.e. two hours), thus filling its NAT table with numerous CLOSE_WAIT entries.
5. Because the peer never receives a valid FIN:ACK, it sends no ACK and the NATted client receives none, eventually causing the client to send a final RST:ACK in desperation.
6. This RST:ACK is also forwarded by Gargoyle to the peer (again with SRC fields indicating IP+port from WITHIN the NAT), which has no option other than to ignore (and log) the invalid RST:ACK.
7. Thus the peer (in this setup both the DIR-655 and the ultimate peer on the Internet) will never receive confirmation that the connection has been closed.
TCP connection terminations initiated by remote peers can only reach the CLOSED state through timeouts, possibly causing excessive resource consumption.
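The deductions above can be illustrated with a toy model. This is only a sketch of the behaviour I suspect, not Gargoyle's actual code: the class NatTable, the flag purge_on_remote_fin, the client address 192.168.1.100 and all port numbers are hypothetical names and values chosen for the example.

```python
class NatTable:
    """Toy source-NAT table; not Gargoyle's real implementation."""

    def __init__(self, wan_ip, purge_on_remote_fin):
        self.wan_ip = wan_ip
        # True models the suspected bug: mapping dropped on the peer's FIN.
        self.purge_on_remote_fin = purge_on_remote_fin
        self.mappings = {}       # (lan_ip, lan_port) -> wan_port
        self.next_port = 40000   # hypothetical first translated port

    def outbound(self, src_ip, src_port, flags):
        """Return the (ip, port) the upstream device sees as the source."""
        key = (src_ip, src_port)
        if key not in self.mappings:
            if "SYN" not in flags:
                # No mapping and not a new connection: forwarded unchanged,
                # which is what deduction 3 describes.
                return (src_ip, src_port)
            self.mappings[key] = self.next_port
            self.next_port += 1
        return (self.wan_ip, self.mappings[key])

    def inbound_fin(self, wan_port):
        """The remote peer closes its side of the connection."""
        if self.purge_on_remote_fin:
            for key, port in list(self.mappings.items()):
                if port == wan_port:
                    del self.mappings[key]
        # Otherwise the mapping is kept, as a CLOSE_WAIT-aware NAT would do.


def client_fin_source(purge_on_remote_fin):
    """Source address of the client's FIN:ACK as seen upstream."""
    nat = NatTable("192.168.48.2", purge_on_remote_fin)
    _, wan_port = nat.outbound("192.168.1.100", 51000, {"SYN"})
    nat.inbound_fin(wan_port)
    return nat.outbound("192.168.1.100", 51000, {"FIN", "ACK"})
```

With purge_on_remote_fin=True the client's FIN:ACK leaves with its internal address (192.168.1.100:51000), matching what the DIR-655 logs. With purge_on_remote_fin=False the mapping survives the remote FIN and the FIN:ACK leaves with the Gargoyle WAN address, which is what I would expect from a NAT that tracks CLOSE_WAIT.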