Wireshark-users: Re: [Wireshark-users] [semi-OT] request second opinion on possible bugs in OS TC

From: Sake Blok <sake@xxxxxxxxxx>
Date: Mon, 17 Jan 2011 00:14:06 +0100
On 16 jan 2011, at 17:00, Alan Tu wrote:

> If after reading this and you're interested in
> helping, please e-mail me individually and I'll reply with the PCAP.

Thanks for sending the PCAP.

> [...]
> This is my assessment of what is going on:
> Frame 1-3: three way handshake, normal
> Frame 4: client sends HTTP GET request, normal
> Frame 5: server ACK frame 4, normal
> Frame 6: server sends payload segment 1 (PS1), normal
> Frame 7: server sends PS2, normal
> Frame 8: client ACK PS2, normal
> For some reason, frame 8 is not received or processed by the server
> (this is a mystery, but not discussed here.)

IMHO, this mystery is what needs to be investigated, as this is the cause of the problem. Here follows my analysis which backs up that statement :-)

> Frame 9: server resends PS1, normal
> ***Frame 10: client receives frame 9, a duplicate of frame 6. Client
> ACK frame 7, but sends a SACK with the segment from frame 6.
> This is clearly incorrect behavior, ref the SACK RFC, RFC2018. The
> client is treating frame 9 as an out of order packet and jumping into
> SACK mode, but frame 9 is merely a duplicate or retransmit. Frame 9
> falls outside the client's receive window (updated after frame 7) and
> should discard it, but doesn't. My theory is that the client (Symbian
> OS TCP stack) is not doing a bounds check on its TCP receive window.

According to the RFC:

"If the data receiver generates SACK
   options under any circumstance, it SHOULD generate them under all
   permitted circumstances."

So it is expected (a SHOULD in the RFC) to use the SACK option when ACKing the retransmission.

Also from the RFC:

"The first SACK block (i.e., the one immediately following the
      kind and length fields in the option) MUST specify the contiguous
      block of data containing the segment which triggered this ACK,
      unless that segment advanced the Acknowledgment Number field in
      the header.  This assures that the ACK with the SACK option
      reflects the most recent change in the data receiver's buffer queue."

This means it has to SACK the block that has just been received and it does.
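To make that reading of the rule concrete, here is a minimal sketch in Python. It is only an illustration of the quoted paragraph, not code from any real TCP stack: the function name `first_sack_block`, the 1460-byte segment size and the sequence numbers are all assumptions of mine, and it ignores sequence number wraparound and merging with other queued blocks.

```python
from typing import Optional, Tuple

# (left edge, right edge); the right edge is the sequence number
# just past the last byte of the block.
Block = Tuple[int, int]


def first_sack_block(rcv_nxt: int, seg_start: int, seg_len: int) -> Optional[Block]:
    """Return the first SACK block for the ACK triggered by a data segment.

    rcv_nxt is the receiver's cumulative ACK point *before* the segment
    arrived. Hypothetical helper: no wraparound handling, no merging
    with neighbouring out-of-order blocks.
    """
    if seg_len == 0:
        return None  # pure ACKs carry no data to report
    seg_end = seg_start + seg_len
    # The segment advances the Acknowledgment Number only if it covers
    # the byte the receiver is currently waiting for.
    if seg_start <= rcv_nxt < seg_end:
        return None
    # Otherwise the block containing the triggering segment is reported.
    return (seg_start, seg_end)


# Assumed numbers: PS1 = bytes 1..1460, PS2 = bytes 1461..2920,
# so after frame 7 the client is waiting for byte 2921.
duplicate_ps1 = first_sack_block(rcv_nxt=2921, seg_start=1, seg_len=1460)
in_order_ps1 = first_sack_block(rcv_nxt=1, seg_start=1, seg_len=1460)
```

With these assumed numbers the retransmitted PS1 does not advance the ACK, so the sketch reports the block (1, 1461), which matches what the client does in frame 10.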

> Frame 11: The server TCP stack has received an invalid SACK and is now
> confused. It retransmits PS1. This is semantically incorrect because
> the client actually indicates it has received PS1.

That would hold *if* the server's TCP stack had received the ACK with SACK, but I don't think it did. If you apply the display filter "tcp.srcport==80", you can clearly see that the server keeps retransmitting the same segment with an increasing retransmission timeout. This is the behavior of a system that does *not* receive any ACKs.
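If you want to check that pattern outside of Wireshark, a rough sketch like the one below could be used. It assumes you have exported the capture timestamps of the repeated PS1 transmissions into a list. The function name `looks_like_rto_backoff`, the tolerance `min_ratio` and the sample timestamps are hypothetical, not values taken from the PCAP. Real stacks also cap the retransmission timeout at some maximum, which this sketch does not model.

```python
from typing import List


def looks_like_rto_backoff(send_times: List[float], min_ratio: float = 1.5) -> bool:
    """Return True if the gaps between successive transmissions of the
    same segment keep growing, as expected from exponential RTO backoff.

    send_times: capture timestamps (seconds) of the original segment and
    its retransmissions, in order. min_ratio is an assumed tolerance:
    textbook backoff doubles the timeout, so each gap should be roughly
    twice the previous one.
    """
    gaps = [later - earlier for earlier, later in zip(send_times, send_times[1:])]
    if len(gaps) < 2:
        return False  # not enough retransmissions to see a trend
    if any(gap <= 0 for gap in gaps):
        return False  # timestamps must be strictly increasing
    return all(nxt >= min_ratio * prev for prev, nxt in zip(gaps, gaps[1:]))


# Made-up timestamps: gaps of 1, 2, 4 and 8 seconds.
backing_off = looks_like_rto_backoff([0.0, 1.0, 3.0, 7.0, 15.0])
# Evenly spaced transmissions do not look like timer backoff.
evenly_spaced = looks_like_rto_backoff([0.0, 0.2, 0.4, 0.6])
```

A sender that was receiving the client's ACKs would stop retransmitting PS1 rather than keep doubling its timer, so a True result for the server's PS1 transmissions supports the idea that the ACKs never arrive.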

> Frame 12: client retransmits frame 10
> Frame 13: server retransmits PS1
> Frame 14: client retransmits frame 10
> Frame 15: server retransmits PS1
> Frame 16: client retransmits frame 10

Actually, the client keeps ACKing the received frame, hoping to reach the server and make it send new data.

> Frame 17: server sends PS3, normal
> Somehow, this "breaks the spell", for the moment.

The "somehow" could be explained by a Keep-Alive timer on the server. As can be seen in the HTTP data, both the client and the server want to use Keep-Alive, so neither of the two should close the connection until a timeout has been reached or the configured maximum number of objects has been served over the same TCP connection.

Since the server waited for an ACK after sending two full frames, it would not normally send all the data at once without waiting for ACKs. So the fact that it sends all the remaining data at once, and the fact that it closes the connection with a FIN, tell me it is flushing its send buffer after the HTTP daemon has told it to close the connection due to a 15 sec idle timeout.

> Frame 18: server sends PS4, normal
> Frame 19: server sends PS5, normal
> Frame 20: server sends PS6, normal
> Frame 21: server sends PS7, normal
> Frame 22: server sends PS8, normal
> Frame 23: server sends PS9, normal
> Frame 24: server sends PS10, normal
> Frame 25: server sends PS12 with FIN, received out of order
> Frame 26: server sends PS11, received out of order
> Frame 27: client ACK PS4, normal
> Frame 28: client ACK PS6, normal
> Frame 29: client ACK PS8, normal
> Frame 30: client ACK PS10, normal
> Frame 31: client ACK PS10, but sends a SACK saying it has received PS12, normal
> Frame 32: client ACK PS12, normal
> Frame 33: client sends FIN/ACK, acknowledging server's FIN from frame 25, normal
> At this point, the client is expecting an ACK to its own FIN.
> Frame 34: for some reason, the client does not receive an ACK to its
> FIN, so two seconds later it retransmits a FIN/ACK, normal

Well, since the server does not seem to receive the packets of the client, it will never respond to these FINs.

> ***Frame 35: server resends PS1, 2.7 seconds after the client sends
> the first FIN in frame 33
> Why oh why does the server (unknown OS) do this? the SACK storm from
> earlier seemed to have broken, the client has acknowledged all the
> later payload segments, the server has sent its FIN, and the client
> has sent its FIN (twice.) PS1 should be out of the server's TCP send
> window anyway.

It should... *IF* it ever received an ACK from the client.

So the main question is... why do the packets from the client never reach the server? Or do they reach the server in a transformed state and get discarded by the server?

Hope this helps,
