Wireshark-dev: Re: [Wireshark-dev] [Wireshark-commits] rev 49551: /trunk/epan/ /trunk/epan/: re

From: Evan Huus <eapache@xxxxxxxxx>
Date: Thu, 23 May 2013 22:58:32 -0400
On Thu, May 23, 2013 at 10:31 PM, <eapache@xxxxxxxxxxxxx> wrote:
>
> http://anonsvn.wireshark.org/viewvc/viewvc.cgi?view=rev&revision=49551
>
> User: eapache
> Date: 2013/05/23 07:31 PM
>
> Log:
>  Add an optimization to req_resp_hdrs_do_reassembly that shaves about 20% off
>  the load time of one of my sample captures that is HTTP-but-not-really.
>
>  Also add modelines.
>
> Directory: /trunk/epan/
>   Changes    Path               Action
>   +23 -0     req_resp_hdrs.c    Modified

The capture in question (which I can't share unfortunately) still
takes far too long to load. Trunk builds take ~8 seconds (~10 before
this commit) whereas 1.8 builds load it in about 0.6 seconds.

It consists of many thousands of packets containing very short
payloads that look enough like HTTP to fool the heuristics, but never
end "properly" (with an \r\n \r\n). The desegmentation logic loops
through each line, then requests more reassembly because it hasn't
seen an \r\n \r\n.

The TCP layer obliges, sticks another couple of bytes (one short
payload) on the end and sends the new packet back to HTTP - which
reprocesses the entire thing again from the beginning. This leads to a
quadratic-time processing of the packets (limited by how many frames
TCP is willing to jam together).

The following patch seems to fix the issue (brings trunk back down to
~0.7 seconds):

Index: epan/req_resp_hdrs.c
===================================================================
--- epan/req_resp_hdrs.c    (revision 49551)
+++ epan/req_resp_hdrs.c    (working copy)
@@ -107,7 +107,7 @@
              * won't help, as those bytes weren't captured).
              */
             if (reported_length_remaining < 1) {
-                pinfo->desegment_offset = offset;
+                pinfo->desegment_offset = next_offset_sav;
                 pinfo->desegment_len = DESEGMENT_ONE_MORE_SEGMENT;
                 return FALSE;
             }
@@ -127,7 +127,7 @@
                  * Not enough data; ask for one more
                  * byte.
                  */
-                pinfo->desegment_offset = offset;
+                pinfo->desegment_offset = next_offset_sav;
                 pinfo->desegment_len = DESEGMENT_ONE_MORE_SEGMENT;
                 return FALSE;
             }

However, I strongly suspect that this would break real HTTP
desegmentation. I'm at a bit of a loss as to why this isn't showing up
in 1.8 as well - req_res_hdrs.c hasn't been touched in ages.

Any insight would be appreciated.

Thanks,
Evan