Wireshark-dev: Re: [Wireshark-dev] Extending time before 2nd analysis pass

From: Peter Wu <peter@xxxxxxxxxxxxx>
Date: Sun, 28 Apr 2019 05:25:56 +0100
Hi Darien,

On Fri, Apr 26, 2019 at 12:02:23PM +0200, Darien Spencer wrote:
>    Hey everyone
> 
>    I wrote a custom C dissector involving my own re-assembly logic.
>    The problem I'm dealing with is:
>    When capturing from high-capacity interfaces, the segments to re-assemble sometimes arrive out of order,
>    and the reorder and reassembly code is either not quick enough (for "late" segments) or too quick (for "early" segments, at the 2nd pass) for the dissector to spot the entire segment
>    sequence.
>    In this case the reassembly logic breaks and doesn't show the right payloads.
> 
>    Also, when reading from a capture file the problem does not exist.
>    I believe this is because the "2nd pass" actually happens after all the packets have been processed (and segments registered) once.

The first pass occurs when a record is read:

- cf_read (for offline captures) / cf_continue_tail (for live captures)
  - read_record (file.c)
    - add_packet_to_packet_list
      - epan_dissect_run_with_taps
        - epan_dissect_run (epan/epan.c)
          - dissect_record

The "second pass" is every other dissection attempt after that. That
includes:

1. Retrieving details for the color filters:
  (observation: does not seem to be called for live captures?)
  - PacketListModel::ensureRowColorized
    - PacketListRecord::columnString
      - PacketListRecord::dissect
        - epan_dissect_run

2. Retrieving details for the currently selected packet:
  - PacketList::selectionChanged
    - cf_select_packet
      - epan_dissect_run

3. Retrieving the column contents for the packet list:
  - PacketList::drawRow
    - ...
      - PacketListModel::data
        - PacketListRecord::columnString
          - PacketListRecord::dissect
            - epan_dissect_run

So unless the "PINFO_FD_VISITED" ("second pass") branch of your
dissector modifies the shared state, later passes should automatically
pick up the results from the first pass when you change packets.
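
A minimal, self-contained sketch of that pattern follows. It does not
use the Wireshark API: frame_t, conv_state_t and dissect_segment are
hypothetical stand-ins. In a real dissector the check would typically
be PINFO_FD_VISITED(pinfo), and per-frame results are commonly kept
with p_add_proto_data()/p_get_proto_data().

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_FRAMES 16

/* Hypothetical stand-in for Wireshark's per-frame data. */
typedef struct {
    int  num;      /* 1-based frame number */
    bool visited;  /* false during the first pass, true on later passes */
} frame_t;

/* Hypothetical shared state, mutated ONLY during the first pass. */
typedef struct {
    int segments_seen;           /* running count, first pass only */
    int cached[MAX_FRAMES + 1];  /* per-frame result saved in the first pass */
} conv_state_t;

/* Returns the number of segments seen up to and including this frame. */
int dissect_segment(conv_state_t *st, const frame_t *fr)
{
    assert(fr->num >= 1 && fr->num <= MAX_FRAMES);

    if (!fr->visited) {
        /* First pass: update the shared state and remember the result. */
        st->segments_seen++;
        st->cached[fr->num] = st->segments_seen;
    }

    /* Any pass: report only what was recorded for this frame. */
    return st->cached[fr->num];
}
```

Because later passes only read the cached value, selecting frame 1
again after frame 2 has been dissected still yields the frame 1 result.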

Actually... I realize that with a live capture, the following assumption
might not hold:

    All packets have been processed through the first pass before the
    second pass is handled.

or (rephrased):

    The second pass is only called when the first pass has completed.

Clearly that is not true in the live capture case. I observe that for an
offline capture, the first pass finishes completely (in sequence) before
any other pass runs. For a live capture however, dissection alternates
between "first pass" and "second pass". Instead of visiting frames 1, 2,
3, 1, 2, 3, it visits them as 1, 1, 2, 2, 3, 3.
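
The two orderings can be written down as a small stand-alone sketch.
The function names are hypothetical and nothing here calls into
Wireshark; it only reproduces the visit sequences described above,
assuming exactly one later pass per frame.

```c
#include <assert.h>
#include <stddef.h>

/* Offline capture: the whole first pass runs, then a later pass.
 * `out` needs room for 2 * n entries. Returns the number written. */
size_t offline_visit_order(int n, int *out)
{
    size_t k = 0;
    assert(out != NULL && n >= 0);
    for (int f = 1; f <= n; f++)
        out[k++] = f;            /* first pass, in sequence */
    for (int f = 1; f <= n; f++)
        out[k++] = f;            /* later pass over the same frames */
    return k;
}

/* Live capture: each frame gets its first pass and is then dissected
 * again right away, before the next frame has been seen at all. */
size_t live_visit_order(int n, int *out)
{
    size_t k = 0;
    assert(out != NULL && n >= 0);
    for (int f = 1; f <= n; f++) {
        out[k++] = f;            /* first pass for frame f */
        out[k++] = f;            /* later pass for frame f */
    }
    return k;
}
```

In the live ordering, the later pass for frame 1 runs before frames 2
and 3 have had their first pass, so it cannot rely on anything those
frames would have registered.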

This might be affecting you. I do know that this breaks some of my
previous assumptions. Meh.

>    I tried to tackle this by improving my reassembly code but that dissection is just too complex and I failed.
>
>    So I'm looking for a workaround and I realized that if I could delay the 2nd pass on live analysis this could "buy the packets time" to be processed. (Note I have no intention to submit such
>    change to gerrit, just for m)
> 
>    Is this even possible? I am familiar with dissectors code but I don't know where to start looking for the 2nd pass code in WS's repo.
>    Also I'm open for other clever ideas on how to tackle this issue :)

The "second pass" (actually, the n-th pass for any n greater than 1) is
just any dissection with the "visited" flag set to 1. This occurs when a
packet record is processed in epan/epan.c.

Delaying the second pass implies hiding packets from the GUI. In theory
it could be done, but it will probably not be easy, and you would be
trading latency for accuracy. Perhaps the reassembly routines could be
improved to handle this mixed 1-pass/2-pass case.
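
One possible direction, sketched under assumptions about a protocol I
have not seen: store segments by sequence number during the first pass
only, and attach the reassembled payload to whichever frame delivers
the last missing segment. The result then depends neither on arrival
order nor on how the passes interleave. reasm_t and reasm_dissect are
hypothetical names, not Wireshark API.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SEGS 8

/* Hypothetical reassembly state for one message. */
typedef struct {
    int  total;           /* number of segments in the message */
    bool have[MAX_SEGS];  /* have[i] is true once segment i is stored */
    int  received;        /* count of distinct segments stored */
    int  completed_in;    /* frame that completed the set, 0 if none yet */
} reasm_t;

/* Returns true if `frame` is the frame that should show the
 * reassembled payload. Safe to call in any pass, in any order. */
bool reasm_dissect(reasm_t *r, int frame, int seq, bool visited)
{
    assert(frame >= 1);
    assert(r->total <= MAX_SEGS && seq >= 0 && seq < r->total);

    if (!visited && !r->have[seq]) {
        /* First pass only: record the segment, whatever its order. */
        r->have[seq] = true;
        r->received++;
        if (r->received == r->total)
            r->completed_in = frame;
    }

    /* Later passes just compare against the recorded frame number. */
    return r->completed_in == frame;
}
```

With segments 1, 0, 2 arriving in frames 1, 2, 3 and passes interleaved
as in a live capture, only frame 3 reports the reassembled payload, on
its first pass and on every later one.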
-- 
Kind regards,
Peter Wu
https://lekensteyn.nl