Wireshark-dev: Re: [Wireshark-dev] Copying TVBs for Reassembly [Was: Filebacked-tvbuffs : GSoC'

From: Anders Broman <a.broman@xxxxxxxxxxxx>
Date: Thu, 18 Apr 2013 22:59:03 +0200
Evan Huus wrote on 2013-04-18 22:40:
On Thu, Apr 18, 2013 at 3:56 PM, Jeff Morriss <jeff.morriss.ws@xxxxxxxxx> wrote:
On 04/18/13 15:14, Evan Huus wrote:
This is a tangential issue that has always confused me.

Why do we malloc+memcpy data for reassembly when we already have
'virtual' composite TVBs?

Wouldn't it be more efficient (in time and memory) to create a
composite TVB for each reassembly and then build the reassembled
packet in it? You would never have to copy or allocate any actual
packet data...
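
To make the contrast in the question concrete, here is a rough sketch of the two strategies. It is Python rather than Wireshark's C and is purely illustrative: `reassemble_by_copy` and `CompositeView` are hypothetical names, not Wireshark APIs. The first copies every fragment into one new buffer; the second keeps only references to the fragments and resolves offsets on access.

```python
from bisect import bisect_right
from itertools import accumulate


def reassemble_by_copy(fragments):
    """Copy-based reassembly: allocate one buffer and copy every fragment in."""
    return b"".join(fragments)


class CompositeView:
    """Zero-copy view: keeps references to the fragments, copies nothing."""

    def __init__(self, fragments):
        self._fragments = [memoryview(f) for f in fragments]
        # ends[i] is the virtual offset just past fragment i.
        self._ends = list(accumulate(len(f) for f in self._fragments))

    def __len__(self):
        return self._ends[-1] if self._ends else 0

    def get_uint8(self, offset):
        if not 0 <= offset < len(self):
            raise IndexError(offset)
        i = bisect_right(self._ends, offset)   # fragment holding this offset
        start = self._ends[i - 1] if i else 0  # virtual offset where it begins
        return self._fragments[i][offset - start]


# Hypothetical fragments standing in for the payloads of two packets.
fragments = [b"abc", b"de"]
flat = reassemble_by_copy(fragments)
view = CompositeView(fragments)
```

Both give the same bytes back; the view simply trades a copy at reassembly time for an offset lookup on every access, and it is only valid while the fragment buffers stay alive.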

There are a couple of problems with doing that (that I recall):

1) Composite TVBs don't actually work (or didn't work until very recently?).

2) The data behind a TVB goes away as soon as we're done dissecting (and
displaying) the packet.  That is, the TVB data is overwritten (IIRC) when
the next packet is read.
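
Point 2 can be illustrated with a small sketch, assuming a reader that reuses a single buffer for every packet (which is how the behaviour is described above, not a statement about the actual wiretap code). `ReusingReader` is a hypothetical name. A zero-copy reference into the buffer silently changes when the next packet is read, while a copy does not.

```python
class ReusingReader:
    """Hypothetical reader that reuses one buffer for every packet it reads."""

    def __init__(self, packets):
        self._packets = list(packets)
        longest = max((len(p) for p in self._packets), default=0)
        self._buf = bytearray(longest)

    def read(self, index):
        data = self._packets[index]
        self._buf[:len(data)] = data            # overwrites the previous packet
        return memoryview(self._buf)[:len(data)]


reader = ReusingReader([b"AAAA", b"BBBB"])
view = reader.read(0)    # zero-copy reference into the shared buffer
copy = bytes(view)       # roughly what malloc+memcpy amounts to
reader.read(1)           # reading the next packet clobbers the buffer
```

After the second read, `view` shows the second packet's bytes while `copy` still holds the first packet's, which is why a reference-only composite needs some way to get the original data back.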

I suppose there was never any real reason to try to make reassembly work
with composite TVBs: if they're just more malloc()'d memory then why mess
with it rather than allocate our own copy of the data?  (Well, OK, it would
save a data copy, but...)

OK, so then the optimal case would be a tvb implementation that stored
only frame_data pointers, offsets and lengths... similar but not
identical to the current composite implementation.
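
A minimal sketch of what such a descriptor-only structure might look like, again in Python for brevity. `Segment` and `locate` are hypothetical names, and a plain frame number stands in for a frame_data pointer. No packet bytes are stored; a requested range is translated into pieces that say which frame to read and where.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    frame_num: int   # which captured frame holds the bytes
    offset: int      # where the fragment payload starts inside that frame
    length: int      # how many payload bytes the fragment contributes


def locate(segments, start, length):
    """Map a virtual range to a list of (frame_num, frame_offset, count)."""
    pieces = []
    base = 0                       # virtual offset of the current segment
    end = start + length
    for seg in segments:
        lo = max(start, base)
        hi = min(end, base + seg.length)
        if lo < hi:                # the requested range overlaps this segment
            pieces.append((seg.frame_num, seg.offset + (lo - base), hi - lo))
        base += seg.length
    if start < 0 or length < 0 or end > base:
        raise IndexError("range outside reassembled data")
    return pieces


# Two hypothetical fragments: 100 bytes from frame 10, 50 bytes from frame 12,
# each starting 54 bytes into its frame.
segments = [Segment(10, 54, 100), Segment(12, 54, 50)]
pieces = locate(segments, 90, 20)   # a read that straddles both fragments
```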

The reassembly code could then add meta-data to this when
reassembling, and the tvb could lazily refetch the underlying tvbs
using the existing wiretap interface? If we added some sort of caching
mechanism so that repeated accesses didn't keep forcing reads of the
original file, then I expect this would be very fast:

- adding fragments to reassembly would be near-instantaneous (just a
few pointer updates)
- reassembled tvbs would take minimal memory except when accessed
(using tvb_get_* or proto_tree_add_*)
- accessing a reassembled tvb would just be an offset calculation and
then a wtap read to bring into memory the underlying real packet(s)
containing the data being requested (assuming they aren't already
cached)
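
The three points above could be sketched roughly as follows. This is an illustration under assumptions, not a proposal for the real API: `LazyReassembled` is a hypothetical name, `read_frame` is a stand-in for whatever wiretap read would be used, and the cache is a simple least-recently-used one with an arbitrary size.

```python
from collections import OrderedDict


class LazyReassembled:
    """Reassembled buffer that holds only fragment descriptors until read."""

    def __init__(self, read_frame, cache_size=4):
        self._read_frame = read_frame    # hypothetical stand-in for a wtap read
        self._segments = []              # (frame_num, offset, length) tuples
        self._cache = OrderedDict()      # frame_num -> bytes, oldest first
        self._cache_size = cache_size

    def add_fragment(self, frame_num, offset, length):
        # Records metadata only; no packet data is copied or even read.
        self._segments.append((frame_num, offset, length))

    def _frame(self, frame_num):
        data = self._cache.get(frame_num)
        if data is None:
            data = self._read_frame(frame_num)       # cache miss: go to the file
            self._cache[frame_num] = data
            if len(self._cache) > self._cache_size:
                self._cache.popitem(last=False)      # evict least recently used
        else:
            self._cache.move_to_end(frame_num)       # mark as recently used
        return data

    def get_bytes(self, start, length):
        """Return `length` bytes at virtual offset `start`, reading on demand."""
        out = bytearray()
        base, end = 0, start + length
        for frame_num, offset, seg_len in self._segments:
            lo, hi = max(start, base), min(end, base + seg_len)
            if lo < hi:
                data = self._frame(frame_num)
                out += data[offset + lo - base:offset + hi - base]
            base += seg_len
        if start < 0 or length < 0 or end > base:
            raise IndexError("range outside reassembled data")
        return bytes(out)


# Demo: two fake frames, each with a 3-byte header before the payload.
reads = []
frames = {1: b"hdrHELLO", 2: b"hdrWORLD"}


def read_frame(frame_num):
    reads.append(frame_num)          # record every trip to the "file"
    return frames[frame_num]


buf = LazyReassembled(read_frame)
buf.add_fragment(1, 3, 5)
buf.add_fragment(2, 3, 5)
reads_after_add = list(reads)        # nothing has been read yet
middle = buf.get_bytes(3, 4)         # straddles both fragments
whole = buf.get_bytes(0, 10)         # served from the cache
```

In the demo, adding fragments triggers no reads, the first access reads each frame once, and the second access is served entirely from the cache.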

Thoughts?

If, on top of that, small enough files could be mmap()ed, it would be even faster.
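
As a rough illustration of the mmap idea, here is a Python sketch using the standard `mmap` module. `read_range_mmap` is a hypothetical helper and the temporary file merely stands in for a capture file.

```python
import mmap
import os
import tempfile


def read_range_mmap(path, offset, length):
    """Return `length` bytes at `offset` by mapping the file read-only."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m[offset:offset + length]   # slicing copies just this range


# Demo with a throwaway file standing in for a capture file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"0123456789")
try:
    chunk = read_range_mmap(tmp.name, 4, 3)
finally:
    os.unlink(tmp.name)
```

Mapping the file on every call, as done here for simplicity, would waste most of the benefit; a real design would presumably keep one mapping open for the lifetime of the capture file.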