Wireshark-dev: Re: [Wireshark-dev] Splitting packet_info struct for performance reasons?

From: Guy Harris <guy@xxxxxxxxxxxx>
Date: Sun, 1 Mar 2009 15:31:49 -0800

On Mar 1, 2009, at 12:45 PM, Kaul wrote:

I was astounded with the huge size of packet_info structure. I believe in 99% of the cases, there is no need for many of the fields within the structure. Wouldn't it make sense, for performance reasons, to leave the most usable ones within it, and create a pointer to an extra structure with the other, less commonly used stuff. I'm pretty sure it'll save memory and probably improve performance, but would require some changes and complicate a bit the programming. Thoughts?

packet_info has become the grab bag of packet information in Wireshark (heck, that dates back to when it was still called Ethereal).

There are several issues here:

1) we do most columns in the packet list by saving the binary information and formatting it only when we add the packet to the list (to avoid formatting strings in one dissector only to replace that string in a higher-layer dissector - that actually did save some CPU time);

2) we use packet_info to handle "dissect as", so that the GUI can determine to which flow (at a given protocol layer) the packet belongs;

3) we might also use this as a way to pass information between dissectors, e.g. addresses and ports.

The column handling needs some rethinking, given that:

	1) crap piles up in packet_info to implement this;

2) in order to add support for a new column, code has to be changed in a number of places.

"Dissect as" could also use some rethinking, to generalize it. In most forms of "decode as", the user wants to decode a particular flow, at a given protocol layer, as a particular protocol above that layer, e.g.:

decode a particular TCP connection (pair of address/port endpoints) or UDP flow (pair of address/port endpoints, although perhaps with wildcards and/or adding extra packets, e.g. the first packet of a TFTP session);

decode a particular DCE RPC session (session atop which DCE RPC runs, plus DCE RPC context ID);

and there are probably other potential examples (ATM virtual circuit, L2TPv3 session, etc.). This might tie in with some changes to the conversation mechanism I suggested in

	http://www.wireshark.org/lists/wireshark-dev/200902/msg00315.html

Passing information between protocols could also use some rethinking - currently, we have pinfo->private_data, but in some cases dissectors can step on each other's toes; perhaps what we need is a stack of private information, with the items tagged by the protocol that put them there or the protocol for which they're intended.

Semi-random examples:
guint16 src_idx; /* Source port index (Cisco MDS- specific) */ guint16 dst_idx; /* Dest port index (Cisco MDS- specific) */
  guint16 vsan;                 /* Fibre channel/Cisco MDS-specific */

(how many of us need the above?)

Those are there for columns.

  guint16 link_number;

I'm not sure what that's there for. It stores an MTP2 link number, but the only code that sets it is in dissect_frame() and the only code that *uses* it is in dissect_frame(); that code sets it in the frame information part of the protocol tree - it arguably belongs instead in the MTP2 part instead - rather than in a column. I guess it's a little more complicated if MTP2 is encapsulated in another protocol and that protocol provides the link number; that starts looking like passing information between dissectors.

  guint8  annex_a_used;

Similar to link_number, although it is used in the MTP2 dissector - which should probably just dig it out of the pseudo-header or, again, if there's encapsulation in another protocol, from a passing-between- dissectors mechanism.

guint16 profinet_type; /* the type of PROFINET packet (0: not a PROFINET packet) */

Looks as if it's used to pass data between dissectors.

(It it even 16bit aligned?!)

Yes. By default, C compilers align structure members on the appropriate boundary, with padding (otherwise they either have to generate less-efficient code to access unaligned members or generate code that the processor might handle less efficiently); it's only with compiler-specific pragmas or attributes or... that you get "packed" structures with unaligned members.