Wireshark · Wireshark-dev: [Wireshark-dev] On building better statistics

Wireshark-dev: [Wireshark-dev] On building better statistics

From: Jaap Keuter <jaap.keuter@xxxxxxxxx>

Date: Mon, 14 Feb 2022 00:18:25 +0100

This discussion was brought on by issue 17877 titled “Non-visible photo items cannot change length after construction”. In there it is correctly stated that calls to set proto_item length (either proto_item_set_len() or proto_item_set_end()) are not effective when proto_items are not visible. When looking at these functions it can be seen that these are part of the group of functions which are optimised for faking proto_items when not visible.

Without going into the faking details (see the macro TRY_TO_FAKE_THIS_ITEM_OR_FREE in epan/proto.c for that) it comes down to just short cutting the proto_item creation process by returning the tree intended to attach the newly to be constructed proto_item to. In effect these all return the root node of the dissection tree.

A special case exception is put in place for proto_items of type FT_PROTOCOL. EPAN can be setup so that proto_items of this type in fact are allowed to be created, even if they otherwise would have been faked. This feature is used for protocol level statistics. The protocol level statistics ’tally the numbers’ by running the dissection, with a hidden tree, but with the special case exception set. This results in a very compact dissection tree, consisting of the root node and the proto_items of type FT_PROTOCOL. The length of these items is then used to determine the numbers to add to the various protocol layers in the statistics tree.

This is where the ambiguity comes in. Some protocols claim just there share of the octets in the frame (discussing Ethernet packet dissection here, to keep it simple). Others create their proto_item until the end of the TVB handed to them (usually to the end of the frame), and adjust the length after dissection of their fields have taken place and the variable number of bytes in the protocol layer has been determined. However, the functions to set these lengths don’t work when faking items is in effect.

As a result these protocols take up way more of the frame in the statistics than they in fact do. Overall more the 100% of the frame is allotted to the protocols contained in them. The User's Guide goes into this fact with the explanation that these protocol 'contain’ their payload, so that is why the payload is added to the protocol. That is one interpretation, but not really consistent because fixed size dissectors, which create their proto_items of type FT_PROTOCOL with fixed size, do not exhibit this behaviour.

The simplest step to take would be to allow the functions proto_item_set_len() and proto_item_set_end() to operate on proto_items of type FT_PROTOCOL if the afore mentioned special case exception was in effect. However, since faking of other types of proto_items is still in effect, all these other proto_items are now using the proto_items of type FT_PROTOCOL as proto_item, rather than the root node. This means that code setting the length of a field, is now also no longer blocked, and in fact setting the length of the proto_item of type FT_PROTOCOL rather than his own (which is faked).

A simple experiment with the file (code_mac_tagged.pcap) attached to issue 17877 makes this clear. Changing proto_items_set_len() to allow proto_items with type FT_PROTOCOL to set their length if the special case exception is set, shows a protocol hierarchy statistics page with all protocols matching their length in the dissection details pane. Except for COSE, the dissection details say 309 while the statistics say 246. This last length is the length set for the final part of the PDU, which ends up becoming the length of the protocol in the protocol hierarchy statistics.

The question now becomes how to proceed with this. Faking proto_items makes legitimate calls to set the length of proto_items of type FT_PROTOCOL indistinguishable from those calls meant for fields within those protocols. Another approach of faking proto_items by always returning the root node instead of the tree creates it’s own set of side effects. Now all ’non-protocol’ proto_items are processed for statistics too, and exposing things like expert info in the statistics also. This defeats the purpose of faking proto_items in the first place.

A ‘quick and dirty’ solution is to run dissection with a visible tree. This gives the desired results, but at the cost of doing a lot more dissection work than strictly necessary, voiding the whole purpose of the special case exception in not faking proto_items with type FT_PROTOCOL.

I’m not seeing a simple way out of this. Do we want to modify the statistics in this way is the fist question to answer. They will be different than in 3.6, 3.4 and earlier. This could be the right time to do it though. Then the question becomes how to achieve this without taking a significant performance hit by sidestepping the faking of proto_items.

Thanks,
Jaap

(not rereading this draft, it’s way too late for that. Sorry for any mistyping or confusing statements)

Follow-Ups:
- Re: [Wireshark-dev] On building better statistics
  - From: João Valverde

Prev by Date: Re: [Wireshark-dev] MSVC gives warnings "qt_ui_utils.cpp(208, 25): warning C4996: 'QProcess::startDetached'"
Next by Date: Re: [Wireshark-dev] On building better statistics
Previous by thread: Re: [Wireshark-dev] MSVC gives warnings "qt_ui_utils.cpp(208, 25): warning C4996: 'QProcess::startDetached'"
Next by thread: Re: [Wireshark-dev] On building better statistics
Index(es):
- Date
- Thread