Wireshark-dev: Re: [Wireshark-dev] Detecting Protocol Headers

From: Guy Harris <guy@xxxxxxxxxxxx>
Date: Mon, 9 Mar 2009 18:58:26 -0700

On Mar 9, 2009, at 6:09 PM, Rayne wrote:

I took a look at packet-udp.c and packet-ip.c, and am wondering where I can find the definitions of the following functions:

call_dissector()
dissector_add()
dissector_try_heuristic()
dissector_try_port()
register_dissector_table()
register_heur_dissector_list()

epan/packet.c

and the following structures:
dissector_table_t
heur_dissector_list_t
dissector_handle_t

epan/packet.c and epan/packet.h

Also, where are the UDP ports and list of heuristic dissectors tried by the UDP dissector defined?

The tables of ports and of heuristic dissectors are defined in the UDP dissector.

Those tables are filled in by other dissectors.

From what I can understand from packet-udp.c, the structures udp_dissector_table and heur_subdissector_list are first defined and registered in the file packet-udp.c itself.

Yes.

So how would the UDP dissector know which sub-dissector and UDP ports to try next in order to call the next dissector?

It looks in those tables.

Registering dissector modules is a two-phase process.

In the first phase, the proto_register_ routines for all dissectors are called. They register the protocol, the fields for the protocol, and other things, including any dissector tables or heuristic dissector lists for that protocol.

In the second phase, the proto_reg_handoff_ routines for all dissectors are called. They register the dissectors in the dissector tables and heuristic dissector lists created in the first phase.
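The two phases can be sketched with a toy model. This is not Wireshark code: every type and function below (toy_table_t, toy_proto_register_udp(), toy_proto_reg_handoff_foo(), and so on) is a hypothetical stand-in, and port 4242 and protocol "foo" are made up. The real routines are the ones listed above, such as register_dissector_table() and dissector_add() in epan/packet.c. The sketch only shows why the ordering matters: the table has to exist before anybody can add an entry to it.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: every name below is a hypothetical stand-in,
 * not a declaration from epan/packet.h. */
typedef const char *(*toy_dissector_t)(void);

#define TOY_MAX_ENTRIES 8

typedef struct {
    const char     *name;                       /* table name, e.g. "udp.port" */
    unsigned        keys[TOY_MAX_ENTRIES];
    toy_dissector_t handlers[TOY_MAX_ENTRIES];
    size_t          count;
} toy_table_t;

static toy_table_t toy_udp_table;    /* owned by the toy "UDP dissector" */
static int         toy_udp_table_ready;

/* Phase 1: the protocol that owns the table creates it, still empty. */
void toy_proto_register_udp(void)
{
    toy_udp_table.name  = "udp.port";
    toy_udp_table.count = 0;
    toy_udp_table_ready = 1;
}

/* Dissector for a hypothetical protocol "foo" carried over UDP. */
const char *toy_dissect_foo(void)
{
    return "foo";
}

/* Phase 2: "foo" adds itself to the table created in phase 1. */
void toy_proto_reg_handoff_foo(void)
{
    assert(toy_udp_table_ready);                 /* phase 1 must have run */
    assert(toy_udp_table.count < TOY_MAX_ENTRIES);
    toy_udp_table.keys[toy_udp_table.count]     = 4242;  /* made-up port */
    toy_udp_table.handlers[toy_udp_table.count] = toy_dissect_foo;
    toy_udp_table.count++;
}

/* Startup: every phase-1 routine runs before any phase-2 routine. */
void toy_register_all(void)
{
    toy_proto_register_udp();
    toy_proto_reg_handoff_foo();
}

/* What the toy UDP dissector does at dissection time: look in its table. */
toy_dissector_t toy_udp_lookup(unsigned port)
{
    size_t i;

    for (i = 0; i < toy_udp_table.count; i++) {
        if (toy_udp_table.keys[i] == port)
            return toy_udp_table.handlers[i];
    }
    return NULL;
}
```

After toy_register_all() has run, toy_udp_lookup(4242) returns toy_dissect_foo, and a port nobody registered returns NULL. Note that the toy UDP code never mentions "foo"; it only looks in the table.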

Also, are the dissectors in the heuristics list determined by statistics?

No.

For example, if, say, Protocol A follows Protocol B 80% of the time in observed traffic,

Which traffic?

At one site, protocol A might follow protocol B 80% of the time. At another site, it might follow it 0% of the time, because, at that site, protocol A might not be used at all.

then Protocol A is included in the list of heuristic dissectors to try by Protocol B?

If

	1) there's a dissector for protocol A;

	2) protocol A can follow (be encapsulated in) protocol B;

	3) you can't tell by looking at some "next protocol" field in protocol B whether it's followed by protocol A or not, so you have to guess by looking at the payload of protocol B;

	4) protocol B supports heuristic dissectors for protocols that follow it;

	5) whoever wrote the dissector for protocol B knew all that and knew that they should therefore make the dissector for protocol A a heuristic dissector for protocol B;

then the proto_reg_handoff_ routine for the dissector module for protocol A would register the dissector for protocol A in the list of heuristic dissectors for protocol B.
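As a rough illustration of what "guess by looking at the payload" means, here is a toy model of a heuristic dissector list. All of the names (heur_list_t, heur_list_add(), heur_list_try(), heur_dissect_a()) are hypothetical, and the 'P','A' magic bytes for protocol A are invented for the example; this is not the Wireshark API, just a sketch of the idea that each heuristic dissector inspects the payload and says whether it recognizes it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not the Wireshark API.
 * A heuristic dissector returns true if it accepts the payload. */
typedef bool (*heur_fn_t)(const unsigned char *payload, size_t len);

#define HEUR_MAX 8

typedef struct {
    heur_fn_t entries[HEUR_MAX];
    size_t    count;
} heur_list_t;

/* Protocol B's registration routine would create the (empty) list. */
void heur_list_init(heur_list_t *list)
{
    assert(list != NULL);
    list->count = 0;
}

/* Protocol A's handoff routine would add its heuristic dissector. */
bool heur_list_add(heur_list_t *list, heur_fn_t fn)
{
    assert(list != NULL && fn != NULL);
    if (list->count == HEUR_MAX)
        return false;
    list->entries[list->count++] = fn;
    return true;
}

/* Hypothetical protocol A: assume its payload starts with 'P','A'. */
bool heur_dissect_a(const unsigned char *payload, size_t len)
{
    return len >= 2 && payload[0] == 'P' && payload[1] == 'A';
}

/* Protocol B's side: offer the payload to each entry until one accepts. */
bool heur_list_try(const heur_list_t *list,
                   const unsigned char *payload, size_t len)
{
    size_t i;

    assert(list != NULL);
    for (i = 0; i < list->count; i++) {
        if (list->entries[i](payload, len))
            return true;
    }
    return false;
}
```

Nothing statistical is involved: a dissector is in the list because it registered itself there, and it claims a packet only if its own check on the payload succeeds.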

And am I right to say that the protocol tree is built before the first packet is captured,

No, because there's no such thing as "*the* protocol tree" in general; a packet has *a* protocol tree that shows the dissection of all the protocols in that packet, so one can speak only of "the protocol tree" for a given packet, which obviously can't be created until that packet has been read. (Note that the packet might be captured minutes, or hours, or days, or weeks, or months, or years... before the packet is read; it might have been written to a capture file when it was captured, and Wireshark or TShark might be reading the file much later.)

Where can I find an example where dissect-protocol() is called?

What do you mean by "dissect-protocol()"?

I also noticed that in packet-ip.c, the function dissector_try_port() is called. However, it appears that the "port" used here is the protocol field.

Correct. The name dissector_try_port() is historical; it should really be called dissector_try_uint(), or something like that, as its argument is an unsigned integer value. There's also a dissector_try_string() routine, for dissector tables where the key is a string rather than an unsigned integer.
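The difference between the two kinds of table is only the key type. A minimal sketch, with hypothetical names (key_try_uint(), key_try_string(), and both tables are illustrative, not Wireshark's); the integer table uses IP protocol numbers as keys, and the string keys are invented:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: hypothetical names, not the Wireshark API. */
typedef struct { unsigned    key; const char *dissector; } key_uint_entry_t;
typedef struct { const char *key; const char *dissector; } key_string_entry_t;

/* Integer-keyed table: IP protocol numbers (6 is TCP, 17 is UDP). */
static const key_uint_entry_t key_ip_proto[] = {
    { 6,  "tcp" },
    { 17, "udp" },
};

/* String-keyed table; this key is made up for illustration. */
static const key_string_entry_t key_by_name[] = {
    { "example/type-a", "type_a" },
};

/* Look up by unsigned integer; NULL means no dissector registered. */
const char *key_try_uint(unsigned value)
{
    size_t i;

    for (i = 0; i < sizeof key_ip_proto / sizeof key_ip_proto[0]; i++) {
        if (key_ip_proto[i].key == value)
            return key_ip_proto[i].dissector;
    }
    return NULL;
}

/* Same lookup, but the key is compared as a string. */
const char *key_try_string(const char *value)
{
    size_t i;

    assert(value != NULL);
    for (i = 0; i < sizeof key_by_name / sizeof key_by_name[0]; i++) {
        if (strcmp(key_by_name[i].key, value) == 0)
            return key_by_name[i].dissector;
    }
    return NULL;
}
```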

Without seeing the definition for dissector_try_port(), I'm guessing that the second argument of this function is the search criteria,

Correct.

and for UDP (and presumably TCP), it's the source/destination ports,

Yes - in one call to dissector_try_port() in the UDP and TCP dissectors, it's the lower-valued of the source and destination port numbers, and in the other call to dissector_try_port(), it's the higher-valued of the source and destination port numbers.

whereas for IP, it's the protocol field. Is this correct?

Yes (and in the Ethernet dissector, it's the Ethernet type field, for example).
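To make the two-lookup behavior for UDP and TCP concrete, here is a toy sketch. The names (port_entry_t, port_table_lookup(), port_find_dissector()) are hypothetical and the one-entry table is invented; trying the lower port before the higher one is an assumption made for the sketch, since the description above says only that there is one call with each.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not the Wireshark API. */
typedef const char *(*port_dissector_t)(void);

typedef struct {
    unsigned         port;
    port_dissector_t handler;
} port_entry_t;

const char *port_dissect_dns(void)
{
    return "dns";
}

/* Stand-in for the UDP port table; 53 is the well-known DNS port. */
static const port_entry_t port_udp_table[] = {
    { 53, port_dissect_dns },
};

/* One table lookup, keyed by a single port number. */
port_dissector_t port_table_lookup(unsigned port)
{
    size_t i;

    for (i = 0; i < sizeof port_udp_table / sizeof port_udp_table[0]; i++) {
        if (port_udp_table[i].port == port)
            return port_udp_table[i].handler;
    }
    return NULL;
}

/* Two lookups: the lower-valued port, then (assumed order) the higher. */
port_dissector_t port_find_dissector(unsigned sport, unsigned dport)
{
    unsigned low  = sport < dport ? sport : dport;
    unsigned high = sport < dport ? dport : sport;
    port_dissector_t d;

    assert(sport <= 0xFFFF && dport <= 0xFFFF);  /* ports are 16-bit */
    d = port_table_lookup(low);
    if (d == NULL && high != low)
        d = port_table_lookup(high);
    return d;
}
```

Because both ports are tried, a DNS query (destination port 53) and its reply (source port 53) both find the same entry; for IP or Ethernet there is only one key (the protocol field or the Ethernet type), so one lookup suffices.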