Ethereal-dev: Re: [Ethereal-dev] Packets Spanning Packets

Note: This archive is from the project's previous web site, ethereal.com. This list is no longer active.

From: Guy Harris <guy@xxxxxxxxxx>
Date: Thu, 1 Mar 2001 12:30:46 -0800 (PST)
> From: Guy Harris [mailto:guy@xxxxxxxxxx]
> > > Or, more to the point, how difficult would it be to implement?
> > Fairly difficult.  See my recent reply in the thread on Van Jacobson
> > compressed PPP for *some* of the issues that have to be resolved in
> > order to support dissection of packets that span frames.
> 
>   The code in follow_dlg.c and follow.c seem to be fairly promising:

...in the same sense that the code in the UNIX "fmt" utility might be
promising if you were writing an nroff/troff clone - they do some of the
same things, but they don't do *all* that's necessary.

> they
> sort and uniquify the packets in the TCP stream -- which (unless I'm missing
> something here) is exactly what needs to be done to parse higher level
> packets spanning TCP packets. 

If by "exactly" in "exactly what needs to be done" you mean that nothing
*more* is needed, then you're missing a number of things; see the mail
message to which I referred for the various ways in which the UI code,
for example, would have to be changed to cope with packets taking more
than one frame.

Trust me on this one, I've been working on the innards of Ethereal for
years, and thinking about these issues for years (heck, wanting a packet
analyzer that could do that was what got me interested in developing
packet analyzers in the first place) - *it's more work than you might
think*.

>   I haven't done too much digging about, but could the functions in follow.c
> be cleanly modified to create an index of packets in a stream? Then maybe
> something like a tvbuff could be used to walk across them in a manner
> that would shelter the code monkey from knowing about the underlying
> packets...

Composite tvbuffs were implemented precisely to handle that sort of
situation, but you're going to need a heck of a lot more than just "an
index of packets in a stream".
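
To make the idea of a composite buffer concrete, here is a minimal standalone sketch. It is not Ethereal's tvbuff code; the names `segment`, `composite`, `composite_length` and `composite_get_byte` are hypothetical, chosen only for illustration. It shows how a dissector could read bytes at a logical offset without knowing which frame each byte came from.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a composite buffer: an ordered list of
 * byte ranges, each borrowed from a different frame's data. */
struct segment {
    const unsigned char *data;  /* bytes contributed by one frame */
    size_t len;                 /* number of bytes in this segment */
};

struct composite {
    const struct segment *segs; /* segments in stream order */
    size_t nsegs;
};

/* Total length of the composite, summed over its segments. */
static size_t composite_length(const struct composite *c)
{
    size_t total = 0;

    for (size_t i = 0; i < c->nsegs; i++)
        total += c->segs[i].len;
    return total;
}

/* Fetch the byte at a logical offset, walking across segment
 * boundaries; returns -1 if the offset is past the end. */
static int composite_get_byte(const struct composite *c, size_t offset)
{
    for (size_t i = 0; i < c->nsegs; i++) {
        if (offset < c->segs[i].len)
            return c->segs[i].data[offset];
        offset -= c->segs[i].len;
    }
    return -1;
}
```

The caller only ever sees logical offsets, which is the "sheltering" asked about above; everything else discussed in this message (finding the segments in the first place, UI changes, one-pass printing) is outside what such a buffer provides.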

For one thing, as I mentioned in the mail message to which I referred in
both this mail and the previous one, Ethereal does *NOT* always
treat the packets as a sequential stream of packets; its *first* pass
through the frames is sequential, but once that has been done, the user
can select frames in whatever random order they choose, so if there's a
Gnutella packet spread across frames 12 and 13, you'd better be able to
handle the user clicking on frame 13 *without* having clicked on frame
12 before that.

This means you'd have to associate with a frame that's part of a
multi-frame packet enough state information to allow the dissector to
find the rest of the frames that comprise the packet *and* to reassemble
the *parts* of those frames that comprise the packet into a tvbuff to
allow the packet to be dissected.
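
As a rough illustration of the kind of state that would be needed, the sketch below records, for each higher-level packet, which frames it touches and which bytes of each frame belong to it. All of the names (`packet_part`, `spanning_packet`, `find_packet_for_frame`, `spanning_packet_length`) are hypothetical and do not exist in Ethereal; the sketch also assumes, for simplicity, that only the first packet touching a frame is wanted, even though a frame can carry data from more than one packet.

```c
#include <assert.h>
#include <stddef.h>

/* One piece of a higher-level packet: the frame it lives in and the
 * range of that frame's payload that belongs to the packet. */
struct packet_part {
    unsigned frame_num;
    size_t offset;
    size_t length;
};

/* A higher-level packet made up of parts of one or more frames. */
struct spanning_packet {
    const struct packet_part *parts;  /* in stream order */
    size_t nparts;
};

/* Find the first packet that includes data from the given frame,
 * regardless of which of its frames was selected.  Returns NULL if
 * no packet uses that frame. */
static const struct spanning_packet *
find_packet_for_frame(const struct spanning_packet *pkts, size_t npkts,
                      unsigned frame_num)
{
    for (size_t i = 0; i < npkts; i++)
        for (size_t j = 0; j < pkts[i].nparts; j++)
            if (pkts[i].parts[j].frame_num == frame_num)
                return &pkts[i];
    return NULL;
}

/* Number of bytes the reassembled packet would occupy. */
static size_t spanning_packet_length(const struct spanning_packet *p)
{
    size_t total = 0;

    for (size_t j = 0; j < p->nparts; j++)
        total += p->parts[j].length;
    return total;
}
```

With a table like this, a click on frame 13 can lead back to the part stored in frame 12 without frame 12 ever having been selected.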

You'd also have to deal with the *other* issues I mention in the mail
message in question, *and* you'd have to figure out how to make this
work for Tethereal, which makes only one pass through the capture file,
when it's printing out dissections of the frames - that might involve
deferring the dissection of frames, and printing of that dissection,
until, for a given frame, all the frames that comprise all the packets
that contain data from that frame have been read.

I.e., instead of the Tethereal main loop being something that amounts to

	for (;;) {
		read a frame;
		dissect a frame;
		print the dissection;
	}

when it's dissecting and printing, it might be more like

	for (;;) {
		read a frame;
		process the frame;
	}

where "process the frame" is:

	hand the frame to the top-level dissector;
	for (all frames prior to this one the printing of which we've
	    deferred) {
		print the dissection of the frame;
		free up any saved data we no longer need;
	}
	if (we don't have to defer the full dissection of this frame)
		print its dissection;
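
The deferral logic above might be sketched in C roughly as follows. This is only an illustration under stated assumptions: the names `frame_info`, `deferral_queue` and `process_frame` are hypothetical, each frame is assumed to know the number of the last frame it depends on (in practice the dissectors would have to work that out), and output is kept in frame order, so a self-contained frame also waits while an earlier frame is still deferred.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEFERRED 64

/* Hypothetical per-frame record: last_needed is the number of the last
 * frame that must be read before this frame can be fully dissected
 * (equal to frame_num when the frame is self-contained). */
struct frame_info {
    unsigned frame_num;
    unsigned last_needed;
};

/* Frames read but not yet printed, oldest first. */
struct deferral_queue {
    struct frame_info pending[MAX_DEFERRED];
    size_t count;
};

/* Process one newly read frame.  Stores in out[] the numbers of the
 * frames that can now be printed, in frame order, and returns how
 * many there are.  Returns -1 if the queue is full; a real
 * implementation would grow the queue instead. */
static int
process_frame(struct deferral_queue *q, struct frame_info f,
              unsigned *out, size_t out_cap)
{
    size_t nout = 0, used = 0;

    if (q->count >= MAX_DEFERRED)
        return -1;
    q->pending[q->count++] = f;

    /* Print from the head of the queue for as long as every frame
     * the head depends on has been read. */
    while (used < q->count && nout < out_cap &&
           q->pending[used].last_needed <= f.frame_num)
        out[nout++] = q->pending[used++].frame_num;

    /* Drop the printed entries, keeping the rest in order. */
    for (size_t i = used; i < q->count; i++)
        q->pending[i - used] = q->pending[i];
    q->count -= used;
    return (int)nout;
}
```

For example, if frame 12 depends on frame 13, reading frame 12 prints nothing, and reading frame 13 then releases both frames in order.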

*Then* you'd have to decide how to handle

	1) an Ethereal "update list of packets in real time" capture

and

	2) a Tethereal "do a live capture and print the packets as
	   they're captured rather than saving them to a file" capture.

Should we do the deferral in question, or should we run in a special
"don't bother reassembling, just dissect what we can get from a single
frame" operation?  If we do the latter, we *will* have to do the
reassembly eventually, in an Ethereal "Update list of packets in real
time" capture, when the capture is stopped.

> (I'm going to have to install a real OS soon if this seems easily doable =)

It's not, as noted above (and in various other mail messages to the
Ethereal lists over the past few years, discussing this topic), so
there may be no hurry to install a real OS.