Ethereal-dev: Re: [Ethereal-dev] Performance. Ethereal is slow when using large captures.

Note: This archive is from the project's previous web site, ethereal.com. This list is no longer active.

From: Guy Harris <guy@xxxxxxxxxxxx>
Date: Sat, 15 Nov 2003 23:49:50 -0800
On Sun, Nov 16, 2003 at 02:50:26AM +0100, Biot Olivier wrote:
> Just one perhaps stoooopid question, but are we already generating the UI
> entries for *all* packets while only showing a small portion of it?

Yes, we are generating the text for every column of every row, because
that's the way the GtkClist works. 

*IF* we had

	1) fast random access to arbitrary packets in compressed files;

	2) sufficiently fast dissection;

	3) a list display widget that, instead of having to be
	   pre-loaded with columns, called a callback routine to get
	   column text;

we wouldn't have to do that.

1) requires a different way of handling gzipped files, in which we
save the contents of the Lempel-Ziv string dictionary at "checkpoints"
in the stream - we could then go to an arbitrary offset in the file by
going to the checkpoint just before the offset and scanning forward
through the file from that checkpoint (and, as we'd decompress the data
into a buffer, as long as we're between given checkpoints, we wouldn't
have to do any decompression to move within the file).
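
A rough sketch of the checkpoint idea in 1), in Python rather than C for brevity. It relies on zlib's ability to copy a decompressor's state, which includes the sliding-window dictionary. The names build_checkpoints and read_at and the 4096-byte checkpoint spacing are assumptions made for illustration, not anything in Ethereal's file-reading code:

```python
import bisect
import zlib

CHUNK = 4096  # compressed bytes between checkpoints (arbitrary choice)

def build_checkpoints(compressed, chunk=CHUNK):
    """One sequential pass; snapshot the inflate state every `chunk` bytes."""
    d = zlib.decompressobj(31)  # wbits=31 selects the gzip container
    checkpoints = []            # (uncompressed offset, compressed offset, state)
    out_pos = 0
    for in_pos in range(0, len(compressed), chunk):
        checkpoints.append((out_pos, in_pos, d.copy()))
        out_pos += len(d.decompress(compressed[in_pos:in_pos + chunk]))
    return checkpoints

def read_at(compressed, checkpoints, offset, length, chunk=CHUNK):
    """Return `length` uncompressed bytes starting at uncompressed `offset`."""
    starts = [c[0] for c in checkpoints]
    # Last checkpoint at or before the requested offset.
    i = bisect.bisect_right(starts, offset) - 1
    out_pos, in_pos, saved = checkpoints[i]
    d = saved.copy()            # keep the stored checkpoint reusable
    buf = bytearray()
    # Scan forward from the checkpoint until the wanted range is covered.
    while in_pos < len(compressed) and out_pos + len(buf) < offset + length:
        buf += d.decompress(compressed[in_pos:in_pos + chunk])
        in_pos += chunk
    skip = offset - out_pos
    return bytes(buf[skip:skip + length])
```

Each saved state carries its own copy of the dictionary (up to 32 KB for deflate), so the checkpoint spacing trades memory against how far a seek has to scan forward.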

2) would probably require a different data structure for the protocol
tree, with pointers to the first and the last entry at a given level of
the tree (one NIS capture I've seen is *very* large and takes *seconds*
to dissect, probably because it's scanning through a very long list
looking for the end).
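
A minimal sketch of the kind of structure 2) describes, assuming the slowdown really is the walk to the end of a sibling list. TreeNode and its fields are hypothetical names, not Ethereal's proto_tree types; the point is only that keeping a pointer to the last child makes an append constant-time:

```python
class TreeNode:
    """Tree node whose children form a singly linked list."""

    def __init__(self, label):
        self.label = label
        self.first_child = None  # head of the child list
        self.last_child = None   # tail of the child list, kept for O(1) append
        self.next = None         # next sibling at the same level

    def add_child(self, label):
        """Append a child without walking the existing siblings."""
        node = TreeNode(label)
        if self.last_child is None:
            self.first_child = node
        else:
            self.last_child.next = node
        self.last_child = node
        return node

    def children(self):
        """Iterate over the children in insertion order."""
        node = self.first_child
        while node is not None:
            yield node
            node = node.next
```

Without last_child, each append would have to walk all existing siblings, so adding n items at one level would cost on the order of n*n steps.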

3) doesn't exist in GTK+ 1.2[.x], but I have most of such a widget
implemented, based on GtkClist.  It does exist in GTK+ 2.x.
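
To illustrate what 3) means, here is a toolkit-independent sketch of a list that asks a callback for cell text only when a row is actually drawn. VirtualList, render and get_text are made-up names for this example; this is not the GTK+ API nor the widget mentioned above:

```python
class VirtualList:
    """List model that stores no cell text; it is fetched on demand."""

    def __init__(self, row_count, column_count, get_text):
        self.row_count = row_count
        self.column_count = column_count
        self.get_text = get_text  # callback: (row, column) -> str

    def render(self, first_row, visible_rows):
        """Return the text for the rows currently on screen, and only those."""
        last_row = min(first_row + visible_rows, self.row_count)
        return [
            [self.get_text(row, col) for col in range(self.column_count)]
            for row in range(first_row, last_row)
        ]


calls = []

def packet_text(row, col):
    # Stand-in for "seek to the packet, dissect it, format the column".
    calls.append((row, col))
    return "pkt %d col %d" % (row, col)

packet_list = VirtualList(1000000, 3, packet_text)
screen = packet_list.render(500, 20)
```

After the render call, calls holds 60 entries (20 visible rows times 3 columns), no matter how many packets the capture contains.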

This would also mean that changing the format of the packet time stamp
column would take time proportional to the number of visible packet
summary rows, rather than proportional to the number of displayed
packets.

In addition, it would mean that the column strings for rows wouldn't be
stored in memory, which could significantly reduce the memory
requirements for a capture.

It *would* mean that sorting on the Protocol or Info column might be
significantly slower (probably prohibitively so for large captures), but
I suspect the number of people who would be hurt by that would be
significantly lower than the number of people who would be helped by
speeding up other operations such as reading files in.  (If people
*REALLY* care about making that fast, we could conceivably, when you
tried to sort on that column, make a pass through the capture generating
that data, saving the column data, and sorting on that - but that would
run the risk of running out of memory when trying to generate the column
data.)
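
The sort-on-demand fallback could look roughly like the following sketch, where get_text stands in for whatever would regenerate a column string for a packet (a hypothetical callback, as is the function name):

```python
def sort_rows_by_column(row_count, column, get_text):
    """Return row indices ordered by the text of one column.

    The column text is generated once per row in a single pass and kept
    only for the duration of the sort; this temporary list is where the
    memory risk for very large captures comes from.
    """
    keys = [get_text(row, column) for row in range(row_count)]
    return sorted(range(row_count), key=keys.__getitem__)
```

Only the one column being sorted on is materialized, and the key list can be dropped as soon as the row order is known.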