Ethereal-dev: Re: [Ethereal-dev] Memory allocation witchhunt??

Note: This archive is from the project's previous web site, ethereal.com. This list is no longer active.

From: Guy Harris <gharris@xxxxxxxxx>
Date: Thu, 28 Apr 2005 13:27:42 -0700
Visser, Martin wrote:

> (Curiously, if I try to exit Ethereal, the mem
> usage climbs up to 200MB, and goes up and down swapping madly before
> eventually I give up and just kill the process.)

See below for one thing that I think causes this (destroying the widget that displays the list of packets requires it to free the strings it allocated for each column of each row, and that can take a while as it drags tons of stuff into the cache).

> I know I could get more than my current 512MB RAM for not a whole lot of
> money, but I guess one always has to stop somewhere. Also I know that
> you can do a lot by simply streaming through tethereal. But does anyone
> see any value in going on a memory witchhunt? I assume that memory is
> mainly chewed up by the dissected structures. Are there any efficiencies
> to be made here?

One thing that would, I suspect, make a significant difference would be to change the widget that displays the packet list so that, when you add a row to the list, it doesn't make copies of the text in all the columns but instead calls back to a routine to get the contents of the columns.

I have a GTK+ widget, derived from the GTK+ 1.2.10 GtkClist widget, that does that; I haven't had a chance to work on it for a while, and don't remember what more needs to be done on the widget - other than making it work on GTK+ 2.x as well. (There is such a widget in GTK+ 2.x; however, I think it might be significantly slower than the GtkClist widget in other ways.)

The best way to use such a widget is to have the callback routine read and dissect the packet in question, generating the column values (but not the protocol tree). *If* random access to a capture file is reasonably efficient, that would *probably* work reasonably well when scrolling the packet list (especially if rapid scrolling causes the toolkit to compress updates, so that if a rapid scroll turns into a bunch of jumps, we don't try to dissect *all* packets that are nominally being dragged into view because they're not actually dragged into view).
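That callback model can be sketched in a few lines - in Python here rather than the GTK+ C it would actually be written in, with illustrative names that aren't real Ethereal code: the list stores nothing per row, and a cell's text is produced on demand by (re-)dissecting the packet, so only rows actually drawn cost anything.

```python
# Sketch of a "virtual" packet list: column text is generated on demand
# by a callback instead of being copied into the widget for every row.
# (All names here are illustrative; this is not the actual Ethereal code.)

class VirtualPacketList:
    def __init__(self, num_rows, get_columns):
        # get_columns(row) -> tuple of column strings; called lazily,
        # e.g. by re-reading and re-dissecting packet `row` from the file
        self.num_rows = num_rows
        self.get_columns = get_columns

    def cell_text(self, row, col):
        # Called only for rows actually scrolled into view, so packets
        # skipped over during a fast scroll are never dissected.
        return self.get_columns(row)[col]

# Fake "dissector" standing in for reading a packet and building its columns.
dissection_count = 0

def dissect(row):
    global dissection_count
    dissection_count += 1
    return (str(row + 1), "TCP", "port 80 -> 12345")

plist = VirtualPacketList(1_000_000, dissect)
print(plist.cell_text(999, 1))  # only packet 999 is dissected, not a million
```

The point of the sketch is the memory profile: a million-row list holds no strings at all until a row is drawn, where a GtkClist-style widget would hold copies of every column of every row.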

Unfortunately, although random access to an uncompressed file should be reasonably efficient, random access to a gzipped capture file is currently extremely inefficient. I think it can be made efficient - somebody at NetApp who'd implemented code to read compressed core dump files for NetApp appliances said you can save in memory the state of the decompression engine at various "checkpoints"; to go to an arbitrary place in the file, you go to the closest checkpoint before that place and read forward from there - but that hasn't been done yet.

(Doing so would mean that we couldn't use zlib to read compressed files, as I don't think it has a way to extract the decompression engine's state and set that state from stored information. If we did it ourselves, that would simplify the configure script code that currently copes with testing for zlib and for various zlib deficiencies and bugs, it'd mean that Ethereal would always support compressed files, and it'd mean we could suppress the checksum checking when reading compressed files from, I think, Windows Sniffer - or maybe it was Shomiti Surveyor - which doesn't write out the standard gzip checksum at the end of the file.)
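The checkpoint idea is easy to sketch in Python, whose zlib bindings happen to expose a decompressor-state snapshot via `decompressobj.copy()` (the function names below are illustrative, not anything in Ethereal): make one sequential pass saving the decompressor state every so often, then serve a random read by resuming from the nearest checkpoint at or before the target offset.

```python
import zlib

def build_index(compressed, chunk=256):
    """One sequential pass, saving (input offset, output offset,
    decompressor state) at every chunk boundary."""
    d = zlib.decompressobj()
    index = [(0, 0, d.copy())]
    out_off = 0
    for in_off in range(0, len(compressed), chunk):
        out_off += len(d.decompress(compressed[in_off:in_off + chunk]))
        index.append((in_off + chunk, out_off, d.copy()))
    return index

def read_at(compressed, index, target, length, chunk=256):
    """Random-access read: resume from the closest checkpoint at or
    before `target` and decompress forward only as far as needed."""
    in_off, out_off, state = max(
        (cp for cp in index if cp[1] <= target), key=lambda cp: cp[1])
    d = state.copy()  # don't consume the saved checkpoint state
    buf = b""
    while out_off + len(buf) < target + length and in_off < len(compressed):
        buf += d.decompress(compressed[in_off:in_off + chunk])
        in_off += chunk
    return buf[target - out_off:target - out_off + length]

data = bytes(range(256)) * 64          # 16KB of "capture file"
comp = zlib.compress(data)
idx = build_index(comp)
print(read_at(comp, idx, 5000, 4) == data[5000:5004])  # True
```

A real implementation would space checkpoints much further apart (they aren't free: each holds a copy of the decompression window) and would trade checkpoint spacing against worst-case seek cost.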

This would also probably mean that sorting by a column value could be *really* slow, depending on how many comparisons the sorting process does, as each comparison might have to dissect both packets to generate the column values. There are probably ways of making that faster.
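One obvious way, sketched below with a stand-in dissector (not real Ethereal code): generate each packet's column value once before sorting and sort on the cached values - a decorate-sort-undecorate - so a sort costs N dissections rather than one or two per comparison, i.e. O(N) instead of O(N log N) dissections.

```python
# Sketch of sorting the packet list without dissecting on every comparison:
# compute each packet's column value once, then sort on the cached values.
# (dissect() is an illustrative stand-in, not actual Ethereal code.)

dissections = 0

def dissect(frame_num):
    # Stand-in for reading and dissecting one packet to get a column value.
    global dissections
    dissections += 1
    return (frame_num * 37) % 101  # pretend this is, say, packet length

frames = list(range(1000))

# Decorate-sort-undecorate: one dissection per packet, not per comparison.
keys = {f: dissect(f) for f in frames}
frames.sort(key=keys.__getitem__)

print(dissections)  # 1000 - even though the sort does ~10,000 comparisons
```

The cache is exactly the kind of per-row string storage the virtual widget was supposed to avoid, but it only needs to live for the duration of the sort (or only for the sorted column) rather than for every column of every row forever.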

Another way to use such a widget - which wouldn't save as much memory, but would avoid the need to change the way we handle compressed files and column sorting - would be, when generating the packet list, to save the Protocol and Info column values as strings, and either save the address column values as strings or save addresses in a data structure (so that only one copy of each address seen is stored, with the structure also holding a pointer to a resolved name) and store pointers to the address in question in the frame_data structure. (That might also provide infrastructure for saving and restoring address-to-name tables.)

> I also notice that when you, say, run protocol hierarchy stats, you still
> have to run through all the dissectors again anyway, so is some of the
> stored info wasted anyway?

It shouldn't be. If the state already exists, dissectors shouldn't be re-generating it; most if not all don't re-generate it. If a re-dissection is done (e.g., after changing a protocol preference), the state should be discarded and re-generated, as it might change.

> I know that Richard Sharpe (and maybe others) occasionally runs Ethereal
> through a profiler to look for CPU hogs. I guess I wonder if (and how)
> memory profiling should also be done?

Just checking for leaks would help. There's a "leaks" tool in OS X; I did some checking for leaks, found a few, and plugged a few, but didn't have time to go further.

"leaks" can't, as far as I know, report on non-leaked memory, but there might be tools to do that. It'd definitely be worth doing.