Wireshark-dev: Re: [Wireshark-dev] Wireshark memory handling

From: Guy Harris <guy@xxxxxxxxxxxx>
Date: Mon, 5 Oct 2009 11:23:42 -0700

On Oct 5, 2009, at 8:01 AM, Håvar Aambø Fosstveit wrote:

We are a student group from the University of Science and Engineering in Norway, and are doing a project on handling large data sets, specifically Wireshark's issues with them. I have included part of our prestudy into the problem as an attachment, and we are wondering if anybody has some immediate thoughts regarding our plans for a solution.

The paper says

	Since exhausting the available primary memory is the problem ...

What does "primary memory" refer to here?

It later says

	An alternative for getting more memory than the machine's RAM is to use
memory-mapped files.

so presumably "primary memory" is referring to main memory, not to the sum of main memory and available backing store ("swap space"/paging files/swap files/whatever the OS calls it, plus the files that are mapped into the address space).

Presumably by "more memory than the machine's RAM" you mean "more memory than the machine's RAM plus the machine's swap space" - all the OSes on which Wireshark runs do demand paging, so Wireshark can use more memory than the machine has (unless the OS requires every page in RAM to have a swap-space page assigned to it, in which case it's limited to the available swap space).

In effect, using memory-mapped files allows the application to extend the available backing store beyond what's pre-allocated (note that OS X and Windows NT - "NT" as generic for all NT-based versions of Windows - both use files, rather than a fixed set of separate partitions, as backing store, and I think both will grow existing swap files or add new swap files as necessary; I know OS X does that), making more virtual memory available.

The right long-term fix for a lot of this problem is to figure out how to make Wireshark use less memory; we have some projects we're working on to do that, and there are some additional things that can be done if we support fast random access to all capture files (including gzipped capture files, so that involves some work). However, your scheme would provide a quicker solution for large captures that exhaust the available main memory and swap space, as long as you can intercept all the main allocators of main memory (the allocators in epan/emem.c can be intercepted fairly easily; the allocator used by GLib might be harder, but it still might be possible).