Wireshark-users: Re: [Wireshark-users] Running tshark on large pcap files

From: Sake Blok <sake@xxxxxxxxxx>
Date: Mon, 10 Jun 2013 09:53:39 +0200
On 10 jun 2013, at 09:14, Rayne wrote:

> I'm running tshark on a few large pcap files (each over 100GB in size) to extract packets belonging to a particular TCP/UDP port and write them to a file.
> 
> I noticed that when tshark first starts, it uses about 90-100% of the CPU, and the processing is pretty fast. However, as it continues, it uses more and more of the memory (the server has ~8GB of RAM) and eventually, the CPU load is down to 1% or less when it's using almost all of the memory. And it takes days to process one pcap file. I had to stop the processing because it was taking too long.
> 
> Does this behavior have anything to do with how tshark works on pcap files? Does tshark try to load the pcap into memory, and when memory runs out, it slows to a crawl? Is there any way I can make tshark run faster?

Tshark won't load the whole file, but it will keep state for the sessions it has seen, so its memory consumption will grow over time. On a 100GB trace file, I suspect it runs out of physical memory and starts using swap, hence the slowdown.

If all you need is TCP/UDP port filtering, you are better off with tcpdump: it does not keep state, and its BPF filter engine is pretty fast at filtering. You could use:

tcpdump -r infile.pcap -w outfile.pcap "tcp port 80 or tcp port 8080 or udp port 53"

Or something similar to your needs.
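
If the list of ports changes often, the filter string can be generated instead of typed by hand. The following is only a rough sketch: build_filter and filter_pcap are helper names made up for this example (they are not part of tcpdump), and the "proto/port" argument format is an assumption of mine.

```shell
#!/bin/sh
# Sketch: build a BPF filter from "proto/port" specs and apply it with tcpdump.
# build_filter and filter_pcap are hypothetical helper names, not tcpdump features.

# Turn arguments like "tcp/80 udp/53" into "tcp port 80 or udp port 53".
build_filter() {
    filter=""
    for spec in "$@"; do
        proto=${spec%%/*}   # text before the slash, e.g. "tcp"
        port=${spec##*/}    # text after the slash, e.g. "80"
        if [ -z "$filter" ]; then
            filter="$proto port $port"
        else
            filter="$filter or $proto port $port"
        fi
    done
    printf '%s\n' "$filter"
}

# Read one capture file and write the matching packets to another.
# Usage: filter_pcap infile.pcap outfile.pcap tcp/80 tcp/8080 udp/53
filter_pcap() {
    infile=$1
    outfile=$2
    shift 2
    tcpdump -r "$infile" -w "$outfile" "$(build_filter "$@")"
}
```

Calling filter_pcap infile.pcap outfile.pcap tcp/80 tcp/8080 udp/53 should run the same tcpdump command as above. With several large files, you could call it once per file in a loop.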

Cheers,
Sake