Wireshark-dev: [Wireshark-dev] Packet Loss due to Disk Contention with Running Dumpcap in a hig

From: John Powell <jrp999@xxxxxxxxx>
Date: Wed, 12 Dec 2012 13:33:22 -0600
Hi Everyone,

I am using DUMPCAP to capture packets in a high packet rate environment.

My operating system is: CENTOS 6.3

I am experiencing this problem on source-compiled versions: wireshark-1.6.12 and wireshark-1.8.4.

In order to allow DUMPCAP to be run as a NON-ROOT user I am using the following:
  • setcap 'CAP_NET_RAW+eip CAP_NET_ADMIN+eip' /usr/local/bin/dumpcap -v
The issue is that I am experiencing packet loss, apparently due to disk contention when writing the packets to disk - see attached file: packet-loss-atop.txt

To help alleviate the problem I have tried the following:
  • Disabled SELINUX
  • Disabled AUDIT
  • RAID 0 (striped disks) to load share the writing out of the data
    • ARRAY /dev/md2 level=raid0 num-devices=2
         devices=/dev/sda4,/dev/sdb4
  • Turn off journals on ext4
    • tune2fs -o journal_data_writeback /dev/md2
    • tune2fs -O ^has_journal /dev/md2
    • change fstab to:
      • UUID=.. /data   ext4    defaults,data=writeback         0 0
  • Use -B option on Dumpcap to buffer the data
    • root      /usr/local/bin/dumpcap -B 16 -i 2 -f "vlan and (not vrrp and not udp port 1985 and not ether host 01:00:0c:cc:cc:cc)" -g -b filesize:250000 -b duration:900 -w /data/eth1.cap
These changes have increased the throughput but I still experience packet loss - see attached IO Graph: packet-loss-io-graph.jpg
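
For a rough sense of how much headroom the options above give, the sketch below estimates how long the -B 16 buffer can absorb a disk stall and how often -b filesize:250000 rotates the output file. It rests on assumptions: the 71 Mbps input rate is the eth1 figure from the atop snapshot in this message, all of that traffic is assumed to pass the capture filter (worst case), -B is taken as MiB, filesize is taken as units of 1000 bytes, and per-packet pcap overhead is ignored. The function names are only illustrative.

```python
# Back-of-the-envelope estimates for the dumpcap options above.
# Assumptions: -B is in MiB, filesize is in units of 1000 bytes, and the
# full 71 Mbps seen on eth1 is written to disk (worst case).

def buffer_seconds(buffer_mib, rate_mbps):
    """Seconds until a capture buffer of buffer_mib MiB fills at rate_mbps."""
    buffer_bits = buffer_mib * 1024 * 1024 * 8
    return buffer_bits / (rate_mbps * 1_000_000)

def rotation_seconds(filesize_kb, rate_mbps, bytes_per_kb=1000):
    """Seconds until a capture file of filesize_kb is filled at rate_mbps."""
    bytes_per_second = rate_mbps * 1_000_000 / 8
    return filesize_kb * bytes_per_kb / bytes_per_second

if __name__ == "__main__":
    print(f"buffer absorbs a stall of about {buffer_seconds(16, 71):.2f} s")
    print(f"file rotates about every {rotation_seconds(250000, 71):.1f} s")
```

Under these assumptions the buffer covers just under two seconds of stalled writes, and the filesize limit triggers a rotation long before the 900 second duration limit does.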

The vendor solutions we have looked at do not decode UNISTIM signalling properly, which is a requirement for this tool.

Any suggestions on how to better configure either the operating system or wireshark to increase packet capture throughput will be greatly appreciated.

Thanks in advance for your assistance.

-John
ATOP - stc0033281                                  2012/12/12  12:35:16                                  ------                                    1s elapsed
PRC | sys    0.23s | user   0.18s | #proc    216  | #trun      2 | #tslpi   315 | #tslpu     1 | #zombie    1 | clones     0  |              | #exit      0 |
CPU | sys      17% | user     13% | irq       1%  | idle    116% | wait     53% |              | steal     0% | guest     0%  | curf 2.58GHz | curscal  81% |
cpu | sys      14% | user     12% | irq       1%  | idle     69% | cpu001 w  4% |              | steal     0% | guest     0%  | curf 2.00GHz | curscal  63% |
cpu | sys       5% | user      1% | irq       0%  | idle     49% | cpu000 w 44% |              | steal     0% | guest     0%  | curf 3.17GHz | curscal 100% |
CPL | avg1    0.80 | avg5    0.80 |               | avg15   0.81 |              | csw    25366 | intr   16698 |               |              | numcpu     2 |
MEM | tot     3.6G | free  155.2M | cache   3.1G  | dirty  84.2M | buff    2.3M | slab  196.6M |              |               |              |              |
SWP | tot     5.9G | free    5.0G |               |              |              |              |              |               | vmcom   1.7G | vmlim   7.7G |
PAG | scan    3456 |              | stall      0  |              |              |              |              | swin       0  |              | swout      0 |
MDD |          md2 | busy      0% | read       1  | write  15442 | KiB/r      4 | KiB/w      4 | MBr/s   0.00 | MBw/s  60.32  | avq     0.00 | avio 0.00 ms |
DSK |          sda | busy    107% | read       1  | write    205 | KiB/r      4 | KiB/w    506 | MBr/s   0.00 | MBw/s 101.33  | avq    93.88 | avio 4.51 ms |
DSK |          sdb | busy     92% | read       0  | write    191 | KiB/r      0 | KiB/w    511 | MBr/s   0.00 | MBw/s  95.50  | avq    86.84 | avio 4.20 ms |
NET | transport    | tcpi       2 | tcpo       2  | udpi       0 | udpo       0 | tcpao      0 | tcppo      0 | tcprs      0  | tcpie      0 | udpip      0 |
NET | network      | ipi        3 | ipo        2  | ipfrw      0 | deliv      2 |              |              |               | icmpi      0 | icmpo      0 |
NET | eth1      7% | pcki   40678 | pcko       0  | si   71 Mbps | so    0 Kbps | coll       0 | erri       0 | erro       0  | drpi       0 | drpo       0 |
NET | eth0      0% | pcki       3 | pcko       2  | si    2 Kbps | so   14 Kbps | coll       0 | erri       0 | erro       0  | drpi       0 | drpo       0 |
  PID                        RDDSK                        WRDSK                       WCANCL                        DSK                       CMD         1/1
 1123                           0K                        9892K                           0K                       100%                       dumpcap
  991                           4K                          36K                           0K                         0%                       flush-9:2
   18                           0K                           4K                           0K                         0%                       sync_supers
27471                           0K                           0K                           0K                         0%                       nautilus
 2842                           0K                           0K                           0K                         0%                       nautilus
 1133                           0K                           0K                           0K                         0%                       atop
 1128                           0K                           0K                           0K                         0%                       dumpcap
 2960                           0K                           0K                           0K                         0%                       wireshark
   38                           0K                           0K                           0K                         0%                       kswapd0
 1870                           0K                           0K                           0K                         0%                       kondemand/0
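
As a sanity check on the snapshot above, the sketch below puts the eth1 input rate and the disk write rates into the same unit. The numbers are copied from the NET, MDD and DSK lines; treating atop's MBw/s as directly comparable to Mbps divided by 8 is an assumption, and this is a single one-second sample, so the ratios are only indicative.

```python
# Compare the NIC input rate with the disk write rates in one atop sample.
# Values are taken from the snapshot above; the unit handling is an
# assumption (1 MB/s = 8 Mbps, ignoring MB vs MiB differences).

def mbps_to_mbytes_per_s(rate_mbps):
    """Convert megabits per second to megabytes per second."""
    return rate_mbps / 8

nic_mb_s = mbps_to_mbytes_per_s(71)        # eth1 'si' column
disk_mb_s = {"sda": 101.33, "sdb": 95.50}  # DSK 'MBw/s' column
md2_mb_s = 60.32                           # MDD 'MBw/s' column

total_disk_mb_s = sum(disk_mb_s.values())

if __name__ == "__main__":
    print(f"eth1 input:      {nic_mb_s:.2f} MB/s")
    print(f"md2 writes:      {md2_mb_s:.2f} MB/s ({md2_mb_s / nic_mb_s:.1f}x input)")
    print(f"sda+sdb writes:  {total_disk_mb_s:.2f} MB/s ({total_disk_mb_s / nic_mb_s:.1f}x input)")
```

In this sample the disks are writing many times the rate at which packets arrive. One possible reading is that writes reach the disks in bursts rather than as a steady stream, although a single sample cannot confirm that.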

Attachment: packet-loss-io-graph.JPG
Description: JPEG image