Joe Elliott wrote:
> If you have a dual-processor machine you can spread the
> load across both CPUs by doing (tcpdump used in the example):
> # tcpdump -i $ifName -w - -s $snapLen $filterCode | tcpdump -r - -w $file
> This binds 1 CPU to do the expensive kernel-to-user-space copy and 1
> processor to do the decode/write to disk.
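(As a side note, the second tcpdump needs `-r -` to read the capture stream from standard input, and on Linux the intended CPU split can be made explicit by pinning each process with taskset. A sketch, with illustrative core numbers and the placeholder variables from the post; requires root and a real capture interface:)

```shell
# Pin the capturing tcpdump to CPU 0 and the file-writing tcpdump to CPU 1.
# $ifName, $snapLen, $filterCode and $file are placeholders from the post.
taskset -c 0 tcpdump -i "$ifName" -w - -s "$snapLen" $filterCode |
    taskset -c 1 tcpdump -r - -w "$file"
```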
Presumably by "decode" you mean "copying" - "tcpdump -w" does no packet
decoding whatsoever.
Also, on what OSes does writing to a pipe and reading from that pipe not
involve a user-to-kernel copy for the write and a kernel-to-user copy
for the read? In
tcpdump -i $ifName -w $file -s $snapLen $filterCode
I see one kernel-to-user copy from the packet capture mechanism (unless
it's using some memory-mapped mechanism) and one user-to-kernel copy for
writing the file (unless the buffers are page-aligned and the write is
done by page-flipping), while in the pipeline I see an additional
kernel-to-user and user-to-kernel copy in the second process.
Perhaps what it's doing is running the capture effort and the file
system writing effort on separate CPUs, which, on an MP server, might
get you enough more parallelism (and perhaps enough less latency, which
might be what really matters here) to more than compensate for the extra
copy. (Your mileage may vary significantly on a multi-threaded processor.)
(In that case, it might be interesting to see whether a multi-threaded
capture program - which might be simpler than tcpdump, as all it'd do
would be capture packets and write them - would do better, by avoiding
the extra copies for the pipe. I don't know whether having the two
processors' caches both accessing the data would make a difference,
although the same issue might also come up for the pipe data, depending
on how clever the kernel is about that.)
> Finally, look at some of the ring-buffer techniques for libpcap that are
> becoming more popular. This is the final step: PF_RING, etc.
Yes, at least some of the problem might be with Linux PF_PACKET sockets
and the socket code, as per some of Luca Deri's papers:
http://luca.ntop.org/
and, in particular:
http://luca.ntop.org/Ring.pdf
which is the paper describing PF_RING, and which notes that, at least in
his tests, PF_PACKET sockets dropped a *LOT* more packets than FreeBSD's
BPF, which dropped a *LOT* more packets than WinPcap (on the same hardware).