Ethereal-dev: Re: [ethereal-dev] Re: Packet Sniffer Package
Note: This archive is from the project's previous web site, ethereal.com. This list is no longer active.
From: Guy Harris <gharris@xxxxxxxxxxxx>
Date: Fri, 3 Mar 2000 22:34:06 -0800
> > Does anyone know if libpcap under Linux uses the new, improved capture
> > routines automagically, or simply uses the 'lame' interface ...?
>
> The standard libpcap under Linux uses the oldest, lamest interface -
> SOCK_PACKET sockets with an address/protocol family of AF_INET/PF_INET.
>
> A new one being done will use, on the Linux 2.2 and later kernels, the
> better mechanism that the 2.2 kernel added, namely SOCK_RAW sockets
> with an address/protocol family of AF_PACKET/PF_PACKET. I forget
> whether the guy working on that checked it into the tcpdump.org CVS tree
> yet or not.

Torsten Landschoff is the person who's working on the new libpcap Linux
module; he's checked it into a branch of the tcpdump.org CVS tree for
libpcap, but it's a side branch - it's neither in the main branch, nor the
branch for an 0.5 release, yet. I don't know whether it's intended to go
into an 0.5 release (or when an 0.5 release will come out).

> I don't know whether that's the 1-copy mechanism to which you're
> referring, though.

From looking at the code path from the Intel EEPro100 driver to the code
that dispatches received packets, it looks as if, for a packet that is
handed only to libpcap, or is handed to regular protocols as well as
libpcap but isn't modified by those protocols, the only copy that should
be involved is the copy to userland. However, I think that should happen
regardless of which interface is used, even the oldest, lamest one, at
least on 2.2 (I don't have 2.0[.x] kernel source handy right now).

In any case, the particular lameness that seemed to be discussed in the
mail thread you forwarded is the lack of a way of finding out how many
packets were dropped; that's not an issue of the number of copies (except
that too many copies might, in some situations, eat enough CPU to cause
packets to be dropped when they wouldn't have been dropped without the
copy).
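[Archive editor's note: a minimal sketch, not from the original mail, of the 2.2-and-later capture interface discussed above - in Python for brevity (the real libpcap code is C), assuming a Linux system; actually opening the socket needs CAP_NET_RAW (e.g. root), so the sketch just reports failure otherwise.]

```python
import socket

# "Give me frames of every protocol" - value of ETH_P_ALL from
# <linux/if_ether.h>; the protocol argument is in network byte order.
ETH_P_ALL = 0x0003

def open_capture_socket():
    """Open a 2.2+-style raw packet socket (SOCK_RAW on AF_PACKET,
    a.k.a. PF_PACKET), as opposed to the old SOCK_PACKET interface."""
    return socket.socket(socket.AF_PACKET, socket.SOCK_RAW,
                         socket.htons(ETH_P_ALL))

if __name__ == "__main__":
    try:
        s = open_capture_socket()
        print("opened PF_PACKET/SOCK_RAW capture socket")
        s.close()
    except OSError as e:
        print("cannot open capture socket (need CAP_NET_RAW):", e)
```

The old interface differed only in the first two arguments: `socket(AF_INET, SOCK_PACKET, ...)`, with the interface selected by name in the address structure rather than by a proper `sockaddr_ll`.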
It looks as if there's a global counter, "netdev_rx_dropped", of all
packets dropped in "netif_rx()", which appears to be the routine called by
device drivers to hand an incoming frame to the network stack. The only
way I see of fetching that is something that I presume is a "/proc" entry;
if so, a read from it returns various statistics, including
"netdev_rx_dropped". This won't tell you how many of the packets a
particular libpcap stream would have seen got dropped - they get dropped
before even being chosen to be handed to a particular stream.

The mechanism for handing raw packets up to userland is a socket; as one
might expect, the socket has a receive high water mark, and stuff gets
tossed if the socket's receive buffer is full and something comes in
(which would happen if the application using libpcap can't read stuff
fast enough). It appears that no count is kept of packets discarded
because the socket buffer is too full.

Torsten's code doesn't provide any packet-drop count - and I'm not sure it
can reliably report such a count, as there doesn't appear to be a count of
packets dropped at the socket layer.

> Alexey Kuznetzov has a patch to add a mechanism
> that, as I understand it, lets the kernel and the application share a
> memory-mapped region, so that incoming packets don't have to get copied
> up to userland;

It doesn't look as if this eliminates the copy. What it appears to do is
provide a chunk of wired-down memory shared between the kernel and
userland, and copies incoming packets to that area. This does let it keep
track of packet drops due to the shared area being full of packets not yet
processed by the userland code. Packet drops in "netif_rx()" would have to
be counted by getting the value of "netdev_rx_dropped" at the start of the
capture and at the end of the capture, and adding the difference to the
count of packet drops due to "buffer full".
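[Archive editor's note: the "receive high water mark" above is the socket receive buffer, SO_RCVBUF. A sketch, not from the original mail, on an unprivileged UDP socket - which behaves the same way at this layer: once the buffer fills, further incoming packets are silently discarded, with no per-socket drop count.]

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

# Ask for a 64 KB receive buffer; this is the high water mark past
# which the kernel drops incoming packets for this socket.
s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 65536)

# Linux typically reports back double the requested size, to account
# for its own bookkeeping overhead.
rcvbuf = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
print("receive high water mark:", rcvbuf, "bytes")
s.close()
```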
However, that would count packets dropped on interfaces other than the one
on which you're listening, if the machine has more than one interface -
those drops are due to the system being so busy that even the kernel code
that handles packets queued up at interrupt level and processed later
can't handle them. I don't know whether that's a common occurrence, but if
somebody wants to know about every single packet dropped, that's a
problem.

Of course, I don't know whether, on a *really* busy system - too busy to
even drain the device's ring buffer in the interrupt handler - packets
dropped because they arrive when the device's ring buffer is filled are
counted, or if the device even tells you how many packets are dropped due
to that, so perhaps nobody does a *perfect* job.

The BPF mechanism in the BSDs has its own buffer, and the link-layer
driver hands all incoming packets to BPF, which can keep track of every
packet that gets dropped on a particular BPF device, as the only reason
(other than "the device dropped it because the ring buffer is full") why a
packet is dropped on a BPF device is "the BPF device's buffer was full".
So, whilst it may not handle the "device ring buffer full" case (although
*if* the device reports how often that happened, a mechanism could
conceivably be provided to let the device bump the drop count - no such
mechanism exists, however), it does at least handle all other drops.

(Alexey's patch also lets you pick up the time stamp for the packet
without making a second system call; the socket-based stuff requires you
to do an SIOCGSTAMP "ioctl" to get the time stamp.)

Obviously, said patch is a kernel patch, so libpcap cannot, by itself, fix
that problem.

> he also has patches to the old libpcap that use that
> mechanism if present, and otherwise use the 2.2-and-later mechanism if
> present,

It does appear to do that...

> otherwise, I think, fall back on the old 2.0 mechanism.

...but it doesn't do that (i.e., it's 2.2-and-later only).
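[Archive editor's note: a sketch, not from the original mail, of the SIOCGSTAMP round trip mentioned in the parenthetical above - fetching the kernel's timestamp for the last received packet with a second system call. In Python, on an ordinary UDP socket, which supports the same ioctl; the 0x8906 request value is from Linux's <asm/sockios.h> and is an assumption about the running kernel.]

```python
import fcntl
import socket
import struct

SIOCGSTAMP = 0x8906  # "get time stamp of last received packet"

# Receive one datagram over loopback so there is a packet to stamp.
recv_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
recv_sock.bind(("127.0.0.1", 0))
send_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send_sock.sendto(b"ping", recv_sock.getsockname())
recv_sock.recv(16)

# The ioctl fills in a struct timeval { long tv_sec; long tv_usec; }.
buf = fcntl.ioctl(recv_sock.fileno(), SIOCGSTAMP,
                  struct.pack("@ll", 0, 0))
tv_sec, tv_usec = struct.unpack("@ll", buf)
print("packet received at %d.%06d" % (tv_sec, tv_usec))

send_sock.close()
recv_sock.close()
```

This extra ioctl per packet is exactly the system call the shared-memory mechanism avoids, since the kernel can write the timestamp into the shared region alongside the packet.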
> In addition, he says that some such mechanism was checked into the 2.3
> kernel at some point.

There is such a mechanism; it looks similar, and does involve a single
copy.

(None of these mechanisms implement the timeout mechanism that Ethereal
requires - and the lack of which we work around, on Linux, with a
"select()" - but, as the way you block waiting for a packet to arrive with
the shared-memory mechanism is you do a "poll()", I think they *could*
implement it as a timeout on the "poll()".)

The BPF mechanism in the BSDs currently requires two copies - the mbuf
chain for the incoming packet is copied to an internal buffer, and the
stuff from that buffer is copied up to userland on a read. A shared-memory
mechanism similar to the Linux ones could be implemented, I suspect.
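[Archive editor's note: a sketch, not from the original mail, of the poll()-with-a-timeout idea in the parenthetical above - block waiting for a packet, but give up after a timeout rather than relying on a kernel-side read timeout. In Python; the 100 ms value is an arbitrary stand-in for the capture read timeout.]

```python
import select
import socket
import time

# A bound UDP socket stands in for the capture socket; nothing will
# send to it, so the poll() below is guaranteed to time out.
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.bind(("127.0.0.1", 0))

p = select.poll()
p.register(s.fileno(), select.POLLIN)

start = time.monotonic()
events = p.poll(100)  # timeout in milliseconds
elapsed = time.monotonic() - start

if not events:
    # No packet arrived in time: return to the caller so it can
    # deliver whatever has been buffered so far (Ethereal's need).
    print("no packet within %.0f ms, returning to caller" % (elapsed * 1000))

s.close()
```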