Wireshark-users: Re: [Wireshark-users] Pcap files

From: Guy Harris <guy@xxxxxxxxxxxx>
Date: Fri, 16 Oct 2009 18:23:45 -0700

On Oct 16, 2009, at 6:10 PM, Rayne wrote:

I noticed that every pcap file, even the empty ones without any packets, contains a 24-byte "header" at the beginning of the file. At least 3 of the bytes vary from file to file, and the rest appear to be the same, at least in the files I've seen. If I omit these 24 bytes from the file, Wireshark doesn't recognize the file as a pcap anymore.

So I guess these 24 bytes are to indicate that the file is of libpcap format, but does anyone know what these 24 bytes are in detail, i.e. what they represent?

On a machine with an OS that includes libpcap 1.0.0 or later (OS X Snow Leopard, in this case, although recent versions of some Linux distributions might also have it now, and recent versions of some *BSDs might as well):

$ man 5 pcap-savefile
PCAP-SAVEFILE(5)                                              PCAP-SAVEFILE(5)



NAME
       pcap-savefile - libpcap savefile format

DESCRIPTION
       NOTE: applications and libraries should, if possible, use libpcap to
       read savefiles, rather than having their own code to read savefiles.
       If, in the future, a new file format is supported by libpcap,
       applications and libraries using libpcap to read savefiles will be
       able to read the new format of savefiles, but applications and
       libraries using their own code to read savefiles will have to be
       changed to support the new file format.

       ``Savefiles'' read and written by libpcap and applications using
       libpcap start with a per-file header.  The format of the per-file
       header is:

              +------------------------------+
              |        Magic number          |
              +--------------+---------------+
              |Major version | Minor version |
              +--------------+---------------+
              |      Time zone offset        |
              +------------------------------+
              |     Time stamp accuracy      |
              +------------------------------+
              |       Snapshot length        |
              +------------------------------+
              |   Link-layer header type     |
              +------------------------------+
       All fields in the per-file header are in the byte order of the host
       writing the file.  The first field in the per-file header is a 4-byte
       magic number, with the value 0xa1b2c3d4.  The magic number, when read
       by a host with the same byte order as the host that wrote the file,
       will have the value 0xa1b2c3d4, and, when read by a host with the
       opposite byte order as the host that wrote the file, will have the
       value 0xd4c3b2a1.  That allows software reading the file to determine
       whether the byte order of the host that wrote the file is the same as
       the byte order of the host on which the file is being read, and thus
       whether the values in the per-file and per-packet headers need to be
       byte-swapped.
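
(An aside, not part of the man page: if you really do have to check the magic number yourself, one way to sketch it in Python is to unpack the first 4 bytes in both byte orders and see which one yields 0xa1b2c3d4. That amounts to the same test as comparing a host-order read against 0xd4c3b2a1. The names detect_byte_order and PCAP_MAGIC are made up for this illustration; only the magic value comes from the man page.)

```python
import struct

PCAP_MAGIC = 0xA1B2C3D4  # value given in pcap-savefile(5); the name is illustrative


def detect_byte_order(first_four_bytes: bytes) -> str:
    """Return the struct prefix ('<' little-endian, '>' big-endian) of the writer."""
    if len(first_four_bytes) != 4:
        raise ValueError("need exactly the first 4 bytes of the file")
    for prefix in ("<", ">"):
        # Interpret the same 4 bytes in each byte order in turn.
        (value,) = struct.unpack(prefix + "I", first_four_bytes)
        if value == PCAP_MAGIC:
            return prefix
    raise ValueError("magic number not recognized; not a file of this format")
```

The returned prefix can then be passed to struct.unpack() for every other field in the per-file and per-packet headers, so no explicit byte swapping is needed.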

       Following this are:

              A 2-byte file format major version number; the current version
              number is 2.

              A 2-byte file format minor version number; the current version
              number is 4.

              A 4-byte time zone offset; this is always 0.

              A 4-byte number giving the accuracy of time stamps in the file;
              this is always 0.

              A 4-byte number giving the "snapshot length" of the capture;
              packets longer than the snapshot length are truncated to the
              snapshot length, so that, if the snapshot length is N, only the
              first N bytes of a packet longer than N bytes will be saved in
              the capture.

              A 4-byte number giving the link-layer header type for packets
              in the capture; see pcap-linktype(7) for the LINKTYPE_ values
              that can appear in this field.
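
(Another aside, not part of the man page: the 24 bytes described above could be unpacked in Python roughly as follows. This is only a sketch. The FileHeader type and its field names are invented for the example, and it treats the time zone offset as signed and the other fields as unsigned, which is an assumption; the man page gives only the sizes.)

```python
import struct
from typing import NamedTuple

PCAP_MAGIC = 0xA1B2C3D4  # magic value from pcap-savefile(5)


class FileHeader(NamedTuple):
    # Hypothetical field names chosen for this sketch.
    byte_order: str       # '<' or '>', as used by the struct module
    version_major: int
    version_minor: int
    tz_offset: int
    ts_accuracy: int
    snapshot_length: int
    link_type: int


def parse_file_header(data: bytes) -> FileHeader:
    """Decode the 24-byte per-file header at the start of `data`."""
    if len(data) < 24:
        raise ValueError("per-file header is 24 bytes; got fewer")
    for byte_order in ("<", ">"):
        # 4-byte magic, two 2-byte versions, then four 4-byte fields = 24 bytes.
        fields = struct.unpack(byte_order + "IHHiIII", data[:24])
        if fields[0] == PCAP_MAGIC:
            return FileHeader(byte_order, *fields[1:])
    raise ValueError("magic number not recognized")
```

For example, parse_file_header(open(path, "rb").read(24)).link_type would give the LINKTYPE_ value, where path is whatever capture file you have at hand.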

       Following the per-file header are zero or more packets; each packet
       begins with a per-packet header, which is immediately followed by the
       raw packet data.  The format of the per-packet header is:

              +---------------------------------------+
              |      Time stamp, seconds value        |
              +---------------------------------------+
              |    Time stamp, microseconds value     |
              +---------------------------------------+
              |    Length of captured packet data     |
              +---------------------------------------+
              |Un-truncated length of the packet data |
              +---------------------------------------+
       All fields in the per-packet header are in the byte order of the host
       writing the file.  The per-packet header begins with a time stamp
       giving the approximate time the packet was captured; the time stamp
       consists of a 4-byte value, giving the time in seconds since January
       1, 1970, 00:00:00 UTC, followed by a 4-byte value, giving the time in
       microseconds since that second.  Following that are a 4-byte value
       giving the number of bytes of captured data that follow the per-packet
       header and a 4-byte value giving the number of bytes that would have
       been present had the packet not been truncated by the snapshot length.
       The two lengths will be equal if the number of bytes of packet data
       are less than or equal to the snapshot length.
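
(One more aside, not part of the man page: walking the packets as described above might look like this in Python. It is a sketch under the layout given in this man page only; iter_packets is a made-up name, all four per-packet fields are assumed to be unsigned, and nothing here handles any other file format.)

```python
import io
import struct
from typing import BinaryIO, Iterator, Tuple

PCAP_MAGIC = 0xA1B2C3D4  # magic value from pcap-savefile(5)


def iter_packets(stream: BinaryIO) -> Iterator[Tuple[int, int, int, bytes]]:
    """Yield (seconds, microseconds, untruncated_length, data) for each packet."""
    file_header = stream.read(24)
    if len(file_header) < 24:
        raise ValueError("truncated per-file header")
    # Work out the writer's byte order from the magic number.
    for byte_order in ("<", ">"):
        if struct.unpack(byte_order + "I", file_header[:4])[0] == PCAP_MAGIC:
            break
    else:
        raise ValueError("magic number not recognized")
    while True:
        packet_header = stream.read(16)
        if not packet_header:
            return  # clean end of file: zero or more packets were read
        if len(packet_header) < 16:
            raise ValueError("truncated per-packet header")
        seconds, microseconds, captured_len, untruncated_len = struct.unpack(
            byte_order + "IIII", packet_header
        )
        # The raw packet data immediately follows its header.
        data = stream.read(captured_len)
        if len(data) < captured_len:
            raise ValueError("truncated packet data")
        yield seconds, microseconds, untruncated_len, data
```

A file containing only the 24-byte header yields no packets, which matches the "empty" files mentioned in the question.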

SEE ALSO
       pcap(3PCAP), pcap-linktype(7)

(Note: before writing code to read this, PLEASE pay attention to the first paragraph on the man page. Unless you have a compelling reason to read this file format yourself, rather than letting some existing code read it, and unless you either promise not to complain if tools start writing pcap-NG files or are willing to update your code to read pcap-NG files at some point, leave it up to libpcap/WinPcap or Wireshark's Wiretap library to do the reading. If you're reading in a program written in Perl/Python/Ruby/Java/some .Net language/etc., see

	http://en.wikipedia.org/wiki/Pcap#Wrappers_for_use_of_libpcap.2FWinPcap_in_languages_other_than_C_and_C.2B.2B

and

	http://en.wikipedia.org/wiki/Pcap#External_links

for information on wrappers for libpcap/WinPcap for your favorite language.)