Ethereal-dev: Re: [Ethereal-dev] on-the-fly unzipping of compressed captures?

Note: This archive is from the project's previous web site, ethereal.com. This list is no longer active.

From: "Guy Harris" <gharris@xxxxxxxxx>
Date: Thu, 18 Aug 2005 18:24:12 -0700 (PDT)

Ulf Lamping wrote:
> Jeff Morriss wrote:
>> I noticed that with 0.10.12 if I load a compressed (e.g., gzip'd) file
>> the file size in the progress window grows as the file is read in.  I
>> guess it's being uncompressed on the fly now?

Nothing in the way the file is decompressed has changed.

The decompression is done with libz's support for reading a compressed
stream.  That means that it's done "on the fly" in the sense that Ethereal
does *NOT* decompress the entire file into some temporary location and
read from that decompressed file.
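
To illustrate what "on the fly" means here, a minimal sketch in Python
(not Ethereal's C code, and not libz's C API; the function name
"count_uncompressed_bytes" and the chunk size are made up for the
example).  The compressed stream is decompressed chunk by chunk as it is
read, and nothing is written to a temporary file:

```python
import gzip


def count_uncompressed_bytes(path, chunk_size=64 * 1024):
    """Stream-decompress `path` and return the number of uncompressed bytes."""
    total = 0
    # gzip.open() decompresses as we read; no temporary file is created.
    with gzip.open(path, "rb") as stream:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break  # end of the compressed stream
            total += len(chunk)
    return total
```

Note that the total is only known once the loop has finished.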

> Explanation is simple.
>
> If you look at the file sizes in the Status row you may have noticed
> that both values represent number of bytes (KB/MB). The first one should
> be the currently processed value, the second one should be the "end
> value" (100%). You may load an uncompressed file and have a look at how
> it should be (it's working correct).
>
> But with compressed files both values are permanently increasing.

The problem is that the progress bar code expects "f_len" to be the
size, in bytes, of the file.  "Size" here means "how big does the
operating system think it is", so if a 5000-byte file is compressed to
2700 bytes, "size" is 2700 bytes.  (The progress bar code works with raw
offsets in the compressed byte stream in the file, and with the raw size
of the file, as those can be determined beforehand.  The uncompressed
size is known only after the entire file has been decompressed, which
would mean we'd have no idea how much progress we've made until we're
done.)
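
A rough sketch of that kind of progress calculation, again in Python
rather than Ethereal's actual code ("progress_fractions" is a
hypothetical name).  Progress is the raw offset in the compressed file
divided by the on-disk size from stat(), both of which are available
without decompressing everything first:

```python
import gzip
import os


def progress_fractions(path, chunk_size=64 * 1024):
    """Yield progress in [0, 1], measured in compressed bytes consumed."""
    size = os.stat(path).st_size  # on-disk size, known before reading
    with open(path, "rb") as raw, gzip.GzipFile(fileobj=raw) as stream:
        while True:
            chunk = stream.read(chunk_size)
            # raw.tell() is how far into the compressed file we have read.
            # It can run a little ahead of the decompressor because the
            # gzip module reads the raw file in blocks.
            yield raw.tell() / size if size else 1.0
            if not chunk:
                break  # end of stream; the raw file has been read to EOF
```

Because the denominator is fixed up front, the fraction can only move
toward 1.0 as reading proceeds.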

We now use, in some places, the number of bytes of uncompressed data in
the file.  That's computed as we read (and decompress) the file.

That's OK - but, unfortunately, that's stored in "f_len", so, as we're
reading the file, "f_len" represents the number of *uncompressed* bytes of
file data we've read *so far*.

I checked in a fix, with "f_datalen" being the number of bytes of
uncompressed data we've read from the file so far (so it's not the file's
size until we're done reading the file), and "f_len" being the total size
of the file according to "stat()".
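
The separation looks roughly like the following sketch.  This is Python,
not the C change that was checked in; only the field names "f_len" and
"f_datalen" come from the real code, while "CaptureFile" and
"read_capture" are invented for the illustration:

```python
import gzip
import os
from dataclasses import dataclass


@dataclass
class CaptureFile:
    f_len: int = 0      # size of the file on disk, per stat(); set once
    f_datalen: int = 0  # uncompressed bytes read so far; grows while reading


def read_capture(path, chunk_size=64 * 1024):
    """Read a gzip'd file, keeping the two byte counts separate."""
    cf = CaptureFile(f_len=os.stat(path).st_size)
    with gzip.open(path, "rb") as stream:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            cf.f_datalen += len(chunk)  # f_len is never touched here
    return cf
```

With the 5000-byte example above, "f_datalen" would end up at 5000 once
the read completes, while "f_len" stays at the compressed on-disk size
the whole time.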