Wireshark-dev: Re: [Wireshark-dev] possible memory error in the SnifferDecompress function?

From: Lewis Burns <lewisurn@xxxxxxxxx>
Date: Tue, 09 Sep 2014 17:25:26 -0700
Hi Guy,

Thanks for the tip. Another run with valgrind after using -g with gcc shows the line number. They point to the places that you suspected. I'll file a bug for both invalid write and memory overlap.

==17888== Memcheck, a memory error detector
==17888== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==17888== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==17888== Command: ./a.out
==17888==
==17888== Source and destination overlap in memcpy(0x51cc8a7, 0x51cc82b, 146)
==17888==    at 0x4C2A690: memcpy (mc_replace_strmem.c:838)
==17888==    by 0x40098D: SnifferDecompress (ngsniffer_noklee.c:187)
==17888==    by 0x400B38: main (ngsniffer_noklee.c:250)
==17888==
==17888== Source and destination overlap in memcpy(0x51d752d, 0x51d7522, 13)
==17888==    at 0x4C2A690: memcpy (mc_replace_strmem.c:838)
==17888==    by 0x400A34: SnifferDecompress (ngsniffer_noklee.c:216)
==17888==    by 0x400B38: main (ngsniffer_noklee.c:250)
==17888==

Cheers,
Lewis

On 09/09/2014 05:09 PM, Guy Harris wrote:
On Sep 9, 2014, at 3:11 PM, Lewis Burns <lewisurn@xxxxxxxxx> wrote:

We've recently done some unit testing on open source projects. One of issues we've found is related to the SnifferDecompress function in the wiretap/ngsniffer.c file. We're unable to determine that the memory issues shown by valgrind can actually appear in the program due to our unfamiliarity with the code base. I'm sending in a small testcase to the list and hoping that some developers can validate or invalidate that this is a bug in the code.
There's one place where the code isn't doing bounds checking, which probably accounts for

==5795== Invalid write of size 1
==5795==    at 0x400798: SnifferDecompress (in /home/chaoqiang/workspace/se/klee/exp/a.out)
==5795==    by 0x400B6F: main (in /home/chaoqiang/workspace/se/klee/exp/a.out)
==5795==  Address 0x521d080 is 0 bytes after a block of size 65,536 alloc'd
==5795==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==5795==    by 0x400AE8: main (in /home/chaoqiang/workspace/se/klee/exp/a.out)
as, when I tried reproducing this, with a version of the test program built with -g, it reported

==55119== Invalid write of size 1
==55119==    at 0x100001618: SnifferDecompress (ngsniffer_noklee.c:90)
==55119==    by 0x100001E32: main (ngsniffer_noklee.c:250)
==55119==  Address 0x10003c150 is 0 bytes after a block of size 65,536 alloc'd
==55119==    at 0xC658: malloc (vg_replace_malloc.c:295)
==55119==    by 0x100001D83: main (ngsniffer_noklee.c:241)

which is the "*(pout++) = *(pin++);" in

		/* Use the bits in bit_value to see what's encoded and what is raw data */
		if ( !(bit_mask & bit_value) )
		{
			/* bit not set - raw byte we just copy */
			*(pout++) = *(pin++);
		}

which, unlike other code copying to the output buffer, doesn't do a bounds check.

I couldn't reproduce the overlap complaint, but perhaps valgrind doesn't replace memcpy() on OS X or perhaps its replacement memcpy() doesn't check for overlaps.  I *suspect* the overlap problems come from

			case 2  :   /* LZ77 long strings */
				/*
				  Low 4 bits of offset to string is the low nybble of the
				  first code byte, upper 8 bits of offset is in
				  the next byte.
				  Length of string immediately follows.
				  Total code size: 3 bytes.
				*/
				offset = code_low + ((unsigned int)(*pin++) << 4) + 3;
				/* If we are already at end of input, there is no byte
				   to repeat */
				if ( pin >= pin_end )
				{
					*err = WTAP_ERR_UNC_TRUNCATED;	 /* data was oddly truncated */
					return ( -1 );
				}
				/* Check if offset would put us back past begin of buffer */
				if ( pout - offset < outbuf )
				{
					*err = WTAP_ERR_UNC_BAD_OFFSET;
					return ( -1 );
				}

				/* get length from next byte, make sure it won't overrun buf */
				length = (unsigned int)(*pin++) + 16;
				if ( pout + length > pout_end )
				{
					*err = WTAP_ERR_UNC_OVERFLOW;
					return ( -1 );
				}

				/* Copy the string from previous text to output position,
				   advance output pointer */
				memcpy( pout, pout - offset, length );
				pout += length;
				break;
			default :   /* (3 to 15): LZ77 short strings */
				/*
				  Low 4 bits of offset to string is the low nybble of the
				  first code byte, upper 8 bits of offset is in
				  the next byte.
				  Length of string to repeat is overloaded into code_type.
				  Total code size: 2 bytes.
				*/
				offset = code_low + ((unsigned int)(*pin++) << 4) + 3;
				/* Check if offset would put us back past begin of buffer */
				if ( pout - offset < outbuf )
				{
					*err = WTAP_ERR_UNC_BAD_OFFSET;
					return ( -1 );
				}

				/* get length from code_type, make sure it won't overrun buf */
				length = code_type;
				if ( pout + length > pout_end )
				{
					*err = WTAP_ERR_UNC_OVERFLOW;
					return ( -1 );
				}

				/* Copy the string from previous text to output position,
				   advance output pointer */
				memcpy( pout, pout - offset, length );
				pout += length;
				break;

where it's *supposed* to be copying a string from earlier in the decompressed output (thus being before the current position in the output, and thus not overlapping), but isn't checking to make sure that the source is fully within the already-produced portion of the output.

I'll check in fixes for both of those problems.

Thanks.

Steps to repeat the issue:
 gcc -DRANDOM ngsniffer_noklee.c
As per the above

	gcc -g -DRANDOM ngsniffer_noklee.c

may produce better debugging output from valgrind, with line numbers.
___________________________________________________________________________
Sent via:    Wireshark-dev mailing list <wireshark-dev@xxxxxxxxxxxxx>
Archives:    http://www.wireshark.org/lists/wireshark-dev
Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev
             mailto:wireshark-dev-request@xxxxxxxxxxxxx?subject=unsubscribe