Wireshark-bugs: [Wireshark-bugs] [Bug 10214] ASAN: global-buffer-overflow in _ws_mempbrk_sse42

Date: Tue, 24 Jun 2014 11:39:06 +0000

Comment # 8 on bug 10214 from
On second thought, the garbage copy does not matter: it first looks for a
terminating NUL, and anything after that is ignored.
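
To illustrate why bytes copied after the NUL are harmless, here is a minimal
scalar model. It assumes "it" above refers to an implicit-length comparison
that treats the 16-byte needle block as ending at the first NUL. The name
block_contains is hypothetical and is not part of Wireshark.

```c
#include <stddef.h>

/* Scalar model (hypothetical helper): the 16-byte block is a needle set
 * that ends at the first NUL, so later bytes are never examined. */
static int block_contains(const unsigned char block[16], unsigned char c)
{
    size_t i;

    for (i = 0; i < 16 && block[i] != '\0'; i++) {
        if (block[i] == c)
            return 1;
    }
    return 0;
}
```

With a block holding 'x', a NUL and then leftover bytes, only 'x' counts as a
needle; the leftover bytes never match.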

Here are some results from attempts to optimize it, using n=1e9 iterations, a
haystack of 16 bytes ('x') and a needle of 1 byte ('x'). The CPU cycle count
(rdtsc) is used as the indicator: each figure is cycles per call (multiply by
1e9 for the total that was actually measured). The first number is a normal
build, the second is with ASAN enabled, and the third is a normal build with a
needle of 8 bytes.

18.1 (22.8; 18.1) - current SSE implementation
32.3 (45.0; 44.2) - test and copy without memcpy (aliasing the mask as char *,
skipping _mm_load_si128)
33.2 (47.2; 51.5) - test and copy without memcpy
37.2 (66.0; 42.8) - loop with ptr
38.5 (69.0; 50.1) - loop through needles
45.0 (90.0; 43.3) - naive implementation with strlen (weird, a longer string is
faster each time?!)
46.3 (100.; 47.3) - memchr to find needles length, subtract ptrs for length
50.9 (74.6; 65.3) - _ws_strpbrk
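
A harness for this kind of measurement could look roughly like the sketch
below. It is not the code that produced the numbers above. The names ticks,
scalar_mempbrk and ticks_per_call are hypothetical, and scalar_mempbrk is only
a plain stand-in for the function under test. On non-x86 targets the sketch
falls back to clock(), which does not count CPU cycles.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>
/* Read the time stamp counter (GCC/Clang intrinsic). */
static uint64_t ticks(void) { return __rdtsc(); }
#else
/* Assumption: no rdtsc available, so use processor time instead. */
static uint64_t ticks(void) { return (uint64_t)clock(); }
#endif

typedef const char *(*pbrk_fn)(const char *s, size_t slen, const char *needles);

/* Stand-in search: first byte of s[0..slen) that equals any needle. */
static const char *scalar_mempbrk(const char *s, size_t slen, const char *needles)
{
    size_t i;
    const char *n;

    for (i = 0; i < slen; i++) {
        for (n = needles; *n != '\0'; n++) {
            if (s[i] == *n)
                return s + i;
        }
    }
    return NULL;
}

/* Average ticks per call over n calls of fn. */
static double ticks_per_call(pbrk_fn fn, const char *s, size_t slen,
                             const char *needles, uint64_t n)
{
    const char *volatile sink = NULL; /* keeps the calls from being dropped */
    uint64_t i, start, end;

    start = ticks();
    for (i = 0; i < n; i++)
        sink = fn(s, slen, needles);
    end = ticks();
    (void)sink;
    return (double)(end - start) / (double)n;
}
```

A call such as ticks_per_call(scalar_mempbrk, "xxxxxxxxxxxxxxxx", 16, "x", n)
corresponds to the first column; the ASAN column would come from rebuilding the
same program with -fsanitize=address.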

---

I'll make an ASAN build use one of these changes rather than disabling
everything. The performance advantage is still something to consider even after
adding the ASAN quirk.

---

The following variants replace the alignment-check branches for the needles:
#define FALLBACK return _ws_mempbrk(s, slen, a)
// strlen:
length = strlen(a);
if (length > 16)
    FALLBACK;

// memchr:
char *p = memchr(a, '\0', 16);
if (p == NULL)
    FALLBACK;
length = p - a;

// loop through needles:
length = 0;
for (i = 0; i < 16; i++) {
    if (a[i] == '\0') {
        length = i;
        break;
    }
}
if (length == 0)
    FALLBACK;

// loop through ptrs (assumes small needle)
const char *p = a;
while (*p)
    p++;
length = p - a;
if (length > 16)
    FALLBACK;

// each of the replacements above is followed by:
__m128i a128 = _mm_setzero_si128();
memcpy(&a128, a, length);
mask = _mm_load_si128 (&a128);


// test and copy (without memcpy):

      char tmp[16] = { '\0' };  /* 16 bytes: tmp[15] is tested below */
      int i;
      /* check the bound before reading a[i] to avoid over-reading */
      for (i = 0; i < 16 && a[i]; i++)
          tmp[i] = a[i];

      /* larger than 16B */
      if (tmp[15] != '\0')
        return _ws_mempbrk(s, slen, a);

      mask = _mm_load_si128 ((__m128i *) (void *) tmp);
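
As a self-contained illustration of the test-and-copy idea, the sketch below
fills a zeroed 16-byte block without touching any byte after the needle's NUL.
It leaves out the SSE load so that it builds anywhere. The name load_needles
and its return convention are hypothetical. Like the snippet above, it treats
a needle that fills all 16 bytes as a reason to fall back.

```c
#include <stddef.h>
#include <string.h>

#define NEEDLE_BLOCK 16

/* Copy the needle set a into a zero-padded 16-byte block.
 * Returns the needle length (0..15), or -1 if no NUL was seen within the
 * first 16 bytes and the caller should use the non-SSE fallback. */
static int load_needles(const char *a, unsigned char block[NEEDLE_BLOCK])
{
    size_t i;

    memset(block, 0, NEEDLE_BLOCK);
    /* The bound is checked first, so a[i] is never read past the NUL. */
    for (i = 0; i < NEEDLE_BLOCK && a[i] != '\0'; i++)
        block[i] = (unsigned char)a[i];

    if (i == NEEDLE_BLOCK)
        return -1;
    return (int)i;
}
```

A caller would fall back to _ws_mempbrk(s, slen, a) on -1 and otherwise load
the block into the mask. Note that _mm_load_si128 requires a 16-byte-aligned
source, so the block would need suitable alignment (or _mm_loadu_si128).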

