Wireshark-commits: [Wireshark-commits] master f3b6316: Use a strong hash function for ethernet addr

From: Wireshark code review <code-review-do-not-reply@xxxxxxxxxxxxx>
Date: Wed, 7 May 2014 09:00:31 +0000 (UTC)
URL: https://code.wireshark.org/review/gitweb?p=wireshark.git;a=commit;h=f3b631668bc35db2acac25d9c537f02d6c143e15
Submitter: Anders Broman (a.broman58@xxxxxxxxx)
Changed: branch: master
Repository: wireshark

Commits:

f3b6316 by Evan Huus (eapache@xxxxxxxxx):

    Use a strong hash function for ethernet addresses.
    
    The capture for bug 10078 caused the buildbot to time out; callgrind revealed an
    enormous amount of time being spent looking up ethernet addresses. The previous
    code cast each address (6 bytes) to a guint64 (8 bytes) then used the built-in
    g_int64_hash. Unfortunately, g_int64_hash is an *awful* hash function - it
    produces a 4-byte hash by simply discarding the upper 4 bytes of its input.
    
    For the capture file in question this strategy (which effectively ignores the
    upper two bytes of each ethernet address) produced an astounding number of
    collisions, leading to the terrible running time.
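
The collision pattern described above can be reproduced with a minimal sketch. It assumes the 6-byte address is packed into the low 48 bits of a 64-bit integer with the first byte most significant. Both eth_to_u64 and truncating_hash are hypothetical names for illustration: truncating_hash is a stand-in that models the "discard the upper 4 bytes" behaviour the message attributes to g_int64_hash, and neither function is the actual addr_resolv.c or GLib code.

```c
#include <stdint.h>

/* Hypothetical helper (not from addr_resolv.c): pack a 6-byte ethernet
 * address into the low 48 bits of a 64-bit key, first byte most
 * significant. The packing order is an assumption. */
static uint64_t eth_to_u64(const uint8_t addr[6])
{
    uint64_t v = 0;
    for (int i = 0; i < 6; i++)
        v = (v << 8) | addr[i];
    return v;
}

/* Stand-in for the behaviour described in the commit message: produce a
 * 32-bit hash by keeping only the low 4 bytes of the 64-bit key. */
static uint32_t truncating_hash(uint64_t key)
{
    return (uint32_t)key;
}
```

Under these assumptions, 00:11:22:33:44:55 and 01:11:22:33:44:55 are distinct keys but both hash to 0x22334455, because only the last four address bytes survive the truncation. Any set of addresses sharing those four bytes ends up in the same bucket.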
    
    Use wmem_strong_hash directly on the 6-byte address instead, which saves us a
    bunch of useless casting and bit-twiddling and produces a much better hash
    distribution. This shaves 20% off the time to tshark-with-tree the capture file
    in question *despite* a substantially more expensive hash function
    (wmem_strong_hash is not exactly fast compared to g_int64_hash).
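
The idea of hashing all six address bytes directly can be sketched as follows. This is not the wmem_strong_hash implementation, which this message does not show; 32-bit FNV-1a is swapped in here purely as a simple, well-known byte-wise hash, and fnv1a_hash is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a over an arbitrary byte buffer. Used here only as an
 * illustrative byte-wise hash; it is NOT wmem_strong_hash. */
static uint32_t fnv1a_hash(const uint8_t *buf, size_t len)
{
    uint32_t h = 2166136261u;      /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= buf[i];               /* mix in one byte */
        h *= 16777619u;            /* FNV prime */
    }
    return h;
}
```

Because every byte feeds into the result, two addresses that differ only in their first byte, such as 00:11:22:33:44:55 and 01:11:22:33:44:55, get different hash values when called as fnv1a_hash(addr, 6). The per-lookup cost is higher than a simple truncation, which matches the trade-off the message describes: a more expensive hash that still wins overall by avoiding long collision chains.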
    
    Bug:10078
    Change-Id: I8e81cbc478e6394ec3a8efe39eec08f680a55609
    Reviewed-on: https://code.wireshark.org/review/1543
    Reviewed-by: Anders Broman <a.broman58@xxxxxxxxx>
    

Actions performed:

    from  b07195a   Fix a typo.
    adds  f3b6316   Use a strong hash function for ethernet addresses.


Summary of changes:
 epan/addr_resolv.c |  155 ++++++++--------------------------------------------
 1 file changed, 23 insertions(+), 132 deletions(-)