Wireshark-commits: [Wireshark-commits] master ed0741f: fix-encoding-args.pl: fix terrible performance with large files

From: Wireshark code review <code-review-do-not-reply@xxxxxxxxxxxxx>
Date: Sat, 22 Sep 2018 15:44:27 +0000
URL: https://code.wireshark.org/review/gitweb?p=wireshark.git;a=commit;h=ed0741ffbd87d113033e0b1d9a4ac80c93a1f1e7
Submitter: "Anders Broman <a.broman58@xxxxxxxxx>"
Changed: branch: master
Repository: wireshark

Commits:

ed0741f by Peter Wu (peter@xxxxxxxxxxxxx):

    fix-encoding-args.pl: fix terrible performance with large files
    
    "fix-encoding-args.pl epan/dissectors/packet-ieee80211.c" used to take
    over 12 seconds to complete. After this change it is reduced to 400ms.
    Profiling with Devel::NYTProf showed two issues:
    - find_hf_array_entries (5 seconds): matching leading whitespace
      triggers a candidate match against every line. Fix this by removing
      whitespace prior to matching.
    - fix_encoding_args_by_hf_type (7.5 seconds): executing 2131 different
      substitution patterns is slow. Fix this by grouping field names and
      executing the substitution only once per group afterwards (6 calls in
      total).
    
    packet-rrc.c is by far the largest file with 215k lines; this used to
    take forever (321s) and now completes in 1.3s.
    
    Regression tested by removing "ENC_ASCII" and "ENC_UTF_8" in
    dissect_venue_name_info; the expected warnings are still visible.
    
    Change-Id: I071038e8fcb56474ac41223568ce6724258c059d
    Reviewed-on: https://code.wireshark.org/review/29789
    Petri-Dish: Peter Wu <peter@xxxxxxxxxxxxx>
    Tested-by: Petri Dish Buildbot
    Reviewed-by: Anders Broman <a.broman58@xxxxxxxxx>
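
The two fixes described in the commit message can be illustrated with a small sketch. It is written in Python rather than the Perl of fix-encoding-args.pl, and it is not the code from the commit: the function names, the simplified "{ &hf_xxx," entry shape, the proto_tree_add_item() call shape, and the idea of keying the groups by target encoding are all assumptions made for illustration. The first function strips whitespace before an anchored match instead of letting the pattern match leading whitespace; the second builds one alternation per group of field names and runs a single substitution per group instead of one per field.

```python
import re


def find_hf_names(source: str) -> list[str]:
    """Collect hf_ names from lines shaped like '{ &hf_xxx, ...' (assumed shape).

    Each line is stripped first, so the pattern is anchored at the start of
    the text and needs no leading-whitespace term.
    """
    entry = re.compile(r"\{\s*&\s*(hf_\w+)\s*,")
    names = []
    for line in source.splitlines():
        m = entry.match(line.strip())  # match() anchors at position 0
        if m:
            names.append(m.group(1))
    return names


def fix_encoding(source: str, names_by_encoding: dict[str, list[str]]) -> str:
    """Rewrite the last argument of proto_tree_add_item() calls.

    names_by_encoding is a hypothetical grouping: target encoding -> field
    names. One compiled pattern and one substitution pass is used per group,
    not per field name.
    """
    for encoding, names in names_by_encoding.items():
        if not names:
            continue
        alternation = "|".join(re.escape(name) for name in names)
        pattern = re.compile(
            r"(proto_tree_add_item\s*\(\s*\w+\s*,\s*(?:" + alternation + r")\s*,"
            r"[^;]*?,\s*)"      # group 1: everything up to the last argument
            r"(\w+)"            # group 2: the current encoding argument
            r"(\s*\)\s*;)"      # group 3: closing parenthesis and semicolon
        )
        source = pattern.sub(
            lambda m, enc=encoding: m.group(1) + enc + m.group(3), source
        )
    return source


if __name__ == "__main__":
    code = (
        "proto_tree_add_item(tree, hf_a, tvb, 0, 4, ENC_NA);\n"
        "proto_tree_add_item(tree, hf_b, tvb, 4, 2, ENC_NA);\n"
    )
    print(fix_encoding(code, {"ENC_BIG_ENDIAN": ["hf_a"]}))
```

With thousands of field names, the per-group approach scans the file a handful of times rather than once per name, which is the effect the commit message attributes to the change (2131 substitutions reduced to 6 calls). Actual timings depend on the regex engine and are not reproduced by this sketch.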
    

Actions performed:

    from  557649f   TFTP: Use a GByteArray.
     add  ed0741f   fix-encoding-args.pl: fix terrible performance with large files


Summary of changes:
 tools/fix-encoding-args.pl | 31 +++++++++++++++++++------------
 1 file changed, 19 insertions(+), 12 deletions(-)