URL: https://code.wireshark.org/review/gitweb?p=wireshark.git;a=commit;h=4355b4901edda1b1f73db754852f8b601e22f0f3
Submitter: "Peter Wu <peter@xxxxxxxxxxxxx>"
Changed: branch: master-3.0
Repository: wireshark
Commits:
4355b49 by Peter Wu (peter@xxxxxxxxxxxxx):
Fix crash when using the "matches" operator on non-UTF-8 data
GRegex is a thin wrapper around PCRE. Inputs (patterns and subjects) are
assumed to be UTF-8 by default (unless G_REGEX_RAW is set). If the
subject is not valid UTF-8, normally pcre_exec will immediately return a
failure. However, as GLib sets PCRE_NO_UTF8_CHECK when G_REGEX_RAW is not
given, pcre_exec() will skip the safety check and crash instead.
Fix this by always assuming raw byte patterns. Regression risk: patterns
such as `ö.ï` will no longer match `öñï` since `ñ` is a multi-byte
sequence. Patterns such as `(GET|POST) /` remain functional though.
Bug: 14905
Change-Id: I6450bb83f565d377f82a5dbb01690c5f49acd96f
Reviewed-on: https://code.wireshark.org/review/31935
Petri-Dish: Peter Wu <peter@xxxxxxxxxxxxx>
Tested-by: Petri Dish Buildbot
Reviewed-by: Anders Broman <a.broman58@xxxxxxxxx>
(cherry picked from commit 0ca65a66f425c8beaa1af3deb3b84c2b16cffb55)
Reviewed-on: https://code.wireshark.org/review/31965
Reviewed-by: Peter Wu <peter@xxxxxxxxxxxxx>
Actions performed:
from 4ed8189 CMake: clear cache variables when a library has changed
add 4355b49 Fix crash when using the "matches" operator on non-UTF-8 data
Summary of changes:
epan/ftypes/ftype-bytes.c | 24 ------------------------
epan/ftypes/ftype-pcre.c | 38 ++++++++++----------------------------
2 files changed, 10 insertions(+), 52 deletions(-)
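
The regression risk described in the commit message can be illustrated with a small sketch. This is not Wireshark or GLib code: it uses Python's `re` module as a stand-in, on the assumption that matching `bytes` patterns against `bytes` subjects behaves like a raw-byte (G_REGEX_RAW) regex, where `.` consumes exactly one byte. The variable names are made up for the example.

```python
import re

# Raw-byte matching: pattern and subject are both bytes, so "." matches
# exactly one byte. The subject is the UTF-8 encoding of o-umlaut,
# n-tilde, i-diaeresis (two bytes per character).
subject = "\u00f6\u00f1\u00ef".encode("utf-8")
raw_pattern = "\u00f6.\u00ef".encode("utf-8")
raw_match = re.search(raw_pattern, subject)  # "." cannot span the 2-byte n-tilde

# Character-aware matching, for comparison: "." matches one code point.
text_match = re.search("\u00f6.\u00ef", "\u00f6\u00f1\u00ef")

# ASCII-only patterns are unaffected by raw-byte matching, even when the
# subject contains bytes that are not valid UTF-8.
request = b"GET /index.html\xff\xfe"
ascii_match = re.search(rb"(GET|POST) /", request)

print(raw_match)
print(text_match is not None)
print(ascii_match.group(1))
```

In this sketch the raw-byte search finds nothing because the single `.` would have to cover both bytes of the middle character, while the ASCII pattern still matches a subject that is not valid UTF-8.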