Wireshark-bugs: [Wireshark-bugs] [Bug 10681] UTF8 replacement characters in FT_STRINGs are escap

Date: Tue, 11 Nov 2014 21:38:44 +0000

changed bug 10681


           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |CONFIRMED
                 CC|                            |[email protected]
            Summary|HTTP URI is read from the   |UTF8 replacement characters
                   |Content-Type field          |in FT_STRINGs are escaped
                   |                            |for presentation
     Ever confirmed|                            |1

Comment # 1 on bug 10681 from
Actually the complete request_uri is:

\357\277\275[&0\357\277\275{\357\277\275\357\277\275R\357\277\275\357\277\275\357\277\275,\357\277\275\357\277\275k\357\277\275\357\277\275\357\277

Note the repeating pattern of \357\277\275 (0xEFBFBD).

That's not data from the packet; that's the UTF-8 encoding of the Unicode
replacement character (U+FFFD).
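
To illustrate where that pattern can come from, here is a minimal sketch in plain C. It is not Wireshark code: ascii_to_utf8() and escape_non_printable() are hypothetical helpers, and the assumption is that an ASCII decoder substitutes U+FFFD for every byte >= 0x80 and that the display code then escapes each non-printable byte in octal.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not Wireshark code): decode `len` raw bytes as
 * 7-bit ASCII and write the result to `out` as UTF-8. Any byte >= 0x80
 * is not valid ASCII and is replaced by U+FFFD, whose UTF-8 encoding
 * is the three bytes EF BF BD.
 * `out` must hold at least 3 * len + 1 bytes. Returns bytes written.
 * Note: an embedded NUL byte in the input would end the output string.
 */
static size_t ascii_to_utf8(const unsigned char *in, size_t len, char *out)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        if (in[i] < 0x80) {
            out[n++] = (char)in[i];
        } else {
            out[n++] = (char)0xEF;
            out[n++] = (char)0xBF;
            out[n++] = (char)0xBD;
        }
    }
    out[n] = '\0';
    return n;
}

/*
 * Hypothetical helper: escape every byte outside printable ASCII as a
 * three-digit octal escape, the way a byte-oriented formatter would.
 * `out` must hold at least 4 * strlen(in) + 1 bytes.
 */
static void escape_non_printable(const char *in, char *out)
{
    size_t n = 0;
    for (const unsigned char *p = (const unsigned char *)in; *p; p++) {
        if (*p >= 0x20 && *p < 0x7F) {
            out[n++] = (char)*p;
        } else {
            n += (size_t)sprintf(out + n, "\\%03o", (unsigned)*p);
        }
    }
    out[n] = '\0';
}
```

Feeding a few non-ASCII bytes through ascii_to_utf8() and then through escape_non_printable() yields a \357\277\275 group for each byte that was replaced, with the genuine ASCII characters left untouched in between.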

Changing this line in packet-http.c to use ENC_UTF_8 instead of ENC_ASCII:

        /* Save the request URI for various later uses */
        request_uri = tvb_get_string_enc(wmem_packet_scope(), tvb, offset,
                                         tokenlen, ENC_UTF_8);

fixes it, but I don't think that's the right fix (is HTTP supposed to be ASCII
or UTF-8?).  I'd guess the real fix is for the formatting routines to realize
this string has already been encoded as UTF-8 and to display it as such (I
thought the UI was UTF-8-ready).
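
As a rough sketch of what "realizing the string is already UTF-8" could look like, the plain C below passes well-formed multi-byte UTF-8 sequences through unchanged and escapes only ASCII control characters and invalid bytes. This is an assumption about one possible approach, not the actual Wireshark formatting code; utf8_seq_len() and escape_utf8_aware() are hypothetical names.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: return the length (1..4) of the well-formed
 * UTF-8 sequence starting at p, or 0 if the bytes are not valid UTF-8.
 * p must point into a NUL-terminated string; a NUL is never a valid
 * continuation byte, so the loop stops before reading past the end.
 */
static size_t utf8_seq_len(const unsigned char *p)
{
    /* Smallest code point allowed for each sequence length. */
    static const unsigned long min_cp[] = { 0, 0, 0x80, 0x800, 0x10000 };
    size_t len;
    unsigned long cp;

    if (p[0] < 0x80) {
        return 1;
    } else if ((p[0] & 0xE0) == 0xC0) {
        len = 2; cp = p[0] & 0x1F;
    } else if ((p[0] & 0xF0) == 0xE0) {
        len = 3; cp = p[0] & 0x0F;
    } else if ((p[0] & 0xF8) == 0xF0) {
        len = 4; cp = p[0] & 0x07;
    } else {
        return 0;   /* stray continuation byte or invalid lead byte */
    }
    for (size_t i = 1; i < len; i++) {
        if ((p[i] & 0xC0) != 0x80)
            return 0;
        cp = (cp << 6) | (p[i] & 0x3F);
    }
    /* Reject overlong forms, surrogates, and values above U+10FFFF. */
    if (cp < min_cp[len] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    return len;
}

/*
 * Hypothetical helper: copy `in` to `out`, keeping printable ASCII and
 * well-formed multi-byte UTF-8 as-is, and octal-escaping ASCII control
 * characters and invalid bytes.
 * `out` must hold at least 4 * strlen(in) + 1 bytes.
 */
static void escape_utf8_aware(const char *in, char *out)
{
    size_t n = 0;
    const unsigned char *p = (const unsigned char *)in;

    while (*p) {
        size_t len = utf8_seq_len(p);
        if (len > 1 || (len == 1 && *p >= 0x20 && *p != 0x7F)) {
            for (size_t i = 0; i < len; i++)
                out[n++] = (char)p[i];
            p += len;
        } else {
            n += (size_t)sprintf(out + n, "\\%03o", (unsigned)*p);
            p++;
        }
    }
    out[n] = '\0';
}
```

With a formatter along these lines, a U+FFFD already present in the string would reach the UI as the three bytes EF BF BD rather than as \357\277\275, while a lone 0xFF byte would still be shown escaped.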

