On Jan 20, 2014, at 1:49 PM, Martin Kaiser <lists@xxxxxxxxx> wrote:
> I committed the change to tvb_get_string() in r54864.
I've changed that so it does *not* map bytes with the 8th bit set to REPLACEMENT CHARACTER for UTF-8 strings. UTF-8 strings need a more complicated check that maps invalid octet sequences, rather than individual bytes, to REPLACEMENT CHARACTER. (UCS-2, UTF-16, and UCS-4 need some additional handling as well.)
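For illustration, here is a rough sketch of what such a UTF-8 check could look like. It is not the Wireshark code; `utf8_seq_len_sketch()` and `utf8_sanitize_sketch()` are hypothetical names, and the policy of emitting one REPLACEMENT CHARACTER per offending octet is an assumption (the Unicode standard recommends a slightly different "maximal subpart" policy). The byte ranges follow RFC 3629, so overlong forms, surrogates, and values above U+10FFFF are rejected.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: return the length (1..4) of the well-formed UTF-8
 * sequence starting at p, or 0 if the octets at p are not well-formed.
 * Ranges are those of RFC 3629. */
static size_t utf8_seq_len_sketch(const unsigned char *p, size_t avail)
{
    unsigned char lo = 0x80, hi = 0xBF; /* allowed range of 2nd octet */
    size_t need;

    if (avail == 0)
        return 0;
    if (p[0] < 0x80)
        return 1;
    if (p[0] >= 0xC2 && p[0] <= 0xDF) {
        need = 2;
    } else if (p[0] >= 0xE0 && p[0] <= 0xEF) {
        need = 3;
        if (p[0] == 0xE0) lo = 0xA0;    /* reject overlong forms */
        if (p[0] == 0xED) hi = 0x9F;    /* reject surrogates */
    } else if (p[0] >= 0xF0 && p[0] <= 0xF4) {
        need = 4;
        if (p[0] == 0xF0) lo = 0x90;    /* reject overlong forms */
        if (p[0] == 0xF4) hi = 0x8F;    /* reject > U+10FFFF */
    } else {
        return 0;                       /* 80..C1 and F5..FF never lead */
    }
    if (avail < need || p[1] < lo || p[1] > hi)
        return 0;
    for (size_t i = 2; i < need; i++)
        if (p[i] < 0x80 || p[i] > 0xBF)
            return 0;
    return need;
}

/* Hypothetical helper: copy len octets into a newly allocated,
 * NUL-terminated string, replacing each octet that does not start a
 * well-formed sequence with U+FFFD (EF BF BD). Caller frees the result. */
char *utf8_sanitize_sketch(const unsigned char *src, size_t len)
{
    char *out = malloc(len * 3 + 1);    /* worst case: all octets invalid */
    size_t i = 0, o = 0;

    if (out == NULL)
        return NULL;
    while (i < len) {
        size_t n = utf8_seq_len_sketch(src + i, len - i);
        if (n == 0) {
            out[o++] = (char)0xEF;
            out[o++] = (char)0xBF;
            out[o++] = (char)0xBD;
            i++;                        /* skip just the offending octet */
        } else {
            while (n-- > 0)
                out[o++] = (char)src[i++];
        }
    }
    out[o] = '\0';
    return out;
}
```

With this sketch, a valid "é" (C3 A9) passes through unchanged, while a lone FF or the overlong pair C0 80 comes out as REPLACEMENT CHARACTERs.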
tvb_get_string() still treats the string as ASCII.
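To make the ASCII case concrete, a minimal sketch of the mapping follows. This is not the actual tvb_get_string() implementation; `ascii_to_utf8_sketch()` is a hypothetical stand-alone function that works on a plain buffer rather than a tvbuff, and it assumes the result is wanted as a malloc'ed UTF-8 string.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper, not the Wireshark API: copy len bytes that are
 * supposed to be ASCII into a newly allocated, NUL-terminated UTF-8
 * string. Any byte with the 8th bit set is replaced by U+FFFD
 * REPLACEMENT CHARACTER, which is the three octets EF BF BD in UTF-8.
 * The caller frees the result. */
char *ascii_to_utf8_sketch(const unsigned char *src, size_t len)
{
    /* Worst case: every input byte expands to three octets. */
    char *out = malloc(len * 3 + 1);
    size_t o = 0;

    if (out == NULL)
        return NULL;

    for (size_t i = 0; i < len; i++) {
        if (src[i] & 0x80) {
            /* Not ASCII: emit U+FFFD instead of the raw byte. */
            out[o++] = (char)0xEF;
            out[o++] = (char)0xBF;
            out[o++] = (char)0xBD;
        } else {
            out[o++] = (char)src[i];
        }
    }
    out[o] = '\0';
    return out;
}
```

Given the three input bytes 'a', E9, 'b', the sketch yields 'a', EF BF BD, 'b', so the output is always valid UTF-8 even when the packet data is not ASCII.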
> I'll have a look at tvb_get_stringz() tomorrow.
I've added that (with the same change *not* to do it for UTF-8 strings). tvb_get_stringz() treats the string as ASCII.