On 19/06/2012 21:01, Jakub Zawadzki wrote:
> Hi,
>
> A string from tvb_get_ephemeral_string() still needs escaping with format_text(),
> because it doesn't check the encoding.
>
> When you use:
> tvb_get_ephemeral_string_enc(tvb, offset, length, ENC_UTF_8 | ENC_NA);
>
> It guarantees a result encoded in UTF-8:
> * string as converted from the appropriate encoding to UTF-8 ...
>
> (The code to do that is still only in XXX comments, but that is a bug in libwireshark, and no one can blame you for using the wrong function :))
Hi,
Thanks for the hint (and for adding proto_tree_add_unicode_string :) ).
I'm probably still missing something, but when I look at the code for
tvb_get_ephemeral_string_enc, I see:
    case ENC_ASCII:
    default:
        /*
         * For now, we treat bogus values as meaning
         * "ASCII" rather than reporting an error,
         * for the benefit of old dissectors written
         * when the last argument to proto_tree_add_item()
         * was a gboolean for the byte order, not an
         * encoding value, and passed non-zero values
         * other than TRUE to mean "little-endian".
         *
         * XXX - should map all octets with the 8th bit
         * not set to a "substitute" UTF-8 character.
         */
        strbuf = tvb_get_ephemeral_string(tvb, offset, length);
        break;
    case ENC_UTF_8:
        /*
         * XXX - should map all invalid UTF-8 sequences
         * to a "substitute" UTF-8 character.
         */
        strbuf = tvb_get_ephemeral_string(tvb, offset, length);
        break;
Do you mean we should start using tvb_get_ephemeral_string_enc now, so that
our code keeps working once the check for the ASCII 8th bit is in place?
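
For concreteness, here is a minimal standalone sketch of what I understand
the ASCII check to mean: every octet with the 8th bit set is replaced by a
substitute character. It assumes the substitute is U+FFFD (EF BF BD in
UTF-8), and the helper name ascii_to_utf8_substitute is hypothetical; it is
not a libwireshark function and uses plain malloc() rather than ephemeral
memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not part of libwireshark: copy `len` octets
 * from `in`, replacing every octet that has the 8th bit set with
 * U+FFFD REPLACEMENT CHARACTER (assumed substitute; UTF-8: EF BF BD).
 * Octets below 0x80, including embedded NULs, are copied unchanged.
 * Returns a NUL-terminated malloc()ed buffer, or NULL if allocation fails.
 */
char *ascii_to_utf8_substitute(const unsigned char *in, size_t len)
{
    char *out, *p;
    size_t i;

    assert(in != NULL || len == 0);

    /* Worst case: every octet expands to 3 bytes, plus the final NUL. */
    out = malloc(len * 3 + 1);
    if (out == NULL)
        return NULL;

    p = out;
    for (i = 0; i < len; i++) {
        if (in[i] & 0x80) {
            /* Not valid ASCII: emit the substitute character. */
            *p++ = (char)0xEF;
            *p++ = (char)0xBF;
            *p++ = (char)0xBD;
        } else {
            *p++ = (char)in[i];
        }
    }
    *p = '\0';
    return out;
}
```

With something like this in place, the result would always be valid UTF-8,
so a caller would no longer depend on format_text() to hide stray 8-bit
octets.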
Regards,
Pascal.