Wireshark-dev: Re: [Wireshark-dev] Note about proto_tree_add_unicode_string (r43379)

From: Pascal Quantin <pascal.quantin@xxxxxxxxx>
Date: Tue, 19 Jun 2012 21:15:12 +0200
On 19/06/2012 21:14, Pascal Quantin wrote:
> On 19/06/2012 21:01, Jakub Zawadzki wrote:
>> Hi,
>>
>> A string from tvb_get_ephemeral_string() still needs escaping with format_text(),
>> because it doesn't check the encoding.
>>
>> When you use:
>>   tvb_get_ephemeral_string_enc(tvb, offset, length, ENC_UTF_8 | ENC_NA);
>>
>> it guarantees a result encoded in UTF-8:
>>  * string as converted from the appropriate encoding to UTF-8 ...
>>
>> (The code to do that is still only XXX comments, but that is a bug in libwireshark, and no one can blame you for using the wrong function :))
> Hi,
>
> thanks for the hint (and for adding proto_tree_add_unicode_string :) ).
> Still, I am probably missing something, but when looking at the code for
> tvb_get_ephemeral_string_enc, I see:
>     case ENC_ASCII:
>     default:
>         /*
>          * For now, we treat bogus values as meaning
>          * "ASCII" rather than reporting an error,
>          * for the benefit of old dissectors written
>          * when the last argument to proto_tree_add_item()
>          * was a gboolean for the byte order, not an
>          * encoding value, and passed non-zero values
>          * other than TRUE to mean "little-endian".
>          *
>          * XXX - should map all octets with the 8th bit
>          * not set to a "substitute" UTF-8 character.
>          */
>         strbuf = tvb_get_ephemeral_string(tvb, offset, length);
>         break;
>
>     case ENC_UTF_8:
>         /*
>          * XXX - should map all invalid UTF-8 sequences
>          * to a "substitute" UTF-8 character.
>          */
>         strbuf = tvb_get_ephemeral_string(tvb, offset, length);
>         break;
>
> Do you mean we should already start using tvb_get_ephemeral_string_enc
> so that our code keeps working once the check for the ASCII 8th bit is in place?
>
> Regards,
> Pascal.
Forget about it, I just saw your sentence in parentheses :)
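
For illustration, the substitution that the XXX comment for ENC_UTF_8 describes could look roughly like the standalone sketch below. It is not libwireshark code: sanitize_utf8 and utf8_seq_len are hypothetical names, and replacing every offending byte with U+FFFD is an assumed policy, not something the quoted source specifies.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: return the length (1..4) of the well-formed UTF-8
 * sequence starting at s, or 0 if the bytes at s are not valid UTF-8
 * (stray continuation byte, truncated or overlong sequence, surrogate,
 * or code point above U+10FFFF).
 */
static size_t utf8_seq_len(const uint8_t *s, size_t n)
{
    size_t len, i;
    uint32_t cp, min;

    if (n == 0)
        return 0;
    if (s[0] < 0x80)
        return 1;                       /* plain ASCII */

    if ((s[0] & 0xE0) == 0xC0)      { len = 2; cp = s[0] & 0x1F; min = 0x80; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; cp = s[0] & 0x0F; min = 0x800; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; cp = s[0] & 0x07; min = 0x10000; }
    else
        return 0;                       /* not a valid lead byte */

    if (n < len)
        return 0;                       /* sequence is truncated */

    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;                   /* not a continuation byte */
        cp = (cp << 6) | (s[i] & 0x3F);
    }

    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;                       /* overlong, out of range, or surrogate */
    return len;
}

/*
 * Hypothetical helper: copy in[0..in_len) to out, replacing each byte that
 * does not start a valid UTF-8 sequence with U+FFFD (EF BF BD).
 * out must have room for 3 * in_len + 1 bytes. Returns the number of bytes
 * written, not counting the terminating NUL.
 */
size_t sanitize_utf8(const uint8_t *in, size_t in_len, char *out)
{
    size_t i = 0, o = 0;

    while (i < in_len) {
        size_t len = utf8_seq_len(in + i, in_len - i);
        if (len == 0) {
            memcpy(out + o, "\xEF\xBF\xBD", 3);  /* substitute character */
            o += 3;
            i += 1;
        } else {
            memcpy(out + o, in + i, len);        /* valid sequence, keep as-is */
            o += len;
            i += len;
        }
    }
    out[o] = '\0';
    return o;
}
```

The ASCII case would follow the same pattern with a simpler check: any octet with the 8th bit set gets replaced. Until something like this exists inside tvb_get_ephemeral_string_enc, the result still has to go through format_text() before display, as Jakub said.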

Regards,
Pascal.