Hi,
if I have a tvbuff that starts with 0x86 and I call

    a = tvb_get_string_enc(tvb, 0, ENC_ASCII)
    proto_tree_add_string(..., a);

I can trigger the DISSECTOR_ASSERT, since a is not a valid Unicode string.
Comments in the code suggest that tvb_get_string() should replace
chars >= 0x80 with the Unicode replacement character (U+FFFD), which is
three bytes long when encoded as UTF-8.
This would look like

    guint8 *
    tvb_get_string(wmem_allocator_t *scope, tvbuff_t *tvb, gint offset, gint length)
    {
        wmem_strbuf_t *str;

        tvb_ensure_bytes_exist(tvb, offset, length);
        str = wmem_strbuf_new(scope, "");

        while (length > 0) {
            guint8 ch = tvb_get_guint8(tvb, offset);

            if (ch < 0x80)
                wmem_strbuf_append_c(str, ch);
            else {
                wmem_strbuf_append_unichar(str, UNREPL);
            }
            offset++;
            length--;
        }
        wmem_strbuf_append_c(str, '\0');

        return (guint8 *) wmem_strbuf_get_str(str);
    }

The resulting string would still contain len+1 chars but not necessarily
len+1 bytes. Would that be a problem, i.e. is it OK to do something like

    b = tvb_get_string(NULL, tvb, offset, len_b);
    copy_of_b = g_malloc(len_b+1);
    memcpy(copy_of_b, b, len_b+1);
?
If that pattern is supposed to keep working, we'd need a separate function
that gets the string and replaces 8-bit chars.
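
To make the length mismatch concrete, here is a minimal standalone sketch in
plain C (no Wireshark headers). get_string_replace_8bit() is a hypothetical
stand-in for the proposed function, not an existing API, and it uses malloc
instead of wmem purely so that it compiles on its own. It only illustrates
that the byte length of the result can exceed the input length.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* UTF-8 encoding of U+FFFD REPLACEMENT CHARACTER: three bytes. */
static const unsigned char REPL_UTF8[3] = { 0xEF, 0xBF, 0xBD };

/*
 * Hypothetical stand-in for the proposed tvb_get_string(): copies len
 * bytes from src and replaces every byte >= 0x80 with U+FFFD.
 * Returns a malloc'ed, NUL-terminated string (NULL on allocation
 * failure); the caller frees it.
 */
static char *
get_string_replace_8bit(const unsigned char *src, size_t len)
{
    char *out;
    size_t i, n = 0;

    assert(src != NULL || len == 0);

    /* Worst case: every input byte grows to three bytes, plus the NUL. */
    out = malloc(len * 3 + 1);
    if (out == NULL)
        return NULL;

    for (i = 0; i < len; i++) {
        if (src[i] < 0x80) {
            out[n++] = (char)src[i];
        } else {
            memcpy(out + n, REPL_UTF8, sizeof REPL_UTF8);
            n += sizeof REPL_UTF8;
        }
    }
    out[n] = '\0';
    return out;
}
```

With the input bytes { 0x86, 'a', 'b' } (len 3) the result occupies 5 bytes
plus the terminator, so memcpy(copy, s, len+1) would copy only the first 4
bytes and leave the copy truncated and unterminated. A caller that wants a
copy would have to size it from the result (strlen(s)+1, or g_strdup() in
GLib code) rather than from the input length.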
Thoughts?
Martin