Martijn Schipper wrote:
I have created a dissector for a protocol and one of the fields is UTF-8 
encoded. What should I do to display this field in the tree?
If you mean "what should I do to display all the characters in it 
correctly", the answer is "change Ethereal's handling of strings to 
allow a character encoding to be specified with the string, and add 
UTF-8 as one of the valid encodings".  With such a change, the set of 
encodings should ultimately include:
	ASCII, meaning "display anything with the 8th bit set, as well as all 
control characters, as an escaped character";
	UTF-8;
	16-bit Unicode (big-endian and little-endian);
	various PC OEM character sets;
	various classic Mac OS character sets (OS X's native encoding is UTF-8, 
but the earlier versions might've used MacRoman, etc.);
	EBCDIC;
	ISO 8859/x;
	various EUC character sets;
	various other encodings (KOI-8, Shift-JIS, 
GBwhatever-that-Chinese-encoding-is, etc.).
Note that iconv isn't necessarily the answer, as we can't guarantee that 
the iconv implementation on a given platform will support all the 
character sets that Ethereal would need (it's not a question of what 
character sets the machine running Ethereal uses, because it has to deal 
with the character sets that the machines that transmitted the packets 
Ethereal is reading used).  Perhaps incorporating a copy of GNU iconv 
into Ethereal, and having our own tables for character encodings, would 
be the answer.
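
To make the "our own tables" idea concrete, here is a minimal sketch of a table-driven converter from a single-byte character set to UTF-8. Everything in it (the charset_table type, init_latin1(), init_latin9(), put_utf8(), table_to_utf8()) is a hypothetical name made up for illustration; none of it is existing Ethereal code, and it only handles single-byte encodings whose characters are in the Basic Multilingual Plane.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-encoding table: the Unicode code point for each byte. */
typedef struct {
    const char *name;
    uint16_t    to_unicode[256];
} charset_table;

/* ISO 8859-1: every byte value maps to the code point of the same value. */
void init_latin1(charset_table *t)
{
    int i;

    t->name = "ISO-8859-1";
    for (i = 0; i < 256; i++)
        t->to_unicode[i] = (uint16_t)i;
}

/* ISO 8859-15: ISO 8859-1 with eight positions reassigned. */
void init_latin9(charset_table *t)
{
    init_latin1(t);
    t->name = "ISO-8859-15";
    t->to_unicode[0xA4] = 0x20AC;   /* euro sign */
    t->to_unicode[0xA6] = 0x0160;   /* S with caron */
    t->to_unicode[0xA8] = 0x0161;   /* s with caron */
    t->to_unicode[0xB4] = 0x017D;   /* Z with caron */
    t->to_unicode[0xB8] = 0x017E;   /* z with caron */
    t->to_unicode[0xBC] = 0x0152;   /* OE ligature */
    t->to_unicode[0xBD] = 0x0153;   /* oe ligature */
    t->to_unicode[0xBE] = 0x0178;   /* Y with diaeresis */
}

/* Store one BMP code point as UTF-8; returns the number of bytes (1 to 3). */
size_t put_utf8(uint16_t cp, char *out)
{
    if (cp < 0x80) {
        out[0] = (char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (char)(0xC0 | (cp >> 6));
        out[1] = (char)(0x80 | (cp & 0x3F));
        return 2;
    }
    out[0] = (char)(0xE0 | (cp >> 12));
    out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (char)(0x80 | (cp & 0x3F));
    return 3;
}

/*
 * Convert len bytes in the table's encoding to NUL-terminated UTF-8.
 * out must have room for 3*len + 1 bytes.  Returns the number of bytes
 * written, not counting the terminating NUL.
 */
size_t table_to_utf8(const charset_table *t, const unsigned char *in,
                     size_t len, char *out)
{
    size_t i, n = 0;

    for (i = 0; i < len; i++)
        n += put_utf8(t->to_unicode[in[i]], out + n);
    out[n] = '\0';
    return n;
}
```

The point of the design is that supporting another single-byte character set means adding another table, with no dependence on what the platform's iconv happens to support.  Multi-byte encodings (EUC, Shift-JIS, etc.) would need more than a 256-entry table.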
Note also that to display, print, etc. these characters you have to deal 
with:
	GTK+ 1.2[.x], which expects text in whatever the encoding is for the 
font being used;
	GTK+ 1.3[.x] and 2.x, which expect UTF-8 text;
	formatting to a text file, which, on UN*X, should probably generate 
text in whatever the encoding is for the user's locale, and on Windows, 
should probably - what?  ASCII?  16-bit Unicode?  If 16-bit Unicode, how 
can it tag the file as such, so that Windows text editors can handle it? 
Begin the file with a byte-order mark?
	printing to a printer.
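
For the byte-order mark possibility raised above, a minimal sketch of what "begin the file with a byte-order mark" would mean for little-endian 16-bit Unicode follows.  It doesn't settle whether that is the right choice on Windows; it only shows the mechanics.  The function names (put_u16le(), utf16le_with_bom()) are hypothetical, not Ethereal routines, and the input is assumed to be an array of already-decoded Unicode code points.

```c
#include <stddef.h>
#include <stdint.h>

/* Store one 16-bit unit in little-endian order; returns 2. */
size_t put_u16le(uint16_t u, unsigned char *out)
{
    out[0] = (unsigned char)(u & 0xFF);
    out[1] = (unsigned char)(u >> 8);
    return 2;
}

/*
 * Encode count code points as UTF-16LE, preceded by a byte-order mark
 * (U+FEFF, which comes out as the bytes FF FE in little-endian order).
 * out must have room for 2 + 4*count bytes.  Returns the number of
 * bytes written.
 */
size_t utf16le_with_bom(const uint32_t *cps, size_t count, unsigned char *out)
{
    size_t i, n = 0;

    n += put_u16le(0xFEFF, out + n);            /* the byte-order mark */
    for (i = 0; i < count; i++) {
        uint32_t cp = cps[i];

        if (cp >= 0x10000) {
            /* Outside the BMP: split into a surrogate pair. */
            cp -= 0x10000;
            n += put_u16le((uint16_t)(0xD800 | (cp >> 10)), out + n);
            n += put_u16le((uint16_t)(0xDC00 | (cp & 0x3FF)), out + n);
        } else {
            n += put_u16le((uint16_t)cp, out + n);
        }
    }
    return n;
}
```

A reader that sees FF FE at the start of the file can infer little-endian 16-bit Unicode; a big-endian writer would emit FE FF instead.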
If, however, you are willing to live with only ASCII characters being 
displayed correctly, then, if you're adding the strings as fields, 
Ethereal should properly escape non-ASCII characters, and if you're 
explicitly formatting with "proto_tree_add_text()" or 
"proto_tree_add_XXX_format()", use "format_text()" or 
"tvb_format_text()" with "%s" format items (which is what people should 
be doing *anyway*, to keep non-printable characters from screwing things 
up).
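
To illustrate the kind of escaping meant here, the following is a standalone sketch, not the actual "format_text()" implementation: escape_non_ascii() is a hypothetical stand-in that turns control characters and bytes with the 8th bit set into backslash-octal escapes, and doubles backslashes so the result is unambiguous.  The exact escape style Ethereal uses may differ.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Copy len bytes from in to out, replacing anything that is not a
 * printable ASCII character with a backslash followed by three octal
 * digits.  out must have room for 4*len + 1 bytes.  Returns the length
 * of the result, not counting the terminating NUL.
 */
size_t escape_non_ascii(const unsigned char *in, size_t len, char *out)
{
    size_t i, n = 0;

    for (i = 0; i < len; i++) {
        unsigned char c = in[i];

        if (c == '\\') {
            /* Double the backslash so escapes can be told apart. */
            out[n++] = '\\';
            out[n++] = '\\';
        } else if (c >= 0x20 && c < 0x7F) {
            /* Printable ASCII passes through unchanged. */
            out[n++] = (char)c;
        } else {
            /* Control character or 8th bit set: octal escape. */
            n += (size_t)sprintf(out + n, "\\%03o", (unsigned int)c);
        }
    }
    out[n] = '\0';
    return n;
}
```

In a dissector the idea is the same: hand the already-escaped string to a "%s" item rather than the raw bytes, roughly along the lines of proto_tree_add_text(tree, tvb, offset, len, "Name: %s", tvb_format_text(tvb, offset, len)), where "Name" is just a placeholder label.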