Guy Harris wrote:
On Oct 25, 2010, at 1:19 PM, Jeff Morriss wrote:
I noticed this weekend that there's a bunch of non-ASCII characters in
the manuf file.
Non-UTF-8, or non-ASCII? Non-UTF-8 text won't work well in the GUI, nor in TShark output unless your locale uses that same encoding as its text encoding. UTF-8 *should* work in the GUI with GTK+ 2.x and later (if not, I'd argue there's a bug somewhere), but it might not work in TShark if your locale doesn't use UTF-8 as its text encoding. Perhaps we should make TShark, at least on UN*X, convert -T text output to the locale's text encoding using iconv.
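The conversion suggested here could look roughly like the sketch below. It is Python rather than C with iconv, the helper name to_locale_bytes is made up for illustration, and it is not TShark code. It assumes the text is held as Unicode internally and that characters the target encoding cannot represent are simply replaced with '?'.

```python
import locale


def to_locale_bytes(text, encoding=None):
    """Encode text for output in the given (or the locale's preferred) encoding.

    Hypothetical helper: characters the target encoding cannot
    represent are replaced with '?'.
    """
    if encoding is None:
        # Preferred text encoding of the current locale, e.g. 'UTF-8'.
        encoding = locale.getpreferredencoding(False)
    return text.encode(encoding, errors="replace")


name = "M\u00fcller"  # made-up vendor name containing u-umlaut
print(to_locale_bytes(name, "utf-8"))       # two bytes for the umlaut
print(to_locale_bytes(name, "iso-8859-1"))  # one byte for the umlaut
print(to_locale_bytes(name, "ascii"))       # umlaut replaced with '?'
```

Whether to replace, transliterate, or escape unrepresentable characters is a design choice; replacement is just the simplest one to show.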
The sample capture I was using is
https://bugs.wireshark.org/bugzilla/attachment.cgi?id=5350
The manuf entry for that is (escaped):
00:50:C2:0A:20:00/36 J\xC3\xAF\xC2\xBF\xC2\xBDgerCom
And, actually, Wireshark displays it pretty much like Unidecode() does:
Ji?1/2gerC
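A small sketch that decodes the escaped bytes from the manuf entry shows where that display comes from. It uses only the Python standard library, and the variable names are illustrative. The bytes are valid UTF-8 and decode to i-with-diaeresis, inverted question mark, and one-half, which line up with the "i", "?", and "1/2" in the transliterated form. Re-encoding those three characters as ISO-8859-1 yields EF BF BD, the UTF-8 form of U+FFFD (REPLACEMENT CHARACTER), so the name was probably already mangled before it reached the file; that last point is an inference, not something confirmed here.

```python
import unicodedata

# Escaped bytes from the manuf entry.
raw = b"J\xC3\xAF\xC2\xBF\xC2\xBDgerCom"

# Raises UnicodeDecodeError if the bytes are not valid UTF-8.
text = raw.decode("utf-8")

# Show the three non-ASCII characters that follow the 'J'.
for ch in text[1:4]:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# Treat each decoded character as a single ISO-8859-1 byte, then
# check whether the result is the UTF-8 encoding of U+FFFD.
inner = text.encode("iso-8859-1")
print(inner)
print(inner.decode("utf-8") == "J\ufffdgerCom")
```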
I guess there is no problem after all... (Well, unless there IS some
non-UTF-8 in there.)
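One way to settle that question would be to scan the file for lines that fail to decode as UTF-8. The following is a minimal sketch; the function name find_non_utf8_lines is hypothetical, and the second sample line is a made-up entry used only to trigger the check.

```python
def find_non_utf8_lines(lines):
    """Return (line_number, raw_bytes) for each line that is not valid UTF-8."""
    bad = []
    for number, line in enumerate(lines, start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            bad.append((number, line))
    return bad


sample = [
    # The entry quoted in this thread: odd-looking, but valid UTF-8.
    b"00:50:C2:0A:20:00/36 J\xC3\xAF\xC2\xBF\xC2\xBDgerCom\n",
    # Made-up entry containing a bare ISO-8859-1 byte (0xE4).
    b"00:00:00:00:00:01 Ex\xE4mple\n",
]
print(find_non_utf8_lines(sample))
```

To check a real file, open it in binary mode and pass the file object in, for example: with open("manuf", "rb") as f: print(find_non_utf8_lines(f)). The path "manuf" is an assumption about where the file lives.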