Ethereal-dev: Re: [Ethereal-dev] RE: New dissector for CIGI

Note: This archive is from the project's previous web site, ethereal.com. This list is no longer active.

From: "Guy Harris" <gharris@xxxxxxxxx>
Date: Tue, 6 Dec 2005 14:05:13 -0800 (PST)
Harms, Kyle J wrote:
> As far as Unicode goes.  I think it might be worthwhile to consider some
> of the character set translation utilities.  Such as
> http://www.cl.cam.ac.uk/~mgk25/download/transtab.tar.gz, maybe there are
> better ones out there.  Maybe on platforms that do not support Unicode

It's not the platform as a whole that supports, or doesn't support, Unicode.

It's components of the platform.

There's the GUI toolkit; GTK+ 2.x supports it (all strings are UTF-8),
GTK+ 1.2[.x] doesn't (it uses whatever the encoding is for the font or
font set being used).

There's the text file format supported by various text-processing
utilities.  On UN*X systems, if the text-processing utility handles
non-ASCII characters at all, it probably assumes that the character set
(and, implicitly, the encoding; Unicode would be encoded as UTF-8, not
UTF-16 or UTF-32, in text files) is specified by the LANG environment
variable, so the character set could well differ from user session to user
session.
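
To make the per-session point concrete, here is a minimal sketch of how a program might work out which codeset a locale setting names. The helper name `codeset_from_env` is made up for illustration and is not anything in Ethereal. The post mentions only LANG; the sketch also checks LC_ALL and LC_CTYPE, which override LANG under the POSIX rules.

```python
def codeset_from_env(env):
    """Return the codeset named by the locale variables in env, or None.

    Hypothetical helper for illustration only; env is any mapping that
    looks like os.environ.
    """
    # POSIX precedence: LC_ALL overrides LC_CTYPE, which overrides LANG.
    # An empty value is treated the same as an unset variable.
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            break
    else:
        return None

    # Locale names look like language[_territory][.codeset][@modifier].
    value = value.split("@", 1)[0]
    if "." not in value:
        return None  # e.g. "C" or "en_US": no codeset is stated
    return value.split(".", 1)[1]
```

Calling `codeset_from_env(os.environ)` in two different login sessions can give two different answers, which is why a text file's encoding cannot be fixed at build time. A real program would more likely ask the C library (for example via `nl_langinfo(CODESET)` after `setlocale()`) than parse the variables itself.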

There are the printers and the print software.

So it's not as if it's a simple Boolean question.

> a preprocessor could be run on all "" strings that would translate double
> arrow to => and mu to u, etc.

That treats it as a Boolean issue.

(In any case, a Unicode double-arrow isn't so much nicer than => that it's
worth using, and "usec" is pretty much good enough for "microseconds";
sticking to ASCII also means you don't have to worry about, for example,
whether the recipient of a mail message in which you're including dissected
text has a mail reader that can handle Unicode and has fonts on their system
that can handle it.)
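
As a rough illustration of why a build-time preprocessor treats this as a Boolean issue: the choice between a Unicode character and an ASCII stand-in depends on where a given string is going (GUI, text file, printer), so it would have to be made per output at run time. The sketch below assumes a hypothetical `render_for` function and fallback table. Neither exists in Ethereal, and the two table entries are just the examples from this thread.

```python
# Hypothetical ASCII stand-ins, taken from the examples in the thread.
ASCII_FALLBACK = {
    "\u21d2": "=>",  # RIGHTWARDS DOUBLE ARROW
    "\u00b5": "u",   # MICRO SIGN
}


def render_for(text, encoding):
    """Return text with characters the target encoding lacks replaced.

    Characters that `encoding` can represent are kept; others are
    replaced by their ASCII stand-in, or by "?" if none is listed.
    """
    out = []
    for ch in text:
        try:
            ch.encode(encoding)  # probe: can this sink represent ch?
            out.append(ch)
        except UnicodeEncodeError:
            out.append(ASCII_FALLBACK.get(ch, "?"))
    return "".join(out)
```

The same string comes out differently for a UTF-8 GUI, a Latin-1 text file, and an ASCII-only printer path, which a single compile-time substitution cannot express. Using "=>" and "usec" everywhere, as suggested above, avoids needing any of this.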

> Obviously there should be a way to handle
> this, I just don't have enough experience to know.  How do we currently
> handle the AUTHORS file since it is Unicode?

We only display it if we have GTK+ 2.x or later.