John Vincent wrote:
I notice that the decoding of GSM MAP SMS text does not work
correctly when the encoding scheme is UCS2 (it works OK for the GSM
7-bit alphabet). Would anybody be interested in fixing that? To fix it
I guess ethereal needs support for displaying unicode fonts (unless
it already has that...?)
A limited capability to handle it could be provided simply by dropping
non-ASCII characters, or by displaying them as "\XNNNN" escapes.
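As a rough illustration of that fallback (in Python for brevity; the
dissector itself would of course be C), here is a minimal sketch. The
function name is made up, and it assumes the SMS payload is big-endian
UCS-2, which I believe is what GSM uses but have not checked against
the dissector:

```python
import struct

def ucs2_be_to_ascii(data: bytes) -> str:
    """Render big-endian UCS-2 as ASCII text (hypothetical helper).

    Printable ASCII code units are passed through; every other
    16-bit unit is shown as a \\XNNNN escape.
    """
    if len(data) % 2:
        raise ValueError("UCS-2 data must contain an even number of octets")
    out = []
    # Walk the buffer one big-endian 16-bit code unit at a time.
    for (unit,) in struct.iter_unpack(">H", data):
        if 0x20 <= unit <= 0x7E:
            out.append(chr(unit))
        else:
            out.append("\\X%04X" % unit)
    return "".join(out)
```

With this, the octets 00 48 00 69 04 1F would be shown as Hi\X041F
rather than as garbage or as a malformed packet.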
Displaying Unicode characters is not hard with GTK+ 2.x - GTK+ 2.x
expects to be handed a UTF-8 string. It's harder with GTK+ 1.2[.x] -
I'm not even sure how to find out the encoding GTK+ 1.2[.x] expects for
strings.
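For the GTK+ 2.x case, the conversion itself is small. A sketch, again
in Python and again assuming big-endian UCS-2 input; the function name
is hypothetical, and Python's "utf-16-be" codec is used as a stand-in
because Python has no separate UCS-2 codec:

```python
def ucs2_be_to_utf8(data: bytes) -> bytes:
    """Convert big-endian UCS-2 octets to a UTF-8 byte string."""
    # "utf-16-be" is a superset of big-endian UCS-2. Anything that
    # cannot be decoded (an odd trailing octet, a lone surrogate)
    # becomes U+FFFD instead of raising an error.
    text = data.decode("utf-16-be", errors="replace")
    return text.encode("utf-8")
```

The resulting UTF-8 string is what would then be handed to the GUI.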
(The native Windows GUI code Gerald's been working on off and on would,
I think, require something such as the Microsoft Layer for Unicode:
http://www.microsoft.com/globaldev/handson/dev/mslu_announce.mspx
in order to support W95/W98/WMe. Were we to do a native KDE version, I
think the version of Qt that KDE 3.x uses handles Unicode natively -
either as UCS-2 or UTF-8; OS X also uses UTF-8 encoded Unicode in the
GUI code.
This also raises issues of file access - recent versions of GLib have
stdio wrapper routines that presumably either do nothing or translate
from UTF-8 to the locale's character set on UN*X, and map to Unicode on
Windows.)
The main thing Ethereal needs is a more sophisticated scheme for
handling strings, as there are a number of different character encodings
that can be used for strings (UCS-2, both little-endian as in SMB and
probably big-endian in some places; UTF-8; ISO 8859/n for various values
of n; assorted other EUC character encodings, e.g. DBCS encodings for
various Asian languages; various non-EUC character encodings, including
DOS/Windows code pages, old Mac character sets, Shift-JIS, KOI-8, etc.,
etc., etc. - oh, and don't forget EBCDIC, if, as I think is the case,
some SNA protocols we dissect use it).
This might also call for incorporating our own version of iconv, or
something equivalent: the Single UNIX Specification entry for iconv
says that the actual encoding names are implementation-dependent, but we
need a *platform-independent* way for a dissector to specify that a
string is, say, MacRoman or ISO 8859/1 and for the Ethereal core to
translate from that to UTF-8.
See the first item under "Dissector infrastructure" in
http://wiki.ethereal.com/Development_2fWishlist
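To make the platform-independent naming idea above a bit more concrete,
here is a sketch of what I have in mind, in Python rather than C.
Everything here (StringEncoding, to_utf8, the particular set of
members) is hypothetical; the point is only that a dissector names an
encoding with a fixed identifier and the core owns the mapping to
whatever converter is actually available. Python's built-in codecs
stand in for iconv or our own tables:

```python
from enum import Enum

class StringEncoding(Enum):
    """Fixed, platform-independent encoding identifiers (hypothetical).

    The values are Python codec names; a C implementation would map
    each identifier to an iconv name or to a built-in table instead.
    """
    UCS2_LE = "utf-16-le"      # e.g. SMB strings
    UCS2_BE = "utf-16-be"
    UTF8 = "utf-8"
    ISO_8859_1 = "iso-8859-1"
    MAC_ROMAN = "mac_roman"
    SHIFT_JIS = "shift_jis"
    KOI8_R = "koi8_r"
    EBCDIC_037 = "cp037"       # one common EBCDIC code page

def to_utf8(raw: bytes, encoding: StringEncoding) -> bytes:
    """Translate raw packet octets in the given encoding to UTF-8."""
    # Undecodable input is replaced with U+FFFD rather than raising,
    # since packet data cannot be trusted to be well-formed.
    return raw.decode(encoding.value, errors="replace").encode("utf-8")
```

A dissector would then say something like
to_utf8(octets, StringEncoding.MAC_ROMAN) and never mention a
platform-specific encoding name.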
Even if ethereal can't display the message correctly, it would be
nice if it didn't say [Malformed Packet: GSM SMS] which it does at
the moment.
I suspect that's a separate problem.