On Jun 28, 2011, at 10:27 AM, Guy Harris wrote:
> We have an issue regarding strings in packets in general. Strings might be in a number of encodings, including ASCII (meaning that any byte with the 8th bit set is something that shouldn't be there), other national variants of ISO 646, UTF-8, UTF-16, UCS-2 (meaning "only the Basic Multilingual Plane, with no surrogate pairs"), ISO 8859/x for various values of x, various ISO 2022-based encodings (e.g., the EUC encodings), various national standards, various DOS and Windows code pages, various Mac OS encodings, EBCDIC, whatever encodings are used for SMS, etc., etc., etc., etc.:
>
> http://en.wikipedia.org/wiki/Template:Character_encoding
As long as I'm piling up a ton of information about humanity's twisty little maze of character encodings, all different:
SMS:
https://secure.wikimedia.org/wikipedia/en/wiki/GSM_03.38
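
To make two of the definitions quoted above concrete ("ASCII" meaning no byte with the 8th bit set, and "UCS-2" meaning 16-bit code units with no surrogate pairs), here is a rough sketch in Python. The function names are made up for illustration and are not from any existing dissector code, and the UCS-2 check assumes big-endian byte order, which a real protocol may or may not use:

```python
def is_strict_ascii(data: bytes) -> bool:
    """Return True if no byte in data has the 8th (high) bit set."""
    return all(b < 0x80 for b in data)


def is_ucs2_be(data: bytes) -> bool:
    """Return True if data is a whole number of big-endian 16-bit code
    units, none of which falls in the surrogate range D800-DFFF.

    Big-endian is an assumption made for this sketch.
    """
    if len(data) % 2 != 0:
        return False  # odd length cannot be a sequence of 16-bit units
    for i in range(0, len(data), 2):
        unit = (data[i] << 8) | data[i + 1]
        if 0xD800 <= unit <= 0xDFFF:
            return False  # surrogate: valid in UTF-16, not in UCS-2
    return True
```

A string containing a character outside the Basic Multilingual Plane would pass as UTF-16 but fail the UCS-2 check, since UTF-16 has to encode it as a surrogate pair.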