Guy Harris wrote:
...or we find all the non-ASCII characters in the files (or, at least,
the ones that cause problems; I don't know whether MSVC has problems
with comments) and get rid of them.
I've checked in changes for most of the files in his message (most of
them just had gratuitous non-ASCII characters in comments that could be
replaced with ASCII equivalents, e.g. a plain " rather than fancy
curly quotes); the only exception is packet-e212.c, which has country
names in French with accented letters.
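
For tracking down any remaining offenders, something along these lines
would do. This is only a rough sketch, not anything in the tree: the
names count_non_ascii() and scan_stream() are made up for illustration,
and it simply treats every byte above 0x7F as "non-ASCII" without trying
to work out which encoding the file is in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Count the bytes in buf[0..len) that fall outside 7-bit ASCII. */
size_t
count_non_ascii(const unsigned char *buf, size_t len)
{
    size_t i, count = 0;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++) {
        if (buf[i] > 0x7F)
            count++;
    }
    return count;
}

/*
 * Hypothetical helper: read fp a chunk at a time and print one
 * "label:line: N non-ASCII byte(s)" message per offending line.
 * Returns the number of offending lines.
 */
long
scan_stream(FILE *fp, const char *label)
{
    char buf[512];
    long line = 1, bad_lines = 0;
    size_t bad = 0;

    assert(fp != NULL && label != NULL);
    while (fgets(buf, sizeof buf, fp) != NULL) {
        size_t n = strlen(buf);

        bad += count_non_ascii((const unsigned char *)buf, n);
        if (n > 0 && buf[n - 1] == '\n') {
            /* End of a source line: report it if it had any. */
            if (bad > 0) {
                printf("%s:%ld: %lu non-ASCII byte(s)\n",
                       label, line, (unsigned long)bad);
                bad_lines++;
            }
            bad = 0;
            line++;
        }
    }
    if (bad > 0) {
        /* Last line had no trailing newline. */
        printf("%s:%ld: %lu non-ASCII byte(s)\n",
               label, line, (unsigned long)bad);
        bad_lines++;
    }
    return bad_lines;
}
```

It reports comments as well as code, which is fine if the goal is to get
rid of all of them rather than only the ones MSVC chokes on.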
I suppose one way to handle that would be to have the E.212 dissector
read the country code list from a text file in UTF-8 form (we already
have one UTF-8 file read by Wireshark, namely the AUTHORS-SHORT file).
Yes, that means there's a file in the source tree that's not all-ASCII
and that thus
1) isn't going to be displayed correctly on all systems
and
2) can't necessarily be edited correctly on all systems
but at least it's not a source file - the only reason to edit it would
be to add, remove, or change a country code, which probably won't be
done as often, or by as many people, as source code changes are,
and it doesn't have to pass through a C compiler that might not
interpret the characters correctly.
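
As a rough sketch of what the reading side might look like: the file
format here (decimal MCC, whitespace, then the country name in UTF-8,
with '#' comment lines) is purely an assumption for illustration, and
parse_mcc_line() and count_mcc_entries() are hypothetical names, not
existing Wireshark routines. The parser only ever inspects ASCII digits,
blanks, and line terminators, so the UTF-8 bytes of the name are copied
through untouched.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse one line of the assumed form "<MCC><blanks><UTF-8 name>".
 * Returns 1 and fills in *mcc and name on success; returns 0 for
 * comments, blank lines, and anything malformed or too long.
 */
int
parse_mcc_line(const char *line, unsigned *mcc, char *name, size_t name_len)
{
    char *end;
    unsigned long value;
    size_t len;

    assert(line != NULL && mcc != NULL && name != NULL && name_len > 0);

    if (line[0] == '#' || line[0] == '\0' ||
        line[0] == '\n' || line[0] == '\r')
        return 0;

    value = strtoul(line, &end, 10);
    if (end == line || value > 999)
        return 0;               /* no digits, or not a 3-digit MCC */
    if (*end != ' ' && *end != '\t')
        return 0;               /* need a separator before the name */
    while (*end == ' ' || *end == '\t')
        end++;

    len = strcspn(end, "\r\n"); /* name runs to end of line */
    if (len == 0 || len >= name_len)
        return 0;
    memcpy(name, end, len);
    name[len] = '\0';
    *mcc = (unsigned)value;
    return 1;
}

/* Count the valid entries in an already-opened list file. */
long
count_mcc_entries(FILE *fp)
{
    char buf[256], name[128];
    unsigned mcc;
    long entries = 0;

    assert(fp != NULL);
    while (fgets(buf, sizeof buf, fp) != NULL) {
        if (parse_mcc_line(buf, &mcc, name, sizeof name))
            entries++;
    }
    return entries;
}
```

A real dissector would store the entries in a table rather than just
counting them, but the point is that the C source itself stays
all-ASCII; only the data file carries the accented names.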