Wireshark-dev: Re: [Wireshark-dev] wireshark-0.99.7 compiling error, The file contains a charac

From: "Jim Young" <SYSJHY@xxxxxxxxxxxxxxx>
Date: Thu, 24 Jan 2008 11:10:17 -0500
Hello Graham,

>>> Graham Bloice <graham.bloice@xxxxxxxxxxxxx> 01/24/08 9:06 AM >>>
> Excuse my little englander ignorance, but is the problem occurring 
> because the files have characters from outside the 7 bit ASCII character 
> set?  If this is correct we should add a suitable entry to README.developer.
> 
> It would be nice if there was an automated way of checking this (apart 
> from using the MS compiler) for all committed files.

Many moons ago, on a multi-developer project, I was forced 
to set up some Makefile steps to abort compilation if any of the *.[ch] 
files contained anything but a specific subset of ASCII characters.

In our case we had to restrict our source code modules to ONLY the 
"printable" ASCII characters (0x20-0x7E) and a very small subset of 
ASCII control characters.  I think we restricted the control characters 
to just ASCII CR (0x0d), and ASCII LF (0x0a) characters (this was an 
MS-DOS based project).  In our environment we specifically wanted 
to prohibit the programmers from inserting any TAB (ASCII HT (0x09)) 
or ESC (ASCII ESC (0x1b)) characters into the source code files.
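
As a rough illustration of that rule (not the original tool, which is long 
gone), the allowed set can be written down in a few lines of Python.  The 
names ALLOWED and forbidden_bytes are made up for this sketch, and it 
assumes the whole file has been read as raw bytes.

```python
# Allowed bytes, per the rule described above: printable ASCII
# (0x20-0x7E) plus CR (0x0d) and LF (0x0a). Everything else,
# including TAB (0x09), ESC (0x1b) and NUL (0x00), is rejected.
ALLOWED = frozenset(range(0x20, 0x7F)) | {0x0D, 0x0A}


def forbidden_bytes(data):
    """Return a list of (offset, byte value) pairs for bytes outside ALLOWED."""
    return [(i, b) for i, b in enumerate(data) if b not in ALLOWED]
```

For example, forbidden_bytes(b"\tint x;\x00") reports the TAB at offset 0 
and the NUL at offset 7, while a clean CR/LF-terminated line returns an 
empty list.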

In our case some of the programmers had resorted to copying and 
pasting HP PCL sequences directly into the source code in an attempt 
to make the source "print out better". ;-)   

At other times the source files would get mangled by careless editing 
and/or file transfers that would result in all sorts of "unprintable" 
characters within the source.   ASCII NUL (0x00) characters in the 
source were particularly hard for the programmers to spot!

At the time I wrote a simple tool that was invoked during the build 
process that would simply scan the input file for any forbidden 
characters.   If none were found, a simple "<filename>.ok" file was 
generated.  Another dependency in the make required that all of the 
source files have matching "*.ok" files.  Success of this step resulted in a 
tiny .obj file that the main build/link step required.

If a forbidden character was found in the input file, a message was 
written to stderr that an "illegal character was found in input file 
FILENAME at offset X".  The tool would then exit with a return code 
that ultimately caused make processing to halt.
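
A minimal sketch of how such a checker might look today, in Python rather 
than whatever the original was written in.  Everything here is an 
assumption for illustration: the names FORBIDDEN, check_file and main, 
the choice to report only the first offending offset, and the exit status 
of 1 on failure.

```python
import re
import sys
from pathlib import Path

# Any byte that is not printable ASCII (0x20-0x7E), CR, or LF is forbidden.
FORBIDDEN = re.compile(rb"[^\x20-\x7e\r\n]")


def check_file(path):
    """Return the offset of the first forbidden byte in path, or None."""
    match = FORBIDDEN.search(Path(path).read_bytes())
    return match.start() if match else None


def main(argv):
    """Check each file named in argv[1:]; return 0 if all are clean, else 1."""
    status = 0
    for name in argv[1:]:
        offset = check_file(name)
        marker = Path(name + ".ok")
        if offset is None:
            # Clean file: leave a marker that a make rule can depend on.
            marker.touch()
        else:
            # Remove any stale marker so the build cannot proceed.
            if marker.exists():
                marker.unlink()
            print(
                f"illegal character was found in input file {name} "
                f"at offset {offset}",
                file=sys.stderr,
            )
            status = 1
    return status
```

To use it from make, the script would end with sys.exit(main(sys.argv)) 
under the usual "if __name__ == '__main__':" guard, so that a non-zero 
status halts the build.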

I suspect that with some bash scripting and the use of some of the 
standard cli tools (sed?) that something similar could be put together.

Perhaps something like this would be worth pursuing? (Or perhaps a simple
compiler flag exists for accomplishing the same!)

Jim Young