Hello Graham,
>>> Graham Bloice <graham.bloice@xxxxxxxxxxxxx> 01/24/08 9:06 AM >>>
> Excuse my little englander ignorance, but is the problem is occurring
> because the files have characters from outside the 7 bit ASCII character
> set? If this is correct we should add a suitable entry to README.developer.
>
> It would be nice if there was an automated way of checking this (apart
> from using the MS compiler) for all committed files.
Many moons ago in a multi-developer project I was forced
to set up some Makefile steps to abort compilation if any of the *.[ch]
files contained anything but a specific subset of ASCII characters.
In our case we had to restrict our source code modules to allow ONLY the
"printable" ASCII characters (0x20-0x7E) and a very small subset of
ASCII control characters. I think we restricted the control characters
to just ASCII CR (0x0d), and ASCII LF (0x0a) characters (this was an
MS-DOS based project). In our environment we specifically wanted
to prohibit the programmers from inserting any TAB (ASCII HT (0x09))
or ESC (ASCII ESC (0x1b)) characters into the source code files.
In our case some of the programmers had resorted to copying and
pasting HP PCL sequences directly into the source code in an attempt
to make the source "print out better". ;-)
At other times the source files would get mangled by careless editing
and/or file transfers that would result in all sorts of "unprintable"
characters within the source. ASCII NUL (0x00) characters in the
source were particularly hard for the programmers to spot!
At the time I wrote a simple tool that was invoked during the build
process that would simply scan the input file for any forbidden
characters. If none was found a simple "<filename>.ok" file was
generated. Another dependency in the makefile required that all of the
source files have matching "*.ok" files. Success of this step resulted in a
tiny .obj file that the main build/link step required.
If a forbidden character was found in the input file, a message was
written to stderr that an "illegal character was found in input file
FILENAME at offset X". The tool would then exit with a return code
that ultimately caused make processing to halt.
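For illustration only, here is a rough sketch of what such a checker might
look like today. It is written in Python rather than whatever the original
tool used, and the names (ALLOWED, first_forbidden_offset, check_file) are
made up for this example. It assumes the allowed set described above:
printable ASCII plus CR and LF.

```python
import sys

# Assumption: printable ASCII (0x20-0x7E) plus CR and LF are the only
# bytes allowed, as described in the message above.
ALLOWED = frozenset(range(0x20, 0x7F)) | {0x0D, 0x0A}


def first_forbidden_offset(data):
    """Return the offset of the first forbidden byte, or None if clean."""
    for offset, byte in enumerate(data):
        if byte not in ALLOWED:
            return offset
    return None


def check_file(path):
    """Check one file; create '<path>.ok' on success. Returns an exit code."""
    with open(path, "rb") as f:
        data = f.read()
    offset = first_forbidden_offset(data)
    if offset is not None:
        print("illegal character was found in input file %s at offset %d"
              % (path, offset), file=sys.stderr)
        return 1
    # Empty stamp file that a make rule can list as a prerequisite.
    with open(path + ".ok", "w"):
        pass
    return 0


if __name__ == "__main__" and len(sys.argv) > 1:
    # Non-zero exit status if any file fails, so make halts.
    sys.exit(max(check_file(p) for p in sys.argv[1:]))
```

A make rule could run this once per source file and have the main
build/link step depend on the resulting ".ok" files, so a failed check
stops the build before the compiler ever runs.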
I suspect that with some bash scripting and some of the standard
CLI tools (sed?) something similar could be put together.
Perhaps something like this would be worth pursuing? (Or perhaps a simple
compiler flag exists for accomplishing the same!)
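For the narrower question Graham raised (characters outside 7-bit ASCII),
a tree-wide sweep might be enough. The following is only a sketch, again
in Python rather than bash, and the function name non_ascii_hits and the
default suffix list are assumptions made for this example.

```python
import os


def non_ascii_hits(root, suffixes=(".c", ".h")):
    """Yield (path, offset) for the first byte above 0x7F in each matching file."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(suffixes):
                continue
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                data = f.read()
            for offset, byte in enumerate(data):
                if byte > 0x7F:
                    # Report only the first offending byte per file.
                    yield path, offset
                    break
```

Looping over non_ascii_hits(".") from the top of the source tree and
printing each (path, offset) pair would give a quick report that could
be run before a commit.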
Jim Young