Wireshark-dev: Re: [Wireshark-dev] can't compile wireshark version 4.0

From: Guy Harris <gharris@xxxxxxxxx>
Date: Thu, 20 Oct 2022 16:31:15 -0700
On Oct 20, 2022, at 4:02 PM, Guy Harris <gharris@xxxxxxxxx> wrote:

> Definitely signed, unless I'm missing something.

Please hand me the Sad Old Man With Fading Memory prize of the year.

*Nothing guarantees that a char is signed*.  This is still true as of C18:

	An object declared as type char is large enough to store any member of the basic execution character set. If a member of the basic execution character set is stored in a char object, its value is guaranteed to be nonnegative. If any other character is stored in a char object, the resulting value is implementation-defined but shall be within the range of values that can be represented in that type.

This dates, as I remember, back to some instruction sets having a sign-extending byte load instruction and others having a non-sign-extending byte load instruction.

The "signed" keyword was added to C to allow code that expects a char-sized signed integer value type to say "signed char"; for generality, you can say "signed short int" or "signed int", or "signed long int" or..., but that's redundant.
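To make the difference concrete, here is a minimal sketch of what happens when the same 8-bit pattern is widened to int through each explicitly-signed type. The helper names are made up for illustration, and the sketch assumes 8-bit chars (CHAR_BIT == 8):

```c
#include <limits.h>

/* Hypothetical helper: widen a value to int through "signed char". */
static int widen_signed(signed char c)
{
    return c;   /* sign is preserved: -23 stays -23 */
}

/* Hypothetical helper: widen a value to int through "unsigned char". */
static int widen_unsigned(unsigned char c)
{
    return c;   /* always in the range 0..UCHAR_MAX */
}

/* Returns 1 if plain "char" is signed on this target, 0 if it is unsigned. */
static int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

Plain "char" behaves like one of the two helpers above, and which one depends on the target; plain_char_is_signed() reports which it is for the compiler in use.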

So, given that this is the result of

> [compiling] wireshark version 4.0 on Raspberry Pi ubuntu 22.04

perhaps ARM32 or ARM64, whichever this is, is a target with *unsigned* char variables, so that "signed char" is an 8-bit signed integral type and both "char" and "unsigned char" are 8-bit unsigned integral types.

And, at least according to this StackOverflow item:

	https://stackoverflow.com/questions/2054939/is-char-signed-or-unsigned-by-default

char is unsigned on *most* ARM platforms.  "iOS", presumably meaning "all Apple operating systems" at this point, is different.  (That sort of makes sense for macOS, to avoid throwing in Another Portability Problem when adding Apple silicon support, and maybe, given that a ton of code from the kernel on up to frameworks in iOS came from macOS, they did that to simplify porting to iOS.)  I think Apple silicon macOS is the first ARM target for which we've done main-branch builds, so the "sorry, char is unsigned" problem didn't show up for us.

So code should not assume that char is signed or that it's unsigned.

This might call for a test on CHAR_MIN (from <limits.h>) such as

	#if CHAR_MIN == 0
		do "char is unsigned" stuff
	#else
		do "char is signed" stuff
	#endif

e.g.

	#if CHAR_MIN == 0
		#define CHAR_VALUE_IS_NEGATIVE(c)	(0)
	#else
		#define CHAR_VALUE_IS_NEGATIVE(c)	((c) < 0)
	#endif

	if ((CHAR_VALUE_IS_NEGATIVE(ba[i]) || ba[i] >= ' ') && ba[i] != (char)0x7f && !g_ascii_isprint(ba[i])) {

and hope that the compiler doesn't warn that we're doing "0 || x", because, in *that* case, we'd need to do something more complicated.
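As a rough sketch of how this could be checked, the macro-based condition can be wrapped in a function and compared against a version that sidesteps the signedness question by casting to unsigned char. The cast-based version is an alternative, not something proposed above, and is_ascii_print() is a hypothetical stand-in for g_ascii_isprint() (assumed here to be true for 0x20 through 0x7e) so that the example builds without GLib:

```c
#include <limits.h>

#if CHAR_MIN == 0
#define CHAR_VALUE_IS_NEGATIVE(c)	(0)
#else
#define CHAR_VALUE_IS_NEGATIVE(c)	((c) < 0)
#endif

/* Hypothetical stand-in for g_ascii_isprint(): true for 0x20..0x7e. */
static int is_ascii_print(char c)
{
    return c >= ' ' && c <= '~';
}

/* The condition from above: true for bytes that are neither ASCII
 * control characters nor printable ASCII, i.e. bytes 0x80..0xff. */
static int is_non_ascii_macro(char c)
{
    return (CHAR_VALUE_IS_NEGATIVE(c) || c >= ' ') &&
           c != (char)0x7f && !is_ascii_print(c);
}

/* Alternative (an assumption, not from the message): look at the byte
 * as unsigned char, so the signedness of plain char does not matter. */
static int is_non_ascii_cast(char c)
{
    return (unsigned char)c >= 0x80;
}
```

If the two functions agree for every byte value on both signed-char and unsigned-char targets, the cast-based form might be the "something more complicated" fallback should a compiler complain about "0 || x".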