Wireshark-dev: Re: [Wireshark-dev] Why tvb_get_bits() assumes Big Endian?

From: Tomasz Moń <desowin@xxxxxxxxx>
Date: Thu, 30 Jul 2020 20:09:34 +0200
On Thu, Jul 30, 2020 at 6:22 PM Filipe Laíns <lains@xxxxxxxxxxxxx> wrote:
> I think it would be reasonable to say that cases which need such
> special handling can do it themselves. Wireshark should still provide
> all the routines required to make this work, but I don't think it needs
> to necessarily provide a function for very specific use cases.

I would consider the special case to be swapping/reversing the bits
within a byte. Which is different from the issue discussed here.

> With that said, if we want to indeed support this then we need new
> functions. We can't touch the current signature as that will break the
> ABI.

tvb_get_bits() functions do take the encoding parameter already, but
simply ignore it. If someone calls it with ENC_LITTLE_ENDIAN while
expecting it to work like with ENC_BIG_ENDIAN, then I would say that's
a bug. Everyone who indeed wanted the little endian handling would
notice that it doesn't really do the job.

> My proposal would be to make tvb_get_bits* respect little endian for
> the value itself and add correspondent tvb_get_network_bits* methods
> that would treat the buffer as little endian. What do you think?

tvb_get_network_bits to treat buffer as little endian seems weird,
because the network byte order is big endian.

> Would adjusting the behavior of tvb_get_bits* be considered a breaking
> change? I mean, this behavior is not documented.

I wouldn't consider finally handling the ENC_LITTLE_ENDIAN in
tvb_get_bits() as a breaking change. Such code path used to be
DISSECTOR_ASSERT_NOT_REACHED() after all. It might just be a bit late
to implement, but it wouldn't be the first thing that finally got
implemented after many years.

Header fields with bitmasks are already working just fine with
ENC_LITTLE_ENDIAN on byte aligned data. Implementing ENC_LITTLE_ENDIAN
in tvb_get_bits() would just allow doing essentially the same for
unaligned data.