Wireshark-dev: Re: [Wireshark-dev] question on validation of a dissected string from a BASE_CUS

From: John Thacker <johnthacker@xxxxxxxxx>
Date: Mon, 18 Sep 2023 11:11:34 -0400

On Sun, Sep 17, 2023, 10:06 PM Guy Harris <gharris@xxxxxxxxx> wrote:
On Sep 7, 2023, at 9:15 AM, John Dill <John.Dill@xxxxxxxxxxxxxxxxx> wrote:


If so, perhaps what's called for is a new mechanism to provide private *encodings*, so that the dissector's registration routine might do something such as

        guint32 enc_frequency;

                ...

        enc_frequency = register_integer_encoding_uint64(my_frequency_routine);

where "my_frequency_routine()" would take a tvbuff and an offset (and possibly other arguments as necessary) and provide a 64-bit unsigned integer.  Then you could do

        ti = proto_tree_add_item(word_tree, hf_XTS_5000_APX_8000_Receive_Frequency, tvb, offset, 6, enc_frequency);

with the registration routine returning an encoding value guaranteed not to collide with any predefined encodings for that type or with any other registered encoding.
...

A custom encoding of the proposed sort, and a custom formatter, could both be used; the custom decoder routine could either itself add an expert info item, or could include both a decoding routine and a checking routine to add expert info.

This could be combined with a custom *display* routine.

A custom encoding would have the advantage of working with filtering, which custom formatting of numbers does not.

For a field, the result of the encoding is the value, but the result of the formatting is just a string to display.

For filtering, a string can be checked against a value string's outputs, as there's usually a limited number of possibilities. (It takes the first match and doesn't work on range strings; what would be nice would be to somehow convert that into matching against a set of values that yield that string.) Filtering doesn't test all possible 32- or 64-bit values to see which ones might yield the string you want to filter on.

If you have something that is a derived floating point number, it's almost surely better to compute the number and add it as a generated field: an FT_DOUBLE (with proto_tree_add_double()), or perhaps an FT_UINT measured in kHz, since floating point comparisons are tricky (https://gitlab.com/wireshark/wireshark/-/issues/16483). Check for illegal values and add expert infos at that point, and then perhaps apply custom formatting to that field. Then you could filter with a decimal number instead of having to type in your BCD encoding. (Also note that generating a filter for your current field will produce a decimal version of the still BCD-encoded value, which won't be easy to read.)
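
As a rough illustration of the "compute the number and check for illegal values" step, here is a minimal standalone sketch. It assumes the field is packed BCD with two digits per byte, most significant digit first; that layout and the helper name bcd_to_uint64() are my assumptions for the example, not something taken from your protocol or from the Wireshark API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of Wireshark): decode `len` bytes of
 * packed BCD, two digits per byte with the high nibble first, into an
 * integer.
 *
 * Returns false if any nibble is not a decimal digit, or if the field is
 * too long to be guaranteed to fit in 64 bits. *value is written only on
 * success.
 */
static bool
bcd_to_uint64(const uint8_t *buf, size_t len, uint64_t *value)
{
    uint64_t result = 0;

    if (len > 9)    /* 18 decimal digits always fit in a uint64_t */
        return false;

    for (size_t i = 0; i < len; i++) {
        uint8_t hi = buf[i] >> 4;
        uint8_t lo = buf[i] & 0x0F;

        if (hi > 9 || lo > 9)
            return false;   /* illegal BCD digit */

        result = result * 100 + (uint64_t)hi * 10 + lo;
    }

    *value = result;
    return true;
}
```

In a dissector, the bytes would come from the tvbuff. On success the value could be added with proto_tree_add_uint64() and marked with proto_item_set_generated(). On failure you could attach an expert info with expert_add_info() to the raw item instead. How the decoded digits scale to Hz or kHz depends on your protocol, so that part is left out here.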

John Thacker