Wireshark-dev: Re: [Wireshark-dev] proto.h extension

From: Evan Huus <eapache@xxxxxxxxx>
Date: Fri, 8 May 2015 15:13:22 -0400
On Fri, May 8, 2015 at 3:09 PM, Guy Harris <guy@xxxxxxxxxxxx> wrote:
>
> On May 8, 2015, at 7:06 AM, "John Dill" <John.Dill@xxxxxxxxxxxxxxxxx> wrote:
>
>>> Message: 3
>>> Date: Thu, 7 May 2015 11:29:22 -0700
>>> From: Guy Harris <guy@xxxxxxxxxxxx>
>>> To: Developer support list for Wireshark <wireshark-dev@xxxxxxxxxxxxx>
>>> Subject: Re: [Wireshark-dev] proto.h extension
>>> Message-ID: <CF2E7490-F023-49C2-8383-6C1A1394B4E7@xxxxxxxxxxxx>
>>> Content-Type: text/plain; charset=iso-8859-1
>>>
>>> On May 7, 2015, at 8:13 AM, "John Dill" <John.Dill@xxxxxxxxxxxxxxxxx> wrote:
>>>
>>>> I have a couple of extensions that I created for the Wireshark baseline
>>> that we're using (1.10.x).  The diffs to proto.h and proto.c show the code
>>> changes to add a couple of features that I've found useful, unit strings
>>> and hiding the bits for bitmask header fields.
>>>>
>>>> http://codepad.org/KTGdEL1t
>>>
>>> You need more than
>>>
>>>      /* Following constants have to be ORed with a base_display_e when dissector
>>>       * want to control aspects of the display string for a header_field_info */
>>>
>>> as a comment there - you need comments explaining what each of those flags actually *does*.
>>
>> Guy,
>>
>> Sorry it wasn't clear.  Starting from this snippet...
>>
>> /*
>> * BASE_UNIT_STRING - Append a unit string to the numeric value
>
> That one's reasonable; I've thought of a similar option.
>
>> *
>> * When active, Wireshark will append a unit string declared as a
>> * simple 'char *' for the 'strings' to the numeric value.
>
> You might want to, instead, have it be a structure with a pair of strings, so that if the field has the value 1, you can print the singular rather than the plural, e.g.:
>
> struct unit_names {
>         char *singular; /* name to use for 1 unit */
>         char *plural;   /* name to use for < 1 or > 1 units */
> };
>
> struct unit_names feet {
>         "foot",
>         "feet"
> };
>
> {
>  &hf_MFD_State_Data_Slew_Elevation_Ch1,
>  {
>    "Slew_Elevation_Ch1",
>    "ndo.MFD_State_Data.Slew_Elevation_Ch1",
>    FT_FLOAT,
>    BASE_NONE | BASE_UNIT_STRING,
>    &feet,
>    0x0,
>    NULL,
>    HFILL
>  }
> },
>
> (For floating-point numbers, "1 unit" means "*exactly* 1 unit", i.e. an exact floating-point comparison with 1x2^0.)
>
> We could either
>
>         1) require that both be non-null
>
> or
>
>         2) assume that, if the plural is null, you can pluralize using the standard rules of English.
>
> Does anybody have a preference there?
>
> As the last word of the last item in the list above indicates, that limits the output to English; however, the same applies to, for example
>
>     { &hf_ip_hdr_len,
>       { "Header Length", "ip.hdr_len", FT_UINT8, BASE_DEC,
>         NULL, 0x0F, NULL, HFILL }},
>
> and I don't expect to see the full names of every single field in Wireshark:
>
>         $ ./tshark -G fields | wc -l
>           151035
>
> to be translated to any other language any time soon, and it also applies to
>
> static const value_string ipopt_timestamp_flag_vals[] = {
>     {IPOPT_TS_TSONLY,    "Time stamps only"                      },
>     {IPOPT_TS_TSANDADDR, "Time stamp and address"                },
>     {IPOPT_TS_PRESPEC,   "Time stamps for prespecified addresses"},
>     {0,                  NULL                                    }};
>
> and translating all value_string tables would also be difficult.
>
> So should we think about localization of the packet summary and detail fields (which would, I suspect, be a huge project), and possibly leave the unit strings open for localization, or not?
>
>> /*
>> * BASE_NO_BITMASK_DISPLAY - Hide the bitfield display of a data
>> *                           element.
>> *
>> * In Wireshark, any time the 'bitmask' argument is non-zero, a
>> * bitfield display showing the location of 1's and 0's is
>> * display is shown.  In certain situations, the display of
>> * bitmask fields next to non-bitmask fields is quite jarring
>> * because the display of these bitfields do not align the
>> * data elements in an easy to scan column.
>> *
>> * 1010 .... Data Element 1
>> * .... 0101 Data Element 2
>> * Data Element 3
>
> Presumably that would be used by people who don't care enough about the individual bits of the octet(s) to be willing to sacrifice that for having the elements line up.
>
> That might depend on the protocol, with protocols doing more bit-packing being more likely to have a mix of octet-aligned and non-octet-aligned fields.  Presumably avionics has some low-bit-rate links and needs to do more bit-packing; I think some telephony protocols have the same issue - that might be why ASN.1 PER was created.
>
> So is this something that should be done on a per-field basis, as a preference, or both?

It would be simpler and smarter if we just accounted for the fact when
aligning fields: if a field does not have a bitmask, but something
above/below it (in the same tree) does then indent it with the
appropriate number of spaces.