Wireshark-dev: Re: [Wireshark-dev] proto.h extension (unit strings)

From: John Dill <John.Dill@xxxxxxxxxxxxxxxxx>
Date: Mon, 12 Dec 2016 16:18:38 +0000
>Message: 1
>Date: Sun, 11 Dec 2016 22:17:11 -0500
>From: Michael Mann <mmann78@xxxxxxxxxxxx>
>To: wireshark-dev@xxxxxxxxxxxxx
>Subject: Re: [Wireshark-dev] proto.h extension (unit strings)
>Message-ID: <158f108bd63-2eb5-53f0@xxxxxxxxxxxxxxxxxxxxxxx>
>Content-Type: text/plain; charset="utf-8"
>
>
>I thought this was a good idea, just took a while to get around to it:
>https://code.wireshark.org/review/19211

Hi Michael,

First off, thanks for spending time on a feature that at least I for one would thoroughly abuse ;-).  One initial comment that I can think of is whether "BASE" is an appropriate term for specifying a unit string, since a unit string is typically independent of the base used to represent the number.  I typically think of BASE when using BASE_DEC or BASE_HEX, etc.  In the patch I maintain, I ended up renaming it to OPTION_UNIT_STRING because there was confusion with the field_display_e enumeration in header field definitions.  Might be something to consider.

You could consider using a value_string for multiple representations could be useful for {"second", "seconds", "s"}.  That way you can associate a standardized name for each component.  I don't handle multiple unit string versions, but this is the route I would try first if I was going to try to implement it.

static const value_string header_field_units[] =
{
  { UNIT_SINGULAR, "foot" },
  { UNIT_PLURAL, "feet" },
  { UNIT_ABBREVIATION, "ft" },
  { 0, NULL }
};

The only difficulty is communicating when to use the abbreviation vs full name for a given header field or on a broad scale.  I'm not sure about what other people's needs would be, but I think a global (or possibly but less likely a dissector) specific preference seems feasible in my opinion.

I'm not a fan of manually adding " " to some unit strings and leaving it out for the abbreviation to the definition, i.e. I wouldn't want to have to do this:

static const value_string header_field_units[] =
{
  { UNIT_SINGULAR, " foot" },
  { UNIT_PLURAL, " feet" },
  { UNIT_ABBREVIATION, "ft" },
  { 0, NULL }
};

To that end I think the strings should not be specified with spaces at all.  It's pretty obvious to put a space for whole words, less so for abbreviations.  Could do a survey of a few text books with units to see what the majority style is for English.  Of course all hats are off for foreign translations.  If you choose to make the standard rendering use abbreviations without leading spaces in the dissection, then one can simply insert a space into the abbreviation string to get the effect they want.  Maybe not the idyllic solution, but certainly feasible.  Alternatively, it may be better to include a couple of global unit string checkbox preferences to include the leading string or not.  I'm not familiar with enough foreign languages to know how best to address it.

I don't know whether I'd recommend maintaining a list of unit string structures in Wireshark.  I mean, it would certainly be nice for common SI units, but I don't know about Imperial units, and then do you maintain these same unit strings in different languages as well?  Seems like it'd be difficult to maintain.  Unfortunately, I don't have any ideas on how to address it, but I don't think it too much to ask for people to create their own value_string if they really want units for a header field.

On a related note (and the reason I'm pushing using a value_string), using a value_string may be the best way to add optional parameters in the future as well, where I could foresee wanting to do this (at least for the stuff that I do).

static const value_string header_field_options[] =
{
  { UNIT_SINGULAR, " foot" },
  { UNIT_PLURAL, " feet" },
  { UNIT_ABBREVIATION, "ft" },
  { SCALE_FACTOR, "0.125" },
  { 0, NULL }
};

If there's future design space to include a scale factor, this may be one method to incorporate it without messing with the header_field_info structure too badly.  Then, you could add "| OPTION_SCALE_FACTOR" if you needed to convert units (I do radians to degrees, and semicircles to lat/long, and lots of data that are in scaled integer form all the time, and it's pretty messy).  Unfortunately, I've not made much headway into applying a scale factor usefully.  I can kind of get the display working, but I've no clue how to handle the filtering, e.g. if I convert the display to degrees, but the raw data is in radians, I can't figure out how to change the filtering mechanism that would allow me to filter a header field in degrees.

Just some ideas to ponder.

Best regards,
John Dill

P.S.  A couple of times my emails didn't show up on the wireshark-dev list, so I cc'd you as well.  I think it's a plain-text/html problem with my email client.