Ethereal-dev: Re: [Ethereal-dev] enhancing xml pdml output format

Note: This archive is from the project's previous web site, ethereal.com. This list is no longer active.

From: "Guy Harris" <gharris@xxxxxxxxx>
Date: Wed, 8 Feb 2006 14:31:20 -0800 (PST)

Martin d'Anjou wrote:

> I am trying to parse the PDML XML file to extract a list of TCP options.
> Below is a copy of what I captured. I have edited to make it narrower but
> all relevant fields to this enhancement request are still there. So here
> is an excerpt of my capture:
> I would like to be able to parse this and print out the list of TCP
> options. But parsing it is unnecessarily complicated by the fact that
> everything is called "field",

That's the way PDML is *DEFINED* to work:

    http://analyzer.polito.it/30alpha/docs/dissectors/PDMLSpec.htm

If you don't like it, complain to the developers of PDML at the
Politecnico di Torino.

Note also that there is *NOT* a fixed set of field names, so using field
names as XML tags probably won't work very well.

> and that the attributes do not consistently
> identify the option type. As you can see, sometimes the option name is in
> the name attribute, sometimes in the show attribute, sometimes in the
> showname attribute.

Not everything displayed in the protocol tree is a named field; that means
we either have to

    1) violate the PDML spec and have <field> items lacking a "name" attribute

or

    2) omit all fields that aren't named, such as some of the TCP options.

The name attribute is the field name, which might happen to correspond to
the option name, or might not.  It's intended for use in filter
expressions, so it's probably not going to have spaces or anything else
that gets in the way of parsing filter expressions.

The showname attribute is the "human readable" name of the field; any
field with "name" should also have a "showname" attribute (and, in fact,
in the examples you show, that is the case), so it's not that "sometimes
the option name is in the name attribute ... [and] sometimes in the
showname attribute".  It will be a name intended for people to read, but
*NOT* at all designed for parseability - if it happens to be parseable,
OK, but that's not the intent.

It appears to be a bug that "showname" is showing the *value* of those
options.  That's what "show" is for; the PDML spec says that the "show"
attribute "keeps the field value in a 'printable' form".

It *also* appears to be a bug that "show" is showing a "human readable"
name for the field; it's only supposed to show the field value, in a
human-readable form.

If you want to make the PDML generated by Ethereal parseable by code that
wants to generate a list of TCP options, you should submit a patch to make
*all* the TCP options named fields, and then have your parser look for
<field> items with a "name" attribute of the form
"tcp.options.{something}" and extract the "value" attribute.

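A rough sketch of what such a parser might look like, in Python with the
standard xml.etree.ElementTree module.  The sample PDML is hypothetical: it
assumes the patch described above has already been made, so that every TCP
option is a named field with a "value" attribute, and the field names
(tcp.options.mss and so on) and values are made up for illustration, not
taken from Ethereal's actual output.

```python
import xml.etree.ElementTree as ET

# Hypothetical PDML fragment: assumes every TCP option is a named field
# with a "value" attribute. Names and values are illustrative only.
SAMPLE_PDML = """\
<pdml>
  <packet>
    <proto name="tcp">
      <field name="tcp.srcport" showname="Source port: 1025" show="1025" value="0401"/>
      <field name="tcp.options" showname="Options: (8 bytes)" show="" value="020405b401010402">
        <field name="tcp.options.mss" showname="Maximum segment size" show="1460" value="020405b4"/>
        <field name="tcp.options.nop" showname="NOP" show="" value="01"/>
        <field name="tcp.options.nop" showname="NOP" show="" value="01"/>
        <field name="tcp.options.sack_perm" showname="SACK permitted" show="" value="0402"/>
      </field>
    </proto>
  </packet>
</pdml>
"""

PREFIX = "tcp.options."


def tcp_options(pdml_text):
    """Return (name, value) pairs for every <field> named tcp.options.*."""
    root = ET.fromstring(pdml_text)
    options = []
    # iter() walks nested <field> items too, in document order.
    for field in root.iter("field"):
        name = field.get("name", "")
        if name.startswith(PREFIX):
            # value is None if the "value" attribute is missing.
            options.append((name, field.get("value")))
    return options


if __name__ == "__main__":
    for name, value in tcp_options(SAMPLE_PDML):
        print(name, value)
```

Matching on the "name" attribute rather than on "showname" or "show" keeps
the parser independent of the human-readable text, which is not designed to
be parsed.
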
Unfortunately, if you didn't edit the value attributes out when you
"edited [the PDML output] to make it narrower", it appears that there's
*another* bug - not all those <field> items have "value" attributes.  If
you *did* edit them out, then "all relevant fields to this enhancement
request" *weren't* still there.
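
One way to check the unedited output for that would be to list the <field>
items that lack a "value" attribute.  A minimal sketch in Python; the sample
input here is invented for illustration and is not the capture quoted above.

```python
import xml.etree.ElementTree as ET

# Invented sample: one field with a "value" attribute, one without.
SAMPLE = """\
<pdml>
  <packet>
    <proto name="tcp">
      <field name="tcp.options.mss" showname="Maximum segment size" show="1460" value="020405b4"/>
      <field name="" show="NOP"/>
    </proto>
  </packet>
</pdml>
"""


def fields_missing(pdml_text, attr="value"):
    """Return the attributes of every <field> that lacks the given attribute."""
    root = ET.fromstring(pdml_text)
    return [dict(f.attrib) for f in root.iter("field") if attr not in f.attrib]


if __name__ == "__main__":
    for attrs in fields_missing(SAMPLE):
        print("no value attribute:", attrs)
```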