Wireshark-dev: Re: [Wireshark-dev] Some questions about the "option block" interface in libwire

From: Guy Harris <guy@xxxxxxxxxxxx>
Date: Fri, 20 May 2016 16:28:34 -0700
On May 18, 2016, at 5:05 PM, Anthony Coddington <anthony.coddington@xxxxxxxxxx> wrote:

> Having worked with the wtap_optionblock generic option block system beyond refactoring existing code, I have some thoughts on the current design that may be relevant. I have numbered paragraphs from 10 so we don't get confused.

So perhaps we need to have namespaces for paragraph numbers, such as

guy@xxxxxxxxxxxx:3) What mechanisms are available for handling block/record types, or options, not currently supported by pcapng, but that might be provided by other file types? ...

	...

anthony.coddington@xxxxxxxxxx:10) Generic versus filetype-specific block types: I think the current implementation is too specific to PCAP-NG blocks. ...

:-)

> Michael Mann <mmann78@xxxxxxxxxxxx> wrote:
>> On May 15, 2016, at 6:40 PM, Guy Harris <guy@xxxxxxxxxxxx> wrote:
>>> 3) What mechanisms are available for handling block/record types, or options, not currently supported by pcapng, but that might be provided by other file types? Hadriel Kaplan suggested getting a Private Enterprise Number (PEN) for wireshark.org, and using custom blocks and options for this purpose; have we gotten a PEN for wireshark.org yet?
>> The current interface may need another layer of refactoring but the idea was that you would have a block and just call wtap_optionblock_add_option to add your options.  For "defined but not yet used" options of the standard blocks that turn into "now used" options, the wtap_optionblock_add_option would be done within the creation of that standard block.  The creation of "standard blocks" and their option types is where we may need another layer of registration and refactoring. Copying/reading/writing on any/all blocks is supposed to happen within the "option block" functionality (where default behavior would be "inclusive" in copying blocks).  I wasn't sure where "merging" would go and if that would be part of the "option block" functionality (at least for "provided" blocks/options) or left to external tools (like mergecap and editcap).
> 
> 10) Generic versus filetype-specific block types: I think the current implementation is too specific to PCAP-NG blocks. This makes sense for highly specialized blocks but common metadata that can be (and are) supported across formats used by Wireshark like interface information (IDB/ISB), capture information (SHB) and potentially things like name resolution blocks, should be as agnostic as possible. The same block type could then be extended with new option types for format-specific information. If that is via a Wireshark PEN and custom option that is fine, but it shouldn't be too cumbersome to use compared to native PCAP-NG options for simple types. In particular I don't think the mandatory struct makes much sense in most cases. Other formats should not be required to provide any options not directly required by Wireshark, regardless of whether they are part of a fixed header in PCAP-NG. Examples of this could include IDB snap_len and SHB section_length.

Your examples are fixed fields in pcapng, so presumably we're talking about making them options in the "generic" layer libwiretap.

If so, we'd probably give them default values appropriate for pcapng files, such as WTAP_MAX_PACKET_SIZE for the snapshot length and 0xFFFFFFFFFFFFFFFFF for the section length.

For the snapshot length, the pcap file writer would use the maximum value from all the interfaces as the value to write in the file header, and the pcapng file writer would use the value of the option for the fixed field.

However, that brings up a mismatch between pcap and pcapng - what if you have a pcapng file with:

	SHB
	IDB with a snaplen of 128
	packet
	IDB with a snaplen of 4096
	packet

A program such as Wireshark, which reads the entire file in before it writes any files out, and thus has seen all the IDBs and can figure out whether:

	1) all the interfaces have the same link-layer header type (if not, the file can't be saved as pcap);

	2) what the maximum snapshot length for all the interfaces is;

and use that.

A single-pass program such as editcap, however, can't do that.

If it later discovers that not all the interfaces have the same link-layer type, it needs to report an error, as there's nothing it can do.

If it discovers that the snapshot lengths are different, then:

	if it's writing to an uncompressed file, it can seek back, update the snapshot length with the new value, and then go back to EOF and continue writing;

	if it's writing to a compressed file or to something that's not a file (such as a pipe or socket), it'd have to report an error.

A bit ugly, and the rules would be a bit complicated to explain to the user, but it may be the best we can do.

(The reason why getting the snapshot length right is important is that a code reading a file might use the snapshot length to determine how big a buffer to allocate for packet data.  The code probably *should* be prepared to handle packets bigger than the snapshot length, but not all code out there *is* prepared for it.

The snapshot length is also useful as a record of whether the capture was "sliced" and, if so, where it was sliced; this would lose per-interface information about that, but writing a pcapng file out as a pcap file loses other per-interface information, such as the interface name, as well as an indication of which packets arrived on which interfaces.  If you don't want to lose that information, don't write the data out as a pcap file; if you have to hand it to a program that can only read pcap files, keep the original file around if you need to keep that information.)

> Ideally Wireshark-wide timestamp formats would also be used or at least included.
> 
> 11) Self-writing/serializing block types: For the same reason I really don't think self-writing of PCAP-NG blocks belongs in wtap_opttypes.c (as relatively recently added in https://code.wireshark.org/review/#/c/14357/). I think the optionblock should act more like a well-known list of generic and filetype specific options that fit into that metadata category. Perhaps formats could register a write function each for the block type, but I personally don't see much benefit in this. The format might as well treat it similarly to foreign encap types in its dump function, where the wiretap determines how the information should be converted or written out for that format.

Yes, wtap_opttypes.c combines "generic" code and code to write pcapng blocks and option, so the code to handle pcapng is split between that file and pcapng.c.  Code to *read* them is still all in pcapng.c, however.

I think it might be best not to have the "generic" code know anything about the pcapng file format.

> 12) Multiple options: As Guy Harris has mentioned, it should be possible to add multiple options of a given type such as (but not limited to!) multiple comments. Personally I think a list-based approach works well here (possibly also indexed by a hash table or something like a GSequence) as it also preserves ordering which may be important for some formats. Perhaps an append() vs add() (or set()) distinction could be made, for replacing an existing value rather than adding another. On the display side, multiple comments for a packet may even come from multiple blocks.

Yes.  A set of options should be able to have zero or more instances of a given option - yes, zero.  Perhaps for those options for which there's a default value, such as the if_tsresol option in the IDB, a set of options should start out with a value explicitly flagged as a default (so that the pcapng option writing code knows that it doesn't need to write the option out), but for other options, the set should *NOT* be populated with an instance of the option unless the file specifies it explicitly.

For some options, multiple instances don't make sense; the if_speed and if_tsresol options for the IDB are examples of these.  The pcapng spec should indicate which options are in that category (I think Jasper Bongertz has mentioned that either here or in the pcapng mailing list) and should:

	say that writers MUST not put multiple instances of those options into a block;

	either say what readers MUST/SHOULD/MAY do if they see multiple instances of those options, or indicate that the behavior of a reader is undefined in this case.

My inclination would be to, if an option of that sort is seen and there's already an instance in the option set:

	if the instance is marked as a default value, replace the default value;

	otherwise, ignore the later instance.

That would require that the pcapng spec not explicitly require that the later instances of an option take precedence over earlier instances in the same block.

I'm working on changes to separate the notion of "the set of option *types* that can be used in a given type of block" and "the set of option *values* for a given instance of a block type".  A set of option values can have multiple instances of an option, if the item in the set of option types for that option type says it can.  An option value can be flagged as a default value; an option type can be flagged as having a default value, and when an option set is created, it is initialized to contain values for those option types that have default values.

> 13) Checking for and merging options: There seems to be some conflation between adding options to blocks and registering "standard" option types to block types.

Yes.

Those are two completely separate operations, and must be implemented as such.

> There doesn't appear to be a way to determine whether an option was already present in the record or simply has a default value, as all options are 'added' up front.

Yes, that's why I said values in an option set should explicitly be flagged as default values if that's why they're there.

> 18) How this all fits in with REC_TYPE_FT_SPECIFIC_ records needs consideration, currently it is necessary to dissect everything twice to also display in the epan tree. One option would be to allow epan to do things like add comments and update other metadata while doing a safely bounded parse. I'm not sure how this interacts with simpler tools like mergecap though? Do these use libwiretap on its own?

Yes - capinfos, captype, editcap, and mergecap use libwiretap but not libwireshark.

> On the subject of buffers and unions it is also not possible to use a ft_specfic_record_phdr together with another phdr type, and PCAP-NG assumes it is always a PCAP-NG block type code.

*If* the file is a pcapng file, then it will be a pcapng block type code.  If it's some other file type, it'll be some other type code.  The code that handles REC_TYPE_FT_SPECIFIC_EVENT and REC_TYPE_FT_SPECIFIC_REPORT records first needs to look at the file type and, based on that, choose a dissector to which to handle the record.