Wireshark-dev: [Wireshark-dev] Re: wiretap vs text2pcap

From: Michael Mann <mmann78@xxxxxxxxxxxx>
Date: Tue, 6 May 2025 02:12:38 +0000 (UTC)
This is all very helpful (especially for anyone searching the mailing list archives in the future).

Some clarifications/follow-ups:
1. I was not looking to write the file back in the original (text-based) format.  I was perfectly fine with "open as .trc file, save as .pcap (or .pcapng)", and I thought that if I didn't provide write function callbacks in the wiretap module, that's how Wireshark would behave.  To me there isn't a (strong) need to "write" back to the original file format; the reading conversion is just to take advantage of Wireshark's superior dissection/filtering for analysis, and to have it done "inline" in a single step within Wireshark instead of two steps (text2pcap-derived conversion, then Wireshark), especially when you may not need an actual pcap file, just the data loaded in memory for analysis.

2. You are correct that I was conflating file formats and metadata as far as the layers between wiretap and dissection.  If WTAP_ENCAP_SOCKETCAN is used, then its "metadata format" needs to be followed and that's what I find a little clunky after looking at the dissection in Wireshark.  Here, I see wiretap as producing a "SocketCAN record", and I thought producing a "pcap or pcapng record" would be better.  I thought the wiretap functionality I was writing was to "massage" into an existing "wtap_encap" format, and I want to shoot for a pcap or pcapng record over SocketCAN to take advantage of the LINKTYPE_LINUX_SLL link-layer format/dissection.

3. Apparently text2pcap was refactored a few times since I wrote my "derived" applications. The most notable change was in December 2021, when it was converted to use wtap_dump, which replaced all of the pcapio functions my applications were using. To me, the "algorithm" text2pcap used to follow (and that my applications stayed with, because I didn't have a reason to change them) was:
a) write_file_header() that uses pcapng_write_section_header_block() and pcapng_write_interface_description_block()
b) parse text data out of the file for packets and create them with pcapng_write_enhanced_packet_block()
c) write_file_trailer() that uses pcapng_write_interface_statistics_block() to provide statistics of the packet data read
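
A minimal sketch of that three-step flow, written against the pcapng block layout directly rather than against the old pcapio functions (whose signatures are not reproduced here). All `sketch_*` names are hypothetical; blocks are written without options into a caller-supplied buffer, in host byte order, and timestamps assume the default microsecond resolution.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Append integers in host byte order; the byte-order magic in the
 * section header tells readers which order was used. */
static size_t put_u16(uint8_t *buf, size_t off, uint16_t v)
{
    memcpy(buf + off, &v, sizeof v);
    return off + sizeof v;
}

static size_t put_u32(uint8_t *buf, size_t off, uint32_t v)
{
    memcpy(buf + off, &v, sizeof v);
    return off + sizeof v;
}

/* (a) Section Header Block followed by one Interface Description Block. */
size_t sketch_write_file_header(uint8_t *buf, uint16_t linktype, uint32_t snaplen)
{
    size_t off = 0;

    off = put_u32(buf, off, 0x0A0D0D0A);  /* SHB block type */
    off = put_u32(buf, off, 28);          /* block total length */
    off = put_u32(buf, off, 0x1A2B3C4D);  /* byte-order magic */
    off = put_u16(buf, off, 1);           /* major version */
    off = put_u16(buf, off, 0);           /* minor version */
    off = put_u32(buf, off, 0xFFFFFFFF);  /* section length, 64 bits: */
    off = put_u32(buf, off, 0xFFFFFFFF);  /* all ones = unspecified */
    off = put_u32(buf, off, 28);          /* block total length again */

    off = put_u32(buf, off, 1);           /* IDB block type */
    off = put_u32(buf, off, 20);
    off = put_u16(buf, off, linktype);
    off = put_u16(buf, off, 0);           /* reserved */
    off = put_u32(buf, off, snaplen);
    off = put_u32(buf, off, 20);
    return off;
}

/* (b) One Enhanced Packet Block on interface 0; data is padded to 32 bits. */
size_t sketch_write_packet(uint8_t *buf, size_t off, uint64_t ts_usec,
                           const uint8_t *data, uint32_t len)
{
    uint32_t padded = (len + 3u) & ~3u;
    uint32_t total = 32 + padded;

    off = put_u32(buf, off, 6);                        /* EPB block type */
    off = put_u32(buf, off, total);
    off = put_u32(buf, off, 0);                        /* interface ID */
    off = put_u32(buf, off, (uint32_t)(ts_usec >> 32));
    off = put_u32(buf, off, (uint32_t)ts_usec);
    off = put_u32(buf, off, len);                      /* captured length */
    off = put_u32(buf, off, len);                      /* original length */
    memcpy(buf + off, data, len);
    memset(buf + off + len, 0, padded - len);
    off += padded;
    off = put_u32(buf, off, total);
    return off;
}

/* (c) Interface Statistics Block for interface 0 (counters would be options). */
size_t sketch_write_file_trailer(uint8_t *buf, size_t off, uint64_t ts_usec)
{
    off = put_u32(buf, off, 5);                        /* ISB block type */
    off = put_u32(buf, off, 24);
    off = put_u32(buf, off, 0);                        /* interface ID */
    off = put_u32(buf, off, (uint32_t)(ts_usec >> 32));
    off = put_u32(buf, off, (uint32_t)ts_usec);
    off = put_u32(buf, off, 24);
    return off;
}
```

Only step (b) depends on the input text; a format-specific reader would call it once per parsed line, with the header and trailer calls unchanged.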

What I was hoping for was a wiretap interface with a similar algorithm, where just "b" would need to be adjusted for slight differences in text formats (within the wth->subtype_read callback), and which may not need a lexical scanner like text2pcap has.  Looking at the current ui/text_import*.[chl], it seems to focus on actual files and not "records" like I would want for the PEAK wiretap interface.



On Monday, May 5, 2025 at 01:43:19 PM EDT, Guy Harris <gharris@xxxxxxxxx> wrote:


On May 5, 2025, at 8:24 AM, Michael Mann via Wireshark-dev <wireshark-dev@xxxxxxxxxxxxx> wrote:

> There have been several times where I've been given a simple, text-based capture file for Serial or CAN communications.  My (quick and dirty) solution has been to write a text2pcap derived application to convert the file to pcapng format and then view it in Wireshark.  The packet dissection support is usually already there, but I have also supplemented with plugins when needed.
>
> However, https://gitlab.com/wireshark/wireshark/-/merge_requests/18894 has shown me the "right" way to handle it - and that's using wiretap. The packet data comes from a CAN bus, so the original thought was to use the SocketCAN file format,

What is "the SocketCAN file format"?

> but WTAP_ENCAP_SOCKETCAN is a little clunky and I think I'd prefer to use the pcapng format

WTAP_ENCAP_SOCKETCAN is a Wiretap link-layer encapsulation, and is used for several file formats, including...

...pcap and pcapng format with the LINKTYPE_CAN_SOCKETCAN link-layer type value:

    https://www.tcpdump.org/linktypes/LINKTYPE_CAN_SOCKETCAN.html

and, in fact, WTAP_ENCAP_SOCKETCAN was *originally created* for LINKTYPE_CAN_SOCKETCAN. The most common file types with WTAP_ENCAP_SOCKETCAN are probably... pcap and pcapng.

> (similar to my text2pcap applications) to pipe it through "better" dissection tables (sll.ltype).

OK, *that's* the LINKTYPE_LINUX_SLL link-layer type value for pcap and pcapng:

    https://www.tcpdump.org/linktypes/LINKTYPE_LINUX_SLL.html

whose primary disadvantage relative to LINKTYPE_CAN_SOCKETCAN is that the packet size is significantly larger, which is why it's *not* the default for capturing on CAN bus devices on Linux:

    https://github.com/the-tcpdump-group/libpcap/issues/1052
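
To make the size difference concrete, here is a rough sketch of the per-frame arithmetic. It assumes the header layouts described on the two tcpdump.org link-type pages (an 8-byte SocketCAN header, and a 16-byte "cooked" header that is followed by the same SocketCAN header and payload) and that a record carries only the payload bytes actually used. The struct and function names are illustrative, not taken from libpcap or Wireshark.

```c
#include <stddef.h>
#include <stdint.h>

/* SocketCAN header as assumed for LINKTYPE_CAN_SOCKETCAN (classic CAN). */
struct sketch_socketcan_hdr {
    uint32_t can_id;      /* CAN ID and flags */
    uint8_t  len;         /* payload length in bytes */
    uint8_t  pad;
    uint8_t  reserved;
    uint8_t  len8_dlc;
};

/* "Cooked" header as assumed for LINKTYPE_LINUX_SLL. */
struct sketch_sll_hdr {
    uint16_t pkt_type;
    uint16_t arphrd_type;
    uint16_t addr_len;
    uint8_t  addr[8];
    uint16_t protocol;
};

/* Bytes per record when the frame is stored as LINKTYPE_CAN_SOCKETCAN. */
size_t sketch_socketcan_len(size_t payload)
{
    return sizeof(struct sketch_socketcan_hdr) + payload;
}

/* Bytes per record when the same frame is wrapped in a cooked header. */
size_t sketch_sll_can_len(size_t payload)
{
    return sizeof(struct sketch_sll_hdr) + sketch_socketcan_len(payload);
}
```

Under these assumptions a classic CAN frame with a full 8-byte payload takes 16 bytes as LINKTYPE_CAN_SOCKETCAN and 32 bytes as LINKTYPE_LINUX_SLL.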

> The file format also contains "non-packet data" that I would like to eventually convert into other pcapng block types.

...and convert those block types *back* to the PEAK CAN file format when *writing* the file.

libwiretap provides a layer of abstraction above the file formats it supports; it can handle reading and writing file formats that don't map to pcap or pcapng in a simple fashion.

> I looked around a little, but I didn't see any obvious examples in wiretap of how to easily provide a pcapng record.

It's not intended to provide pcap or pcapng or snoop or Microsoft Network Monitor or BLF or ERF or AIX i-trace or... records.

It *does* have a mechanism for producing "file-format-specific" record types - REC_TYPE_FT_SPECIFIC_EVENT and REC_TYPE_FT_SPECIFIC_REPORT.

It *also* has a mechanism for producing pcapng custom blocks - REC_TYPE_CUSTOM_BLOCK.  *Those* have the advantage that, if the user wants to save a *non*-pcap/pcapng format into a pcapng file, it can be done more easily.  What it *does* require is assigning *some* Private Enterprise Number for a given type:

    https://www.iana.org/assignments/enterprise-numbers/

If the organization that owns the file format has a PEN, *and* they're willing to "own" pcapng custom blocks with that PEN, that would be one way.  The custom block should probably begin with a "custom block type for this PEN" field, so that more than one type of custom block can be supported. The organization would manage the space of custom block types.

On the other hand, we could use *the Wireshark Foundation's* PEN:

    https://www.iana.org/assignments/enterprise-numbers/?q=Wireshark

and manage the custom block types ourselves.
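
As an illustration of the suggested layout, the sketch below builds a pcapng Custom Block whose custom data begins with a 32-bit "custom block type for this PEN" field. That leading field is the convention proposed above, not something the pcapng specification defines, and the function name is hypothetical. The PEN is a parameter; no real assignment is hard-coded.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static size_t put_u32(uint8_t *buf, size_t off, uint32_t v)
{
    memcpy(buf + off, &v, sizeof v);   /* host byte order */
    return off + sizeof v;
}

/* Write one copyable Custom Block (type 0x00000BAD) without options.
 * Returns the number of bytes written to buf. */
size_t sketch_write_custom_block(uint8_t *buf, uint32_t pen,
                                 uint32_t pen_block_type,
                                 const uint8_t *payload, uint32_t len)
{
    uint32_t padded = (len + 3u) & ~3u;   /* custom data is padded to 32 bits */
    uint32_t total = 20 + padded;         /* 5 fixed 32-bit words + payload */
    size_t off = 0;

    off = put_u32(buf, off, 0x00000BAD);      /* block type */
    off = put_u32(buf, off, total);           /* block total length */
    off = put_u32(buf, off, pen);             /* Private Enterprise Number */
    off = put_u32(buf, off, pen_block_type);  /* assumed per-PEN subtype field */
    memcpy(buf + off, payload, len);
    memset(buf + off + len, 0, padded - len);
    off += padded;
    off = put_u32(buf, off, total);           /* block total length again */
    return off;
}
```

A reader that does not recognize the PEN can skip the block using the total length, and one that does can dispatch on the subtype field.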

> pcapng.c does its own processing and is a bit complex compared to the APIs used in text2pcap.

The "pcap" in "text2pcap" refers to the pcap file format:

    https://ietf-opsawg-wg.github.io/draft-ietf-opsawg-pcap/draft-ietf-opsawg-pcap.html

which is a bit less complex than the pcapng file format:

    https://ietf-opsawg-wg.github.io/draft-ietf-opsawg-pcap/draft-ietf-opsawg-pcapng.html

so, yes, APIs to write out pcapng files will be a bit complex compared to APIs used to write pcap files.

> Looking at other wiretap examples, they seem to have a "file dissection layer" in epan/dissectors that corresponds to the wiretap handling (linking to "wtap_encap" table) before data is passed to a different dissection table for "packet dissection".

That's for non-pcap/pcapng files that have their own stuff that counts as metadata; wiretap provides the metadata as a header for the packet data.

This should perhaps be done by having the wiretap read APIs provide, and the wiretap write APIs accept, separate blobs of "file type metadata" and "packet data", so that, for example, an HP-UX nettl file would provide the nettl metadata separately from the packet data, and the WTAP_ENCAP_ type would be the type for the packet data.

The "file type metadata" would be indicated by the file type.  File types are *not* identified by fixed assigned numbers (except for pcap and pcapng), as code using wiretap is, in general, *not* supposed to do specific things for specific file types - wiretap should provide general indications such as "does this file support multiple link-layer types per file?", so that the code using wiretap doesn't need to be updated to handle new file types.  The way to handle *that* is to have either built-in modules or plugins for that file format, providing both wiretap code (which would be handed the file type value when it registers its file type) and file type metadata dissectors (which would register for that in a "file type metadata" dissector).

That would also be used to register to handle a particular type of REC_TYPE_FT_SPECIFIC_EVENT or REC_TYPE_FT_SPECIFIC_REPORT.

> Is pcapng.c the only source of what I'll have to look at as an "example"?  Can anyone provide more pointers on my desire to have "text2pcap functionality in wiretap",

"text2pcap functionality" is "converting a hex dump text file to a capture file". In some sense, that functionality is already *in* several wiretap modules for file formats that are text files.
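
As a rough sketch of what the per-line part of such a text reader amounts to, the function below turns one line of a *hypothetical* layout, "<time_ms> <can_id_hex> <length> <hex bytes...>", into a record. The layout, the struct, and the function name are assumptions made for illustration; this is not the PEAK .trc format, and it does not use any wiretap API.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

struct sketch_can_record {
    double   time_ms;
    uint32_t can_id;
    uint8_t  len;
    uint8_t  data[8];
};

/* Parse one text line into rec.
 * Returns 1 on success, 0 if the line does not match the assumed layout. */
int sketch_parse_line(const char *line, struct sketch_can_record *rec)
{
    unsigned int id, len;
    int consumed = 0;
    const char *p;
    unsigned int i;

    /* %n records how many characters the fixed fields consumed. */
    if (sscanf(line, "%lf %x %u%n", &rec->time_ms, &id, &len, &consumed) != 3)
        return 0;
    if (len > 8)
        return 0;               /* classic CAN carries at most 8 bytes */

    p = line + consumed;
    for (i = 0; i < len; i++) {
        char *end;
        unsigned long byte = strtoul(p, &end, 16);

        if (end == p || byte > 0xFF)
            return 0;           /* missing or out-of-range byte */
        rec->data[i] = (uint8_t)byte;
        p = end;
    }

    rec->can_id = id;
    rec->len = (uint8_t)len;
    return 1;
}
```

For a format this regular, sscanf() and strtoul() are enough and no lexical scanner is needed; a real module would do the equivalent work inside its read callback and fill in the record it hands back to wiretap.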

This is independent of "handling file type specific information", which could be done for text *or* binary formats.


> to make it easier to provide wiretaps for future (simple) text-based packet data.


The "text2pcap functionality" is currently in ui/text_import*.[chl], because it's used in *two* places - text2pcap and Wireshark's "import a text file" facility.

It could, for example, be moved into wsutil, or into its own text_import library, to be more conveniently used by text2pcap and Wireshark *and* wiretap.