Wireshark-dev: [Wireshark-dev] Re: Discussion: Untangling the situation with the Darwin process

From: Anders Broman <a.broman58@xxxxxxxxx>
Date: Fri, 25 Apr 2025 15:07:58 +0200


On Fri, 25 Apr 2025 at 02:38, Omer Shapira via Wireshark-dev <wireshark-dev@xxxxxxxxxxxxx> wrote:


> On Apr 24, 2025, at 4:29 PM, Guy Harris <gharris@xxxxxxxxx> wrote:
>
> On Apr 24, 2025, at 2:56 PM, Omer Shapira via Wireshark-dev <wireshark-dev@xxxxxxxxxxxxx> wrote:
>
>> Unfortunately, when the support for the process metadata was added, the team missed the opportunity to do the right thing and use a CUSTOM block for the metadata, and instead used a LOCAL block ID (with the MSB set). This was before my time, and as far as I can tell, this decision was not made out of spite, but due to a lack of context and a general sense of urgency.
>
> Unless it was done before custom blocks were in the pcapng spec, in which case it was done because there was no alternative.

That’s possible; as I mentioned, I don’t have the full view of the reasons behind it.
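
For readers less familiar with the numbering, here is a minimal Python sketch of the distinction being discussed. It assumes the type codes from the pcapng draft (local-use block types have the most significant bit set; custom blocks use 0x00000BAD or 0x40000BAD). The function name and the returned labels are made up for illustration and are not Wireshark code.

```python
LOCAL_USE_BIT = 0x80000000                   # MSB set: reserved for local use
CUSTOM_BLOCK_TYPES = {0x00000BAD, 0x40000BAD}  # copyable / non-copyable custom block
DARWIN_LEGACY_BLOCK = 0x80000001             # legacy value discussed in this thread


def classify_block_type(block_type: int) -> str:
    """Return a coarse, illustrative label for a pcapng block type code."""
    if block_type == DARWIN_LEGACY_BLOCK:
        return "local-use (candidate legacy Darwin process metadata)"
    if block_type & LOCAL_USE_BIT:
        return "local-use"
    if block_type in CUSTOM_BLOCK_TYPES:
        return "custom"
    return "standard or reserved"
```

Note that the first branch only flags a candidate: nothing in the type code itself proves that the block was written by Darwin tcpdump.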

>
>> There is, however, an issue with the ossification of the current process metadata encoding. In the 12-13 years since the introduction of the process metadata, hundreds of thousands of pcap files have accumulated that contain the metadata in the LOCAL block, and this is just in a single company. It is not realistic to expect the engineers to re-encode the “legacy” pcap files.
>>
>
>> Because of that, I believe that while it is possible to “seal” the structure of the current process metadata block, there is no alternative but to continue supporting this, indefinitely.
>
> Yes.  Anything pcap or pcapng-related ends up living forever *even if replaced*, due to old capture files.
>
>> From the perspective of today, I would like to get to a situation where:
>>
>> 1. Darwin tcpdump uses the standards-conforming representation for the metadata when writing new files.
>> 2. It is possible to gradually extend the metadata in a standards-conforming way.
>> 3. Developers can use Wireshark to filter the traffic by the process metadata.
>> 4. Developers can use Wireshark with both the “legacy” files, and the new files.
>
> 1) means "Apple implements that".

Yup, this is the only way to make Darwin tcpdump do stuff. Someone on my team (possibly me) will have to implement that.


> 2) means "use options to add new metadata".

This is my understanding, thanks for confirming. 


> 3) means either "have dissection code for the metadata blocks, *and* have a way for the dissection of packets associated with a given process to include the process metadata" or "have some way for the Wireshark packet filter language to specify fields from blocks pointed to by the packet block" (which would also allow filtering on Interface Description Block fields).

What I have in mind is the second: allow the engineers to do stuff like
a. $ tshark -r file.pcapng -T fields -e darwin.process_id -e darwin.interface
b. $ tshark -r file.pcapng -Y 'tcp.port == 6040 && darwin.flags.wake_pkt'
c. Same in Wireshark

One idea for presenting the data is to put it in the frame data section. Another idea is to change the packet list to present the pcapng blocks at the lowest level, which could be useful for putting non-packet blocks in the list, such as IDBs, statistics blocks, and "events". That may be problematic for other capture file types.
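
To make "the pcapng blocks at the lowest level" concrete, here is a rough Python sketch of walking every block in a file, packet or not. It assumes only the general block framing from the pcapng draft (type, total length, body, repeated total length) and the byte-order magic in the Section Header Block. The function name is hypothetical, and this is not how Wireshark's wiretap code is structured.

```python
import struct

SHB_TYPE = 0x0A0D0D0A  # palindromic, so it reads the same in either byte order


def iter_blocks(data: bytes):
    """Yield (block_type, body) for each block in an in-memory pcapng file."""
    offset = 0
    endian = "<"
    while offset + 12 <= len(data):
        (block_type,) = struct.unpack_from(endian + "I", data, offset)
        if block_type == SHB_TYPE:
            # A new section may switch byte order; the magic tells us which.
            (magic,) = struct.unpack_from("<I", data, offset + 8)
            if magic == 0x1A2B3C4D:
                endian = "<"
            elif magic == 0x4D3C2B1A:
                endian = ">"
            else:
                raise ValueError("bad byte-order magic")
        (total_len,) = struct.unpack_from(endian + "I", data, offset + 4)
        if total_len < 12 or total_len % 4 or offset + total_len > len(data):
            raise ValueError(f"bad block length at offset {offset}")
        # Body excludes the 8-byte header and the trailing length field.
        yield block_type, data[offset + 8 : offset + total_len - 4]
        offset += total_len
```

A list view built on something like this would show one row per block, with unknown or local-use types shown as opaque entries.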

 

>
> 4) means "have support for both the local-use block and the custom block".
>
>> Since there are more than a billion devices running Darwin in some shape or form, and since there were more than ten years to collect potentially valuable pcap files on those devices, it is important to preserve the ability to dissect the “legacy” files. It is unrealistic to expect that all the legacy files can be converted to a new representation; there is just too much of that legacy.
>>
>> My understanding is that to support the “legacy” representation of Darwin metadata, Wireshark will need to start treating `0x80000001` as an ossified quasi-standard block type, and to at least attempt to decode this as Darwin process metadata.
>
> Yes.
>
>> Moreover, due to the quantity of the “legacy” pcap files, it might be a pragmatic idea to mention the block 0x80000001 as an “exception” in https://datatracker.ietf.org/doc/draft-ietf-opsawg-pcapng/ , so that future developers would skip this.
>
> I.e., mark it as "used by Apple" to avoid having other people who (because they lack a Private Enterprise Number, or whatever) choose to use local-use blocks use that *particular* value?  If so, that's an issue for the pcapng spec.

That’s a possible way to proceed, but I am not sure whether this is the *best* way to proceed. Other possibilities:
1. Add a preference to say that 0x80000001 always means Darwin PIB.
2. Add a heuristic that attempts to parse 0x80000001 as a Darwin PIB and, if successful, marks the file as created by tcpdump.
3. …

In the example trace I have, the section header block contains enough information to identify it as an "Apple" trace, so a heuristic should be doable.
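
A rough Python sketch of what such a heuristic could look like. The SHB option codes (shb_hardware = 2, shb_os = 3, shb_userappl = 4) and the option framing are from the pcapng draft. The strings being matched ("darwin", "macos", "tcpdump") are my assumption about what an Apple trace might contain, not something verified against real files, and both function names are hypothetical.

```python
import struct

SHB_OPTION_NAMES = {2: "shb_hardware", 3: "shb_os", 4: "shb_userappl"}


def parse_shb_options(shb_body: bytes, endian: str = "<") -> dict:
    """Parse string options from an SHB body that starts at the byte-order magic."""
    options = {}
    offset = 16  # magic (4) + major (2) + minor (2) + section length (8)
    while offset + 4 <= len(shb_body):
        code, length = struct.unpack_from(endian + "HH", shb_body, offset)
        offset += 4
        if code == 0:  # opt_endofopt
            break
        value = shb_body[offset : offset + length]
        offset += (length + 3) & ~3  # option values are padded to 32 bits
        name = SHB_OPTION_NAMES.get(code)
        if name:
            options[name] = value.decode("utf-8", "replace")
    return options


def looks_like_darwin_capture(options: dict) -> bool:
    """Assumed heuristic: an OS string hinting at Darwin plus a tcpdump writer."""
    os_name = options.get("shb_os", "").lower()
    app = options.get("shb_userappl", "").lower()
    return ("darwin" in os_name or "macos" in os_name) and "tcpdump" in app
```

If this returns true, the reader could then attempt to decode 0x80000001 blocks as Darwin process metadata, and fall back to treating them as opaque if the decode fails.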

> If all of the support for that local-use block were implemented by plugins, anybody who ignores
>
>       https://datatracker.ietf.org/doc/html/draft-ietf-opsawg-pcapng#name-supported-use-cases
>
> "There are two different supported use-cases for vendor-specific custom extensions: local and portable. Local use means the custom data is only expected to be usable on the same machine, and the same application, which encoded it into the file. This limitation is due to the lack of a common registry for the local use number codes (the block or option type code numbers with the Most Significant Bit set). Since two different vendors may choose the same number, one vendor's application reading the other vendor's file would result in decoding failure. Therefore, vendors SHOULD instead use the portable method, as described next."
>
> and uses 0x80000001 as a block number could remove or otherwise disable the plugins and provide their own.  I'm not sure we support that yet, but, if we don't, that's a bug and should be fixed.
>
>> I would like to hear the opinions of the core developers on the above. Again, I am not super proud that mistakes have happened, and I am trying to fix it going forward, but please keep in mind the backwards compatibility.
>
> I think that 1) going with a custom block, for now, is the right way to go, 2) it should be possible for Wireshark to continue to support Apple's legacy local-use process metadata block, and 3) it would be nice if overriding the latter support were possible.
>
> "Would be nice" is less strong than "should", so 3) must not get in the way of 2).
>
> (I say "for now" for 1) because various proposals have been made for process information; at some point it'd be good to have it as a standard block type that can support any operating system with networking support, from TinyIoTRealTimeMultitaskingOperatingSystem to UN*Xes, Windows, and z/OS. Again, that doesn't mean dropping support for the custom block, as that's another case of "it lives forever".)

Sounds like we are on the same page.


I would like to hear from more people, but my tentative plan is to proceed in three or maybe four steps:


Step one: Wireshark land.
I want to make sure that the legacy 0x80000001 is supported, in the sense that there are new frame fields that contain the Darwin process metadata, if present. I know that Jim Young has done work on that in the past; I will touch base with him, but even if not, the support should be quite straightforward.

Rudimentary code, on the verge of reading the new block, is here:
  1. !19675


Step two: Darwin tcpdump land.
I will work out a custom block format for tcpdump that would be easy to work with, and make sure that a future version of tcpdump would support that. I can not, per Apple’s policy, say when things will be shipped, but I am pretty sure that it will _not_ be shipped until the upcoming version of macOS, and even then, unlikely to go out in 16.0 or 16.1. But I will have time to dogfood it.
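
As a purely illustrative Python sketch of the framing such a block would need: the pcapng draft lays a custom block out as block type (0x00000BAD if copyable, 0x40000BAD if not), total length, a Private Enterprise Number, the custom data padded to 32 bits, optional options, and the total length again. The PEN below is 32473, the number RFC 5612 reserves for documentation, and the payload is an opaque placeholder. Nothing here reflects the actual format Darwin tcpdump will use.

```python
import struct

EXAMPLE_PEN = 32473  # documentation-only enterprise number (RFC 5612), not Apple's


def build_custom_block(pen: int, payload: bytes, copyable: bool = True,
                       endian: str = "<") -> bytes:
    """Frame an opaque payload as a pcapng custom block without options."""
    block_type = 0x00000BAD if copyable else 0x40000BAD
    padded = payload + b"\x00" * (-len(payload) % 4)  # pad to a 32-bit boundary
    total_len = 4 + 4 + 4 + len(padded) + 4  # type, length, PEN, data, length
    return (struct.pack(endian + "III", block_type, total_len, pen)
            + padded
            + struct.pack(endian + "I", total_len))
```

Because the padding hides the original payload length, a real format would probably want the payload to be self-describing, for example with its own length or version field at the start.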


Step three: Wireshark land.
I will add support for the new custom block to Wireshark, so that it will support the new custom block as well as the legacy.


Step four (optional): IETF
I will propose making 0x80000001 an exception in the pcapng spec, using the “SHOULD NOT” language.


> _______________________________________________
> Wireshark-dev mailing list -- wireshark-dev@xxxxxxxxxxxxx
> To unsubscribe send an email to wireshark-dev-leave@xxxxxxxxxxxxx
