Wireshark-dev: [Wireshark-dev] Re: Discussion: Untangling the situation with the Darwin process

From: Omer Shapira <omer_shapira@xxxxxxxxx>
Date: Fri, 25 Apr 2025 11:30:47 -0700


On Apr 25, 2025, at 6:07 AM, Anders Broman <a.broman58@xxxxxxxxx> wrote:



On Fri, Apr 25, 2025 at 02:38, Omer Shapira via Wireshark-dev <wireshark-dev@xxxxxxxxxxxxx> wrote:


> On Apr 24, 2025, at 4:29 PM, Guy Harris <gharris@xxxxxxxxx> wrote:
> 
> On Apr 24, 2025, at 2:56 PM, Omer Shapira via Wireshark-dev <wireshark-dev@xxxxxxxxxxxxx> wrote:
> 
>> Unfortunately, when the support for the process metadata was added, the team missed the opportunity to do the right thing and use a CUSTOM block for the metadata, and instead used a LOCAL block id (with the MSB set). This was before my time, and as far as I can tell, this decision was not made out of spite, but due to a lack of context and a general sense of urgency.
> 
> Unless it was done before custom blocks were in the pcapng spec, in which case it was done because there was no alternative.

That’s possible; as I mentioned, I don’t have the full view of the reasons behind it.

I just double checked, this was indeed the case. There were no custom blocks back then.
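
For readers less familiar with the block-type space, here is a minimal sketch of the distinction being discussed. The custom block ids and the local-use range (MSB set) are taken from the pcapng draft; the function name and the category strings are made up for illustration only.

```python
# Block-type values from the pcapng specification draft.
CUSTOM_BLOCK_COPYABLE = 0x00000BAD   # Custom Block that may be copied
CUSTOM_BLOCK_NO_COPY = 0x40000BAD    # Custom Block that should not be copied
LOCAL_USE_MASK = 0x80000000          # MSB set: reserved for local use

# The legacy Darwin process block id discussed in this thread.
DARWIN_LEGACY_PROCESS_BLOCK = 0x80000001


def classify_block_type(block_type: int) -> str:
    """Return a coarse category for a 32-bit pcapng block type.

    The category names are illustrative, not spec terminology.
    """
    if block_type in (CUSTOM_BLOCK_COPYABLE, CUSTOM_BLOCK_NO_COPY):
        return "custom"
    if block_type & LOCAL_USE_MASK:
        return "local-use"
    return "standard-or-reserved"
```

The legacy id 0x80000001 falls in the local-use range, which is why other tools have no way of knowing what it contains.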


>> 1. Darwin tcpdump uses the standards-conforming representation for the metadata when writing new files.
>> 2. It is possible to gradually extend the metadata in a standards-conforming way.
>> 3. Developers can use Wireshark to filter the traffic by the process metadata.
>> 4. Developers can use Wireshark with both the “legacy” files, and the new files.
> 
> 1) means "Apple implements that".

Yup, this is the only way to make Darwin tcpdump do stuff. Someone on my team (possibly me) will have to implement that.


> 2) means "use options to add new metadata".

This is my understanding, thanks for confirming. 


> 3) means either "have dissection code for the metadata blocks, *and* have a way for the dissection of packets associated with a given process to include the process metadata" or "have some way for the Wireshark packet filter language to specify fields from blocks pointed to by the packet block" (which would also allow filtering on Interface Description Block fields).

What I have in mind is the second: allow the engineers to do stuff like
a. $ tshark -r file.pcapng -T fields -e darwin.process_id -e darwin.interface
b. $ tshark -r file.pcapng -Y 'tcp.port == 6040 && darwin.flags.wake_pkt'
c. Same in Wireshark

One idea for presenting the data is to put it in the frame data section. Another idea is to change the packet list to present the pcap-ng blocks at the lowest level, which could be useful for putting non-packet blocks in the list, such as IDBs, statistics blocks, and "events". That may be problematic for other capture file types.


I was thinking about expanding the frame data section, so that this information will be accessible at every layer.
 
> 
> 4) means "have support for both the local-use block and the custom block".
> 
>> Since there are more than a billion devices running Darwin in some shape or form, and since there were more than ten years to collect potentially valuable pcap files on those devices, it is important to preserve the ability to dissect the “legacy” files. It is unrealistic to expect that all the legacy files can be converted to a new representation; there is just too much of that legacy.
>> 
>> My understanding is that to support the “legacy” representation of Darwin metadata, Wireshark will need to start treating `0x80000001` as an ossified quasi-standard block type, and to at least attempt to decode this as Darwin process metadata.
> 
> Yes.
> 
>> Moreover, due to the quantity of the “legacy” pcap files, it might be a pragmatic idea to mention the block 0x80000001 as an “exception” in https://datatracker.ietf.org/doc/draft-ietf-opsawg-pcapng/ , so that future developers would skip this.
> 
> I.e., mark it as "used by Apple" to avoid having other people who (because they lack a Private Enterprise Number, or whatever) choose to use local-use blocks use that *particular* value?  If so, that's an issue for the pcapng spec.

That’s a possible way to proceed, but I am not sure whether this is the *best* way to proceed. Other possibilities:
1. Add a preference to say that 0x80000001 always means Darwin PIB.
2. Add a heuristic to attempt to parse 0x80000001 as Darwin PIB; if successful, mark the file as created by tcpdump.
3. …

In the example trace I have, the section header block contains enough information to understand that this is an "apple" trace.
So a heuristic should be doable.

This was my thinking as well, at least as a starting step.
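
To make the heuristic concrete, here is a rough sketch in Python (not Wireshark code). It walks the blocks of a pcapng buffer, reads the shb_os and shb_userappl options from the Section Header Block, and only treats 0x80000001 as Darwin process metadata when the SHB hints at a Darwin origin. The substrings it matches ("darwin", "tcpdump") are my assumption about what such a hint might look like, and the function names are hypothetical; the real check would need to be based on actual capture files.

```python
import struct

SHB_TYPE = 0x0A0D0D0A            # Section Header Block (same in both byte orders)
BYTE_ORDER_MAGIC = 0x1A2B3C4D
SHB_OS = 3                       # shb_os option code
SHB_USERAPPL = 4                 # shb_userappl option code
DARWIN_LEGACY_BLOCK = 0x80000001


def parse_options(data: bytes, endian: str) -> dict:
    """Parse a pcapng option list into {option code: raw value bytes}."""
    options = {}
    offset = 0
    while offset + 4 <= len(data):
        code, length = struct.unpack_from(endian + "HH", data, offset)
        offset += 4
        if code == 0:  # opt_endofopt
            break
        options[code] = data[offset:offset + length]
        offset += (length + 3) & ~3  # option values are padded to 32 bits
    return options


def looks_like_darwin_capture(buf: bytes) -> bool:
    """Heuristic: the SHB hints at Darwin AND a 0x80000001 block follows it."""
    offset = 0
    endian = "<"
    shb_hint = False
    saw_legacy_block = False
    while offset + 12 <= len(buf):
        if buf[offset:offset + 4] == b"\x0a\x0d\x0d\x0a":
            # A new section starts here; its magic tells us the byte order.
            magic = struct.unpack_from("<I", buf, offset + 8)[0]
            endian = "<" if magic == BYTE_ORDER_MAGIC else ">"
        block_type, total_len = struct.unpack_from(endian + "II", buf, offset)
        if total_len < 12 or offset + total_len > len(buf):
            break  # truncated or malformed block: stop scanning
        body = buf[offset + 8:offset + total_len - 4]
        if block_type == SHB_TYPE:
            # Skip magic (4), version (2 + 2) and section length (8).
            opts = parse_options(body[16:], endian)
            text = (opts.get(SHB_OS, b"") + b" " + opts.get(SHB_USERAPPL, b"")).lower()
            # ASSUMPTION: these substrings stand in for the real hint.
            shb_hint = b"darwin" in text or b"tcpdump" in text
        elif block_type == DARWIN_LEGACY_BLOCK and shb_hint:
            saw_legacy_block = True
        offset += total_len
    return saw_legacy_block
```

A file from some other tool that happens to use 0x80000001 for its own purposes would not match unless its SHB also carried the hint, which is the point of combining the two checks.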


Step one: Wireshark land.
I want to make sure that the legacy 0x80000001 is supported, in the sense that there are new frame fields that contain the Darwin process metadata, if present. I know that Jim Young has done work on that in the past, and I will touch base with him; but even without that, the support should be quite straightforward.

Rudimentary code on the verge of reading the new block here 
  1. !19675

Thanks a lot! I will send a draft MR soon, to discuss the details.

Step two: Darwin tcpdump land.
I will work out a custom block format for tcpdump that would be easy to work with, and make sure that a future version of tcpdump would support that. I can not, per Apple’s policy, say when things will be shipped, but I am pretty sure that it will _not_ be shipped until the upcoming version of macOS, and even then, unlikely to go out in 16.0 or 16.1. But I will have time to dogfood it.

Update: I discussed this internally and got thumbs up from the team. It will be great to have an evolution path for adding more metadata that is valuable for debugging networking issues. One thing we need to be mindful of is the overall size of the pcapng file, due to the constraints of the embedded environments. 
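
To give a feel for the size overhead, here is a minimal sketch of how a Custom Block is laid out according to the pcapng draft: block type, total length, Private Enterprise Number, custom data padded to 32 bits, and the trailing total length. The payload below (a pid followed by a process name) is purely hypothetical and is not the format we will ship. The PEN is the documentation-only value from RFC 5612, not Apple's.

```python
import struct

CUSTOM_BLOCK_COPYABLE = 0x00000BAD
EXAMPLE_PEN = 32473  # documentation-only PEN (RFC 5612), NOT a real vendor


def build_custom_block(pen: int, payload: bytes) -> bytes:
    """Encode a little-endian pcapng Custom Block with no options."""
    padding = b"\x00" * (-len(payload) % 4)  # pad data to a 32-bit boundary
    # type (4) + total length (4) + PEN (4) + data + trailing length (4)
    total_len = 16 + len(payload) + len(padding)
    header = struct.pack("<III", CUSTOM_BLOCK_COPYABLE, total_len, pen)
    trailer = struct.pack("<I", total_len)
    return header + payload + padding + trailer


# HYPOTHETICAL payload: a 32-bit pid followed by a process name.
payload = struct.pack("<I", 1234) + b"mDNSResponder"
block = build_custom_block(EXAMPLE_PEN, payload)
```

Each custom block carries 16 bytes of fixed framing plus up to 3 bytes of padding, so referencing one metadata block from many packets keeps the file much smaller than repeating the metadata per packet.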


Step three: Wireshark land. 
I will add support for the new custom block to Wireshark, so that it handles both the new custom block and the legacy block.


Step four (optional): IETF
I will propose making 0x80000001 an exception in the pcapng spec, using the “SHOULD NOT” language.

I am very much on the fence about this step; as of now I am postponing it.



> _______________________________________________
> Wireshark-dev mailing list -- wireshark-dev@xxxxxxxxxxxxx
> To unsubscribe send an email to wireshark-dev-leave@xxxxxxxxxxxxx
