Wireshark-dev: Re: [Wireshark-dev] Reassembly of IP fragments gets confused by multiple packets

From: Anders Broman <anders.broman@xxxxxxxxxxxx>
Date: Mon, 8 Feb 2016 16:51:21 +0000

-----Original Message-----
From: wireshark-dev-bounces@xxxxxxxxxxxxx [mailto:wireshark-dev-bounces@xxxxxxxxxxxxx] On Behalf Of Guy Harris
Sent: 6 February 2016 20:10
To: Developer support list for Wireshark
Subject: Re: [Wireshark-dev] Reassembly of IP fragments gets confused by multiple packets on different VLANS

On Jan 20, 2016, at 8:43 AM, Anders Broman <anders.broman@xxxxxxxxxxxx> wrote:

> Trying to summarize…
>  
> captured on the "all" interface of a Linux machine acting as a router, or merged two captures from networks on different sides of a router.
>  
> various sorts of tunneling

Including, for example, MPLS pseudo-wires.

> (or "other sorts of tunneling", if you view VLANs as a form of 
> tunneling)

That might be the best way to think about VLANs - treat them the same way various tunnels, pseudo-wires, etc. are treated.

On the other hand, as a comment:

	https://bugs.wireshark.org/bugzilla/show_bug.cgi?id=4561#c5

in the bug you're quoting says:

	For VLAN I know cases where UL traffic and DL traffic belong to different VLAN, still to the same flow. Thus a differentiation according to VLAN id must be configurable.

so VLANs shouldn't always be treated as tunnels.

I don't know whether that's an issue for any other form of pseudo-wire or tunnel - or even for physical networks, as identified by interface IDs; I could imagine a case where a machine has multiple physical interfaces on the same network and different fragments of an IP datagram, or different packets in a given TCP connection, or... might travel on different interfaces.

So is this really just a question of

	1) "wires", whether physical or virtual (a VLAN being a "virtual wire") or pseudo-wire or...

and

	2) whether the machine on which the capture is being done routes traffic between "wires" or not?

If the machine has multiple "wires" (or radio channels) and *is* routing between them, a capture that sees traffic from multiple "wires" might see multiple copies of network-layer packets on different "wires".

If it's not routing between them, and is, for example, using different "wires" for an uplink and a downlink, or has bonded multiple "wires" together and is sending traffic to another machine over multiple different "wires", then fragments from a given fragmented IP datagram, or packets on a given TCP connection, or... might appear on different "wires".

In the first case, you want to distinguish between "wires" when doing reassembly/TCP analysis and desegmentation/etc.

In the second case, you *don't* want to do that.

Would that be sufficient to handle the two cases?  Or are there cases where the machine might be routing between some sets of "wires" and using other sets of "wires" as parallel pipes to and from a given destination host, so that whether the "wire" should be taken into account depends on the "wire"?
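
To make the two cases concrete, here is a minimal sketch in Python rather than Wireshark's C. It is not Wireshark code: `Fragment`, `reassembly_key`, `group_fragments` and the `distinguish_wires` switch are hypothetical names, and the "wire" is assumed to already be available as a small integer. It only shows how including or omitting the "wire" in the reassembly key changes which fragments get grouped together.

```python
from collections import defaultdict
from typing import NamedTuple


class Fragment(NamedTuple):
    wire: int       # hypothetical internal "wire" number
    src: str
    dst: str
    proto: int
    ip_id: int
    offset: int
    payload: bytes


def reassembly_key(frag, distinguish_wires):
    """Build the key under which fragments are collected.

    Routing case: include the wire, so copies of one datagram seen on
    two wires are reassembled separately.
    Bonded/uplink-downlink case: leave the wire out, so fragments that
    travelled on different wires still end up together.
    """
    base = (frag.src, frag.dst, frag.proto, frag.ip_id)
    return (frag.wire,) + base if distinguish_wires else base


def group_fragments(frags, distinguish_wires):
    """Group fragments by key and join each group's payloads in offset order."""
    groups = defaultdict(list)
    for frag in frags:
        groups[reassembly_key(frag, distinguish_wires)].append(frag)
    return {
        key: b"".join(f.payload for f in sorted(group, key=lambda f: f.offset))
        for key, group in groups.items()
    }
```

With a routed capture that sees the same two-fragment datagram on wires 1 and 2, `distinguish_wires=True` yields two clean datagrams, while `distinguish_wires=False` lumps all four fragments into one group, which is the confusion described in this thread. With a bonded link the situation is reversed.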

> The right generalization might be to have some sort of "network tag" which incorporates a network interface ID plus all VLAN tags for the packet ("all VLAN tags" to handle QinQ).
>  
>  
> So if we go for a network tag, or key, should it be built from the outer 
> VLAN tag, a hash of the source MAC, the protocol level, and the interface index (pcapng)?
>  
> The outer VLAN tag should take care of VLAN and QinQ, right?
> The source MAC should take care of “duplicates caused by mirroring” and 
> the like(?)
> pinfo->curr_layer_num should take care of tunneling(?)
> The interface index should take care of "any"-interface traces(?)
>  
> What size should the key be, is 32bits enough?

If we have a notion of "wire", perhaps we could assign sequential internal numbers to "wires" - just as we do for conversations - and use the internal "wire" number.

We could also use that for filtering, e.g. "show me everything on this "wire"".

(And maybe we could have a sidebar in Wireshark, showing all the "wires", conversations, etc. we've seen, and let you just click on one to show only the packets on that "wire"/in that conversation/etc.)
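
The sequential "wire" numbering suggested above could look roughly like the following Python sketch. It is an illustration only: `WireRegistry`, `wire_number` and `packets_on_wire` are hypothetical names, and the identity of a "wire" is assumed to be the interface ID plus all VLAN tags, as in the quoted "network tag" proposal.

```python
class WireRegistry:
    """Hand out small sequential numbers for "wires", in order of first sight."""

    def __init__(self):
        self._numbers = {}

    def wire_number(self, interface_id, vlan_tags=()):
        # Assumption: a wire is identified by the interface ID plus every
        # VLAN tag on the packet (outermost first), which also covers QinQ.
        key = (interface_id, tuple(vlan_tags))
        if key not in self._numbers:
            self._numbers[key] = len(self._numbers)
        return self._numbers[key]


def packets_on_wire(packets, registry, wanted):
    """Filter (interface_id, vlan_tags, data) tuples down to one wire number."""
    return [
        data
        for interface_id, vlan_tags, data in packets
        if registry.wire_number(interface_id, vlan_tags) == wanted
    ]
```

Because the number is just a dense index, it stays small no matter how many components go into the identity of a "wire", and it can double as the value matched by a "show me everything on this wire" filter.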


It seems there might not be a "solve all" solution to the cases listed. It also seems to me that there is a need for several flavors of "conversation":
the 5-tuple we have today, for things like "decode as" and possibly other protocol data stored by the protocols running on top of the transport protocol, and
a configurable(?) "conversation" type that takes the "wire" into consideration, used for reassembly, TCP analysis, response times, etc.
Possibly the "wire lookup" could list all "wires" belonging to the same 5-tuple, and data could be obtained using either the wire key or the 5-tuple key.
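
A rough Python sketch of such a two-level lookup is shown below. It does not reflect the existing Wireshark conversation API: `WireAwareConversations` and its methods are hypothetical, the 5-tuple is assumed to be any hashable tuple, and the "wire" is assumed to be an internal wire number.

```python
from collections import defaultdict


class WireAwareConversations:
    """Per-wire conversation data that is also reachable through the plain 5-tuple."""

    def __init__(self):
        self._by_wire_key = {}                    # (wire, 5-tuple) -> data dict
        self._wires_by_tuple = defaultdict(list)  # 5-tuple -> wires seen, in order

    def get_or_create(self, wire, five_tuple):
        """Return the data stored for this 5-tuple on this particular wire."""
        key = (wire, five_tuple)
        if key not in self._by_wire_key:
            self._by_wire_key[key] = {}
            self._wires_by_tuple[five_tuple].append(wire)
        return self._by_wire_key[key]

    def wires_for(self, five_tuple):
        """List every wire on which this 5-tuple has been seen."""
        return list(self._wires_by_tuple.get(five_tuple, []))

    def data_for_tuple(self, five_tuple):
        """Collect the per-wire data of one 5-tuple, keyed by wire."""
        return {
            wire: self._by_wire_key[(wire, five_tuple)]
            for wire in self.wires_for(five_tuple)
        }
```

Wire-sensitive state, such as reassembly or sequence analysis, would go through `get_or_create()`, while code that only knows the 5-tuple can still find everything through `data_for_tuple()`.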

As a start, it is probably best implemented as a new API that the TCP dissector uses, alongside the "old" one, for TCP analysis and reassembly.

Best regards
Anders
