Wireshark-dev: [Wireshark-dev] Dealing with wrong Content-Types in HTTP
From: Nicolás Alvarez <nicolas.alvarez@xxxxxxxxx>
Date: Mon, 31 May 2021 00:58:21 -0300
Hello developers,

While looking at traffic from Apple devices, I'm seeing lots of non-browser HTTP(S) requests and responses that have incorrect Content-Type headers.

For example, when an iPhone talks to Apple CoreLocation servers, it sends HTTPS requests with Content-Type: application/x-www-form-urlencoded and a custom binary format in the body (I think it's protobuf-based but I didn't decode it yet). Obviously the "form data" dissector then fails to do anything useful with it.

As another example, https://configuration.apple.com/configurations/pep/config/geo/networkDefaults-ios-12.2.plist claims to be application/x-troff-man (?!) despite being XML Plist, probably because their misconfigured web server looks at the .2.* extension and thinks it's a manpage.

Finally, many requests and responses use text/xml or application/xml and the content really is XML, but it would be better to use a specific dissector for the specific XML-based format being used. I'm making a dissector for Apple XML Plists and having trouble getting Wireshark to use it instead of the generic XML dissector. This is also an issue with other generic formats like JSON.

There are two things Wireshark could have to help with these incorrect headers.

First, the *user* should be able to override the MIME type to make Wireshark run a different dissector for an HTTP request or response body. Supposedly this is already possible in HTTP2, since you can select "HTTP2 content type in stream" (http2.streamid) in the "Decode As" dialog. However, I never got this to actually work in practice, and I don't understand how it's supposed to work, since an http2.streamid isn't globally unique, it only makes sense in the context of a specific tcp.stream (why would you want stream ID 3 of *all HTTP2 connections* to have a different content type? and does it apply to request, response, or both?). For HTTP1, there is nothing, and I'm not sure how I would solve it.
Perhaps using the URL in Decode As would be good enough for a start?

Secondly, I think dissectors should be able to override the MIME type as well. In some cases a heuristic dissector can guess the format from the contents, but currently the HTTP dissector only calls heuristic dissectors for the body if nothing else worked. For example, if there was a heuristic dissector for CoreLocation responses, the HTTP dissector wouldn't even try it, because it already found a registered dissector for MIME type application/x-www-form-urlencoded and used that. Do we need a "try heuristic sub-dissectors first" preference for HTTP, like TCP has?

Other cases are worse, because it's not easy to detect the format from the contents or it would be expensive or error-prone to try those heuristics on every single HTTP body. However, it may be a proprietary format/protocol used with a specific server, in which case a sub-dissector could make decisions based on the URL or other headers. For example, the hypothetical dissector for Apple's Proprietary CoreLocation Format could accept a packet if the URL is gs-loc.apple.com/clls/wloc, rather than looking at the actual bytes. Currently there seems to be no infrastructure to do this (and a dissector table with hostnames wouldn't be enough). Media type dissectors don't seem to even have access to the URL. Does it seem like a good idea to add it? Maybe HTTP can put that information in proto_data for heur dissectors to look at?

Thoughts welcome :)

-- Nicolás
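
To make the dispatch order discussed above concrete, here is a rough model in Python. It is only a sketch of the selection logic, not Wireshark code and not how the HTTP dissector is actually written: every name in it (HttpBody, MIME_TABLE, heur_corelocation, heur_plist, pick_dissector, the heuristics_first flag, the url_overrides mapping) is hypothetical and exists only for illustration. It assumes the HTTP layer can hand the URL to heuristics, which is exactly the infrastructure that seems to be missing today.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class HttpBody:
    url: str           # host + path (assumed to be made available to sub-dissectors)
    content_type: str  # Content-Type header as sent, possibly wrong
    data: bytes


# A heuristic returns a dissector name if it accepts the body, else None.
Heuristic = Callable[[HttpBody], Optional[str]]


def heur_corelocation(body: HttpBody) -> Optional[str]:
    # Decides from the URL alone, without looking at the payload bytes.
    if body.url.startswith("gs-loc.apple.com/clls/wloc"):
        return "corelocation"
    return None


def heur_plist(body: HttpBody) -> Optional[str]:
    # Cheap content sniff: XML prolog plus a <plist> element near the start.
    head = body.data[:512].lstrip()
    if head.startswith(b"<?xml") and b"<plist" in head:
        return "apple-plist"
    return None


# Stand-in for the media type table (MIME type -> dissector name).
MIME_TABLE: Dict[str, str] = {
    "application/x-www-form-urlencoded": "urlencoded-form",
    "text/xml": "xml",
    "application/xml": "xml",
}
HEURISTICS: List[Heuristic] = [heur_corelocation, heur_plist]


def try_heuristics(body: HttpBody) -> Optional[str]:
    for heur in HEURISTICS:
        name = heur(body)
        if name is not None:
            return name
    return None


def pick_dissector(
    body: HttpBody,
    heuristics_first: bool = False,
    url_overrides: Optional[Dict[str, str]] = None,
) -> str:
    # 1. A user override keyed by URL wins over everything else.
    if url_overrides and body.url in url_overrides:
        return MIME_TABLE.get(url_overrides[body.url], "data")
    by_mime = MIME_TABLE.get(body.content_type)
    # 2. Optional "try heuristic sub-dissectors first" behaviour.
    if heuristics_first:
        return try_heuristics(body) or by_mime or "data"
    # 3. Behaviour as described above: heuristics only if the MIME lookup fails.
    return by_mime or try_heuristics(body) or "data"
```

With heuristics_first left off, a CoreLocation request labelled application/x-www-form-urlencoded goes to the form dissector and a plist labelled text/xml goes to the generic XML dissector; turning it on lets the more specific heuristic claim both. The application/x-troff-man plist reaches heur_plist either way, since no dissector is registered for that type in this model.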