Ethereal-users: Re: [Ethereal-users] HTTP Dissector & reassembler, tethereal, and mirroring a we
Note: This archive is from the project's previous web site, ethereal.com. This list is no longer active.
From: Jon Passki <cykyc@xxxxxxxxx>
Date: Wed, 16 Feb 2005 19:29:18 -0800 (PST)
--- Guy Harris <gharris@xxxxxxxxx> wrote:

> Jon Passki wrote:
>
> > While doing off-line analysis of some HTTP traffic, I would like to
> > reconstruct the results back into a webpage. I understand the GUI
> > has the TCP reassembly [1,2,3], plus the HTTP dissector understands
> > data such as JPEGs.
>
> "Understands" in the sense that it can dissect the structure of a JPEG
> file; it doesn't "understand" it in the sense of being able to display
> the image. (Also, the HTTP dissector only "understands" that
> "image/jpeg" means that the entity body should be handed to the JPEG
> dissector - which it knows because the JPEG dissector has registered
> itself with a media type of "image/jpeg".

Is it correct to say that the HTTP dissector might call other
dissectors based upon the media type encountered in an HTTP session?
Is there a listing of available dissectors (outside of code)?

> > What I'd like to do is feed a pcap session
> > into tethereal, reconstruct an HTTP session, and have the HTTP
> > dissector magically spit out a web page.
> >
> > To do this seems non-trivial to me, since there might be multiple
> > TCP sessions for one web page (e.g. a JPEG download).
>
> By "Web page" do you mean "page displayed by a Web browser"? If so,
> then that's not really a concept that exists at the HTTP layer, and
> thus, it's not really something that the HTTP dissector should be
> doing.

By a web page, I mean a hierarchical representation of the media type
data (e.g. HTML [text/html], JPEG [image/jpeg], etc.) within the HTTP
session. I see now that it probably wouldn't make sense in the HTTP
dissector. Perhaps this could be a feature when exporting the data?
E.g., when a JPEG is exported from an HTTP session, somewhere
(filename, companion file, directory structure, whatever) there is
information that I can use to associate it with a larger group of
sessions. This could be the absolute URI or absolute path and Host
field, time & date, and/or whatever else makes sense.

> A tap could perhaps be used to gather together various HTTP entities
> that could be considered the components of a Web page, but I'm not sure
> what it'd do with them after that. Is there some representation of a
> Web page, in that sense, as a single file? If not, what would the tap
> in question do with that the components of the page to "spit out a Web
> page"?

I hopefully answered this above. For me, it does not necessarily need
to create an HTML document that can be easily loaded in a web browser,
but that would be ideal. If, for example, a web browser created three
HTTP sessions to render some HTML page, I would like to have those
sessions exported and grouped together somehow so I would know that
they're logically connected. An added bonus is that I could use
Firefox to review the data, but that's not necessary.

> > So, I'd
> > assume a state machine of some sort. Example: the initial page had
> > some image src, so the state machine would check to see if there
> > were any HTTP requests for the link. Then this has the added
> > difficulty that time would be the only thing to separate multiple
> > downloads of the same file (JPEG Session 1 was 10 seconds later,
> > JPEG Session 2 was 60 seconds later, JPEG Session 3 was 120 seconds
> > later - use JPEG Session 1).
> >
> > So, does this functionality exist?
>
> No.
>
> > If so, what did I miss in reading up on reassembly?
>
> None of that has anything to do with "reassembly" in Ethereal's sense of
> the word. "Reassembly", in Ethereal's sense, refers to assembling the
> parts of a higher-level packet that are contained in multiple
> lower-level packets, e.g. reassembling fragments of a fragmented IP
> datagram, reassembling the parts of an HTTP request or reply split
> across multiple TCP segments, etc.. There's no notion of a "Web page"
> at the HTTP layer or any other protocol layer, so there's no notion of
> "reassembly" of a Web page at the protocol layer, so the existing
> reassembly code wouldn't help.

Gotcha. Didn't think things properly through (hopefully did now).

Here's an example scenario: In Quality Assurance (QA) testing we
automate 100 tests against a system, with 15 being HTTP related. The
logic about HTTP is primitive, and adding logic or changing the testing
tools is currently not possible (if I could, this is where I would
start). The responses aren't captured, just brief pass/fail type
messages. The traffic, though, is captured in pcap format. When doing
some verification of the tests, we need to look at the pcap dump to see
what really came across, since the output test data is useless. Since
all I have is pcap, I'm looking at tcpflow, {t}ethereal, and whatever
other tools can reassemble the TCP, HTTP, and whatever basic media
types may be included in the HTTP session.

Thanks again for your time on this,
Jon
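
To make the grouping idea above a bit more concrete (follow each image
src from the initial page to the first matching download seen after
it), here is a rough Python sketch. It is only an illustration, not
something Ethereal or tethereal provides: the Entity record and the
group_page function are hypothetical names, and it assumes the HTTP
entities have already been pulled out of the capture by some other
tool, with an absolute URL built from the Host field and request path.

```python
from dataclasses import dataclass
from html.parser import HTMLParser
from urllib.parse import urljoin


@dataclass
class Entity:
    """One HTTP entity already extracted from a capture (hypothetical record)."""
    time: float       # seconds since the start of the capture
    url: str          # absolute URI built from the Host field and request path
    media_type: str   # e.g. "text/html" or "image/jpeg"
    body: bytes


class ImgSrcCollector(HTMLParser):
    """Collects the src attribute of every <img> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def group_page(page, entities):
    """Return the page plus, for each image it references, the earliest
    entity with that URL seen at or after the page itself."""
    collector = ImgSrcCollector()
    collector.feed(page.body.decode("utf-8", errors="replace"))
    group = [page]
    for src in collector.sources:
        wanted = urljoin(page.url, src)  # resolve relative links against the page
        later = [e for e in entities if e.url == wanted and e.time >= page.time]
        if later:
            # Time is the only thing separating repeated downloads,
            # so take the first one after the page.
            group.append(min(later, key=lambda e: e.time))
    return group
```

With a page at time 0 that references a.jpg, and downloads of a.jpg at
10, 60 and 120 seconds, group_page would keep the page and the
10-second download. Matching purely on URL and time is a
simplification; a Referer header, where present, would be a stronger
hint.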