Wireshark-bugs: [Wireshark-bugs] [Bug 9607] TFShark (Terminal FileShark)
Date: Wed, 01 Jan 2014 18:58:47 +0000
Comment # 12
on bug 9607
from Evan Huus
(In reply to comment #11)
> (In reply to comment #10)
> > I think you're over-complicating things if I understand correctly. Wiretap's job is to abstract all the different capture formats into a single API providing a list of packets + extra stuff like name-resolution blocks. For tfshark we have no "pre-processing" to do before we hand the data to the dissectors - all we need is an API that abstracts the file into a series of bytes, which is what fopen and friends already do.
>
> I was certainly worried about starting to over-complicating things, but I thought one of Wiretap's jobs was to provide a "header" and an array of "records" (aka packets), so I thought Filetap was necessary to provide "a generic version" of that for non-capture files.

My understanding of our current packet architecture is:

We start with a file-handle, which gets associated with a capture_file struct. Wiretap turns that into some metadata (ie capture device, timestamp, etc) and an array of packet records. Each packet gets a frame_data, a wtap_pkthdr, a tvb, a pinfo, etc that are used for dissection of that packet.

For non-capture files, they may not be record-based and we have much less "generic" metadata (basically just the filename and whatever fstat gives us). This means we can't reliably talk about frames or headers in any generic way - all of that information must be dependent on the dissector, since it can be totally different for every file type.

So for fileshark the simpler method is:

We start with a file-handle, which gets turned into a tvb. That tvb is passed to the dissector. All of the intermediate layers of capture_file struct, wiretap, record headers etc aren't applicable.

> > In my mind, the simplest way forward is first to implement epan/tvbuff_file.c which fopens a file and uses seek/read to implement the TVB interface (I don't expect this to be too hard, Jakub knows this interface best).
>
> But file.c already provides seek/read functionality within a file itself, it's just uses a capture_file structure (which I don't really want). This sits below wiretap (which is below tvb), and I wanted to somehow put filetap between file.c (functionality) and the tvb.

The file.c code, on a quick skim, appears entirely packet-oriented. I don't know how much of it (if any) can be reused.

> > Then the normal "dissect" path in tfshark becomes very simple:
> > create a tvb_file backed by the file we are dissecting
> > create dummy wtap_pkthdr and frame_data structs (probably just memzeroed is fine)
> > pass all of the above to epan_dissect_run()
> > Wiretap/filetap/etc never need to get involved.
> >
> > Does this make sense?
>
> I can see this is a simpler design, but I thought we wanted to leverage frame_data structs as a "generic record/frame" and discard (preferably remove) wtap_pkthdr in favor of a "generic header" (data blob) within fileshark. Are you seeing my design as more of a "long term solution" or do you think it will always be "too complicated"?

I'm not sure what you mean here. Epan is fundamentally record-based, and files are not, so we can fake it by wrapping the entire file in a single fake "record". If I were writing fileshark from scratch I wouldn't use records/frames at all.

---

These architectural questions are why I was hesitant earlier to just start dumping file dissectors into Wireshark without some sort of separation (sorry Michal Labedzki if I explained that poorly at the time); the requirements are very similar, but also very different and not always compatible.

---

Taking a step back for a slightly bigger picture, we have two possible general approaches:

- On one approach we leave as much as possible of wireshark the way it is and put some hacks (dummy header values, etc) in fileshark to make it work (ie pretending that the file is one big "packet"). Fileshark becomes a second-class citizen in one sense, but it's the least amount of work for a functional program. The question is then if this is really "good enough" or if we'll eventually hit incompatible feature requirements and be unable to hack our way around them.

- On the other approach we extract all the components of epan and/or wiretap that are also useful to file dissection into a separate library (epan-base?) with a totally file/packet-agnostic API. epan then makes use of epan-base to provide its current packet-specific API, and we write a small "efan" library to provide a file-specific API on top of epan-base.

On either approach I don't see that wiretap's abstractions are useful to fileshark - on the first approach we'll have to fake some wiretap structs for compat with epan's current API; on the second approach we'll have a totally wiretap-free fileshark.

The second approach is obviously architecturally cleaner, but it's a lot more work and a lot more disruptive, meaning it needs strong buy-in from the rest of the devs. The first approach gets us to a working program without affecting Wireshark-proper too much; if it doesn't pick up then it can be killed or 3rd-partied without too much effort. If it does pick up then hopefully we will have manpower to gradually migrate to the second model.
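
To make the "file-handle gets turned into a tvb" idea a bit more concrete, here is a rough sketch of the kind of seek/read wrapper that something like the proposed epan/tvbuff_file.c might sit on top of. It is only an illustration: file_buf, file_buf_open, file_buf_read, file_buf_get_be32 and file_buf_close are hypothetical names invented for this example and are not part of Wireshark's tvbuff API. A real implementation would have to plug into the actual TVB interface rather than expose its own functions.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical file-backed byte buffer (NOT a Wireshark type).
 * Assumes the file size fits in a long. */
typedef struct {
    FILE *fp;     /* underlying stream, opened read-only in binary mode */
    long  length; /* total file size in bytes */
} file_buf;

/* Open `path` and record its size. Returns NULL on any failure. */
static file_buf *file_buf_open(const char *path)
{
    file_buf *fb = malloc(sizeof *fb);
    if (fb == NULL)
        return NULL;
    fb->fp = fopen(path, "rb");
    if (fb->fp == NULL) {
        free(fb);
        return NULL;
    }
    /* Seek to the end to learn the size; the read path re-seeks anyway. */
    if (fseek(fb->fp, 0L, SEEK_END) != 0 || (fb->length = ftell(fb->fp)) < 0) {
        fclose(fb->fp);
        free(fb);
        return NULL;
    }
    return fb;
}

/* Copy `len` bytes starting at `offset` into `out`.
 * Returns 0 on success, -1 if the range is out of bounds or the read fails. */
static int file_buf_read(file_buf *fb, long offset, size_t len,
                         unsigned char *out)
{
    assert(fb != NULL && out != NULL);
    if (offset < 0 || offset > fb->length ||
        len > (size_t)(fb->length - offset))
        return -1;
    if (fseek(fb->fp, offset, SEEK_SET) != 0)
        return -1;
    return fread(out, 1, len, fb->fp) == len ? 0 : -1;
}

/* Read a big-endian 32-bit value at `offset`, the sort of accessor a
 * file dissector would call. Returns 0 on success, -1 on failure. */
static int file_buf_get_be32(file_buf *fb, long offset, unsigned long *value)
{
    unsigned char b[4];
    if (file_buf_read(fb, offset, sizeof b, b) != 0)
        return -1;
    *value = ((unsigned long)b[0] << 24) | ((unsigned long)b[1] << 16) |
             ((unsigned long)b[2] << 8)  |  (unsigned long)b[3];
    return 0;
}

/* Release the stream and the wrapper. Safe to call with NULL. */
static void file_buf_close(file_buf *fb)
{
    if (fb != NULL) {
        fclose(fb->fp);
        free(fb);
    }
}
```

The point of the sketch is that nothing in it knows about packets, records or capture metadata: the dissector asks for bytes at an offset and the wrapper seeks and reads on demand, so the whole file never has to be loaded or split into frames first.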