Wireshark-dev: Re: [Wireshark-dev] Filebacked-tvbuffs : GSoC'13
From: Anders Broman <a.broman@xxxxxxxxxxxx>
Date: Thu, 18 Apr 2013 20:42:27 +0200
Evan Huus wrote 2013-04-18 18:28:

Just throwing in some more stuff :-)

- It would be nice to have a reference trace to test performance against (memory usage and execution time).
- As a start on performance testing, one could remove the reassembled data from its hash table and store it in per-packet-data for faster access; this data could perhaps be in a file later.
This might require redesigning the per-packet-data functionality to keep track of the level in the packet protocol stack, since, say, IP might occur more than once in a packet. The "protocols in frame" string functionality might be redesigned for this purpose. Currently, I think, it is only built if there is a tree.
Regards
Anders
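The per-packet-data idea above (keeping track of the level in the protocol stack so that two IP layers in one frame do not collide) could look roughly like the following. This is only a sketch under stated assumptions: `ppd_store`, `ppd_set`, `ppd_get` and the integer protocol ids are made-up names for illustration, not Wireshark's actual per-packet-data API.

```c
#include <stddef.h>

/* Hypothetical per-packet store (not the Wireshark API): entries are keyed by
 * protocol id AND by the protocol's position in the frame, so the outer and
 * inner IP layers of a tunnelled packet get separate slots. */
#define PPD_MAX_ENTRIES 16

typedef struct {
    int   proto_id;   /* illustrative protocol identifier */
    int   layer_num;  /* position in the protocol stack, 1 = outermost */
    void *data;       /* e.g. a pointer to reassembled bytes */
} ppd_entry;

typedef struct {
    ppd_entry entries[PPD_MAX_ENTRIES];
    size_t    count;
} ppd_store;

/* Add or overwrite the entry for (proto_id, layer_num).
 * Returns 0 on success, -1 if the store is full. */
int ppd_set(ppd_store *s, int proto_id, int layer_num, void *data)
{
    for (size_t i = 0; i < s->count; i++) {
        if (s->entries[i].proto_id == proto_id &&
            s->entries[i].layer_num == layer_num) {
            s->entries[i].data = data;
            return 0;
        }
    }
    if (s->count == PPD_MAX_ENTRIES)
        return -1;
    s->entries[s->count++] = (ppd_entry){ proto_id, layer_num, data };
    return 0;
}

/* Look up the entry for (proto_id, layer_num); NULL if there is none. */
void *ppd_get(const ppd_store *s, int proto_id, int layer_num)
{
    for (size_t i = 0; i < s->count; i++) {
        if (s->entries[i].proto_id == proto_id &&
            s->entries[i].layer_num == layer_num)
            return s->entries[i].data;
    }
    return NULL;
}
```

With a key like this, a dissector that runs twice in the same frame (say at layers 2 and 4) can store and fetch its own reassembled data without a global hash table lookup.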
> A few misc notes on this topic in no particular order:
>
> - Once everything is converted to wmem (after 1.10 branches) it would be trivial to write a backend allocator that collected statistics on memory usage.
> - Has anybody ever tried to see if Massif (http://valgrind.org/info/tools.html#massif) gives any interesting data on memory usage? That's what it's for.
> - Our reassembly code is a bit of a mess anyways, as Guy's recent commit indicates. It could use a general cleanup and simplification just on general principle.
>
> Cheers,
> Evan
>
> On Thu, Apr 18, 2013 at 12:13 PM, Anders Broman <a.broman@xxxxxxxxxxxx> wrote:
>> Jeff Morriss wrote 2013-04-18 17:55:
>>> On 04/15/13 10:01, Ambarisha B wrote:
>>>> Hi dev,
>>>>
>>>> I am a final year engineering student pursuing my bachelors in Computer Science. I was going through the GSoC'13 ideas page and found "Filebacked-tvbuffs" interesting, so I looked it up. Here's a (probably not so) short summary of what I did and understood. I'm only a novice, so if I've got something wrong, please, enlighten me.
>>>>
>>>> I went through the (interesting) archived conversation linked on the ideas page. I've realized most of the discussion was about "how to deal with large captures, so that users don't have to break up the captures". Swapping or, if needed, mmapped files would help. But since the goal of this project is to cut down the memory usage, I guess we're looking at non-mmapped files.
>>>>
>>>> The project description says that data in packet-bytes view and packet-details view is a duplicate of that on the disk. I tried to look this up in the code. So, originally the data is in a capture_file and wtap_*() gets the data out of that and it is finally handed to dissect_packet(), which actually makes the tvbuff out of it and passes it to the sub-dissectors (dissect_frame etc).
>>>
>>> Yes. But the stuff in the packet-details view isn't what I consider to be the problem (normally): that stuff is only kept in memory as long as it's on your screen.
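Evan's note about an allocator backend that collects statistics can be illustrated with a small wrapper around malloc()/free(). This is a generic sketch, not the wmem interface: `alloc_stats`, `stats_alloc` and `stats_free` are names invented here, and a real wmem backend would plug into wmem's own allocator structure instead.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Counters kept by the (hypothetical) statistics-collecting allocator. */
typedef struct {
    size_t live_bytes;    /* bytes currently allocated */
    size_t peak_bytes;    /* highest value live_bytes has reached */
    size_t total_allocs;  /* number of successful allocations */
} alloc_stats;

/* Header placed in front of every block so stats_free() knows the size.
 * The union pads the header so the user pointer stays suitably aligned. */
typedef union {
    size_t      size;
    max_align_t align;
} block_header;

/* Allocate 'size' bytes and record the allocation in 'st'. */
void *stats_alloc(alloc_stats *st, size_t size)
{
    if (size > SIZE_MAX - sizeof(block_header))
        return NULL;
    block_header *h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    st->live_bytes += size;
    st->total_allocs++;
    if (st->live_bytes > st->peak_bytes)
        st->peak_bytes = st->live_bytes;
    return h + 1;  /* user data starts right after the header */
}

/* Free a block returned by stats_alloc() and update the counters. */
void stats_free(alloc_stats *st, void *ptr)
{
    if (ptr == NULL)
        return;
    block_header *h = (block_header *)ptr - 1;
    st->live_bytes -= h->size;
    free(h);
}
```

Keeping one `alloc_stats` per subsystem (reassembly, frame_data, and so on) would be one way to see where the memory actually goes.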
>>> The real problem (which I thought file-backed-tvbuffs might solve) would be when dissectors have to make copies of tvbuffs in order to do, for example, reassembly. Those copies are malloc()'d and it is believed that, in some situations, they account for a lot of Wireshark's memory usage.
>>
>> Yes, file-backed tvbs might not have been such a great idea. As Jeff points out, the problem to be solved is probably the memory usage of the reassembled packets. One also has to make sure that the tradeoff in speed (if any) isn't a problem. Writing a new file on Wireshark's first pass, with the reassembled data attached, which will be read for any subsequent access, might be the answer.
>>
>>> (A good side project would be to add some tracking to Wireshark's memory allocations so we could be sure how much of a problem this is. For example, a while ago someone pointed out that actually a huge amount of memory goes to storing frame_data's.)
>>
>> I thought about this too. Would it be possible to invent a hash table registry function which could then be used to enquire the hash table sizes and display them in the GUI?
>>
>>> Anyway, if reassembly could be done using composite + file-backed tvbuffs then a lot of that alloc'd memory could go away.
>>>
>>>> I think I now have an idea of how I would back up a tvbuff by a hard disk. We add another "type" of tvbuff which is backed up by a file, the same way TVBUFF_SUBSET is backed by another tvbuff. Next we think about "how to back it by a file?". Of course, we can implement a neat cache in the tvb layer itself, tuned for our accesses. But I have a couple of thoughts on this. Do tell me if I am missing something here.
>>>>
>>>> If we are accessing all the data in the tvbuff in one shot, there wouldn't be much use for a cache. In fact, it'll add housekeeping overhead. On the other hand, if we're making small repeated accesses to the data, a no-cache implementation would be pitifully slow. For this I need to look at the usage of tvbuffs in those two views more closely.
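The hash table registry idea from this part of the thread could be as simple as a list of named tables, each with a callback that reports its size. Everything below is hypothetical (`registry_add`, `registry_total`, `registry_report`, and the `toy_table` stand-in for a real hash table); it only shows the shape such a registry might take.

```c
#include <stddef.h>
#include <stdio.h>

/* Callback that reports how many entries a registered table holds. */
typedef size_t (*table_size_fn)(const void *table);

typedef struct {
    const char   *name;     /* label to show in a report or GUI */
    const void   *table;    /* opaque pointer to the table itself */
    table_size_fn size_fn;  /* how to ask that table for its size */
} registry_entry;

#define REGISTRY_MAX 32
static registry_entry registry[REGISTRY_MAX];
static size_t registry_count;

/* Register a table. Returns 0 on success, -1 if the registry is full. */
int registry_add(const char *name, const void *table, table_size_fn size_fn)
{
    if (registry_count == REGISTRY_MAX)
        return -1;
    registry[registry_count++] = (registry_entry){ name, table, size_fn };
    return 0;
}

/* Sum of the entry counts of all registered tables. */
size_t registry_total(void)
{
    size_t total = 0;
    for (size_t i = 0; i < registry_count; i++)
        total += registry[i].size_fn(registry[i].table);
    return total;
}

/* Write one "name  size" line per registered table to 'out'. */
void registry_report(FILE *out)
{
    for (size_t i = 0; i < registry_count; i++)
        fprintf(out, "%-24s %zu\n", registry[i].name,
                registry[i].size_fn(registry[i].table));
}

/* Stand-in for a real hash table, used only to exercise the registry. */
typedef struct { size_t n_entries; } toy_table;

size_t toy_table_size(const void *table)
{
    return ((const toy_table *)table)->n_entries;
}
```

A GUI dialog would then only need to walk the registry instead of knowing about every dissector's private tables.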
>>>> Also, now that there's this abstraction, the interface for accessing filebacked-tvbuff has to be a little different than normal tvbuffs (because the data access might require some housekeeping as opposed to the direct access of tvb->real_data+offset).
>>>
>>> I suspect there would have to be *some* amount of caching: for example we really wouldn't want to go off and read one byte off of the disk each time someone calls tvb_get_guint8(). I would expect that normally a tvbuff will have a lot of accesses in a very short period of time, then no accesses for quite a while, then another burst of accesses (corresponding to the frame or PDU in question being dissected when the file is read and then not accessed again until the user clicks or scrolls past the frame in question).
>>>
>>>> I thought I should talk to you guys first, because I could be going on a wild-goose-chase with this. If there's something you want me to take a look at or study, please do let me know. Also, if you can point me to a little bug, so that I can get my hands dirty, that'll be great.
>>>
>>> I doubt there's much in the way of a bug to look at; I think to get your hands dirty you'd have to start digging into how, for example, the tvbuffs and reassembly work and see if it can be put together.
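To make the caching point concrete, here is a minimal sketch of a file-backed byte buffer with a single cached block, so that a burst of one-byte reads costs one disk read instead of one per call. It is not the tvbuff API: `fb_buf`, `fb_init`, `fb_get_uint8` and the 4096-byte block size are assumptions chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define FB_BLOCK_SIZE 4096  /* assumed cache block size */

/* Hypothetical file-backed buffer with a one-block read cache. */
typedef struct {
    FILE    *fp;           /* backing file, opened by the caller */
    long     base;         /* file offset where this buffer's data starts */
    size_t   length;       /* number of bytes in the buffer */
    uint8_t  block[FB_BLOCK_SIZE];
    size_t   block_start;  /* buffer offset of block[0] */
    size_t   block_len;    /* valid bytes in block; 0 means cache empty */
    unsigned disk_reads;   /* how many times we actually hit the file */
} fb_buf;

void fb_init(fb_buf *b, FILE *fp, long base, size_t length)
{
    b->fp = fp;
    b->base = base;
    b->length = length;
    b->block_start = 0;
    b->block_len = 0;
    b->disk_reads = 0;
}

/* Read the byte at 'offset' into *out.
 * Returns 0 on success, -1 if offset is out of range or the read fails. */
int fb_get_uint8(fb_buf *b, size_t offset, uint8_t *out)
{
    if (offset >= b->length)
        return -1;

    /* Cache miss: load the aligned block that contains 'offset'. */
    if (b->block_len == 0 || offset < b->block_start ||
        offset >= b->block_start + b->block_len) {
        size_t start = offset - (offset % FB_BLOCK_SIZE);
        size_t want = b->length - start;
        if (want > FB_BLOCK_SIZE)
            want = FB_BLOCK_SIZE;
        if (fseek(b->fp, b->base + (long)start, SEEK_SET) != 0)
            return -1;
        size_t got = fread(b->block, 1, want, b->fp);
        b->disk_reads++;
        if (offset >= start + got) {  /* short read: byte not available */
            b->block_len = 0;
            return -1;
        }
        b->block_start = start;
        b->block_len = got;
    }

    *out = b->block[offset - b->block_start];
    return 0;
}
```

A single block fits the access pattern described above (a burst of accesses to one frame or PDU, then nothing for a while); whether more blocks or a different block size would pay off is exactly the kind of thing a reference trace would have to answer.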
___________________________________________________________________________
Sent via: Wireshark-dev mailing list <wireshark-dev@xxxxxxxxxxxxx>
Archives: http://www.wireshark.org/lists/wireshark-dev
Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev
 mailto:wireshark-dev-request@xxxxxxxxxxxxx?subject=unsubscribe