Wireshark-dev: Re: [Wireshark-dev] Filebacked-tvbuffs : GSoC'13
From: Evan Huus <eapache@xxxxxxxxx>
Date: Thu, 18 Apr 2013 12:28:20 -0400
A few misc notes on this topic in no particular order:

- Once everything is converted to wmem (after 1.10 branches) it would be
  trivial to write a backend allocator that collected statistics on
  memory usage.
- Has anybody ever tried to see if Massif
  (http://valgrind.org/info/tools.html#massif) gives any interesting data
  on memory usage? That's what it's for.
- Our reassembly code is a bit of a mess anyways, as Guy's recent commit
  indicates. It could use a general cleanup and simplification just on
  general principle.

Cheers,
Evan

On Thu, Apr 18, 2013 at 12:13 PM, Anders Broman <a.broman@xxxxxxxxxxxx> wrote:
> Jeff Morriss wrote 2013-04-18 17:55:
>
>> On 04/15/13 10:01, Ambarisha B wrote:
>>>
>>> Hi dev,
>>>
>>> I am a final year engineering student pursuing my bachelor's in Computer
>>> Science. I was going through the GSoC'13 ideas page and found
>>> "Filebacked-tvbuffs" interesting, so I looked it up. Here's a (probably
>>> not so) short summary of what I did and understood. I'm only a novice,
>>> so if I've got something wrong, please, enlighten me.
>>>
>>> I went through the (interesting) archived conversation linked on the
>>> ideas page. I've realized most of the discussion was about "how to deal
>>> with large captures, so that users don't have to break up the captures".
>>> Swapping or, if needed, mmapped files would help. But since the goal of
>>> this project is to cut down the memory usage, I guess we're looking at
>>> non-mmapped files.
>>>
>>> The project description says that data in packet-bytes view and
>>> packet-details view is a duplicate of that on the disk. I tried to look
>>> this up in the code. So, originally the data is in a capture_file and
>>> wtap_*() gets the data out of that and it is finally handed to
>>> dissect_packet() which actually makes the tvbuff out of it and passes it
>>> to the sub-dissectors (dissect_frame etc.).
>>
>>
>> Yes. But the stuff in the packet-details view isn't what I consider to be
>> the problem (normally): that stuff is only kept in memory as long as it's on
>> your screen. The real problem (which I thought file-backed-tvbuffs might
>> solve) would be when dissectors have to make copies of tvbuffs in order to
>> do, for example, reassembly. Those copies are malloc()'d and it is believed
>> that, in some situations, they account for a lot of Wireshark's memory
>> usage.
>>
> Yes, file-backed tvbs might not have been such a great idea. As Jeff
> points out, the problem to be solved is probably the memory usage of
> reassembled packets. One also has to make sure that the tradeoff in
> speed (if any) isn't a problem. Writing a new file on Wireshark's first
> pass, with the reassembled data attached, which will be read for any
> subsequent access, might be the answer.
>
>
>> (A good side project would be to add some tracking to Wireshark's memory
>> allocations so we could be sure how much of a problem this is. For example,
>> a while ago someone pointed out that actually a huge amount of memory goes
>> to storing frame_data's.)
>
> I thought about this too. Would it be possible to invent a hash table
> registry function which could then be used to query the hash table
> sizes and display them in the GUI?
>
>
>>
>> Anyway, if reassembly could be done using composite + file-backed tvbuffs
>> then a lot of that alloc'd memory could go away.
>>
>>> I think I now have an idea of how I would back up tvbuff by a hard disk.
>>> We add another "type" of tvbuff which is backed up by a file, the same
>>> way TVBUFF_SUBSET is backed by another tvbuff. Next we think about "how
>>> to back it by a file?". Of course, we can implement a neat cache in the
>>> tvb layer itself, tuned for our accesses. But I have a couple of
>>> thoughts on this. Do tell me, if I am missing something here.
>>>
>>> If we are accessing all the data in the tvbuff in one shot, there
>>> wouldn't be much use of a cache. In fact, it'll add housekeeping
>>> overhead. On the other hand, if we're making small repeated accesses to
>>> the data, a no-cache implementation would be pitifully slow. For this I
>>> need to look at usage of tvbuffs in those two views more closely. Also,
>>> now that there's this abstraction, the interface for accessing
>>> filebacked-tvbuff has to be a little different than normal tvbuffs
>>> (because the data access might require some housekeeping as opposed to
>>> the direct access of tvb->real_data+offset).
>>
>>
>> I suspect there would have to be *some* amount of caching: for example we
>> really wouldn't want to go off and read one byte off of the disk each time
>> someone calls tvb_get_guint8().
>>
>> I would expect that normally a tvbuff will have a lot of accesses in a
>> very short period of time, then no accesses for quite a while, then another
>> burst of accesses (corresponding to the frame or PDU in question being
>> dissected when the file is read and then not accessed again until the user
>> clicks or scrolls past the frame in question).
>>
>>> I thought I should talk to you guys first, because I could be going on a
>>> wild-goose-chase with this. If there's something you want me to take a
>>> look at or study, please do let me know. Also, if you can point me to a
>>> little bug, so that I can get my hands dirty, that'll be great.
>>
>>
>> I doubt there's much in the way of a bug to look at; I think to get your
>> hands dirty you'd have to start digging into how, for example, the tvbuffs
>> and reassembly work and see if it can be put together.
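
To make the caching point in the quoted discussion concrete, here is a rough sketch of how a file-backed buffer might serve single-byte reads from one cached window instead of going to disk on every call. This is not Wireshark code: fb_buf_t, fb_buf_init(), fb_buf_get_u8() and FB_CACHE_SIZE are hypothetical names made up for illustration, and it uses plain stdio rather than any tvbuff or wiretap API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define FB_CACHE_SIZE 4096  /* hypothetical window size, not tuned */

/* Hypothetical file-backed buffer: a byte range of a file plus one
 * cached window. Not a real tvbuff type. */
typedef struct {
    FILE    *fp;           /* backing file, opened and owned by the caller */
    long     base;         /* file offset where this buffer's data starts */
    size_t   length;       /* number of bytes in the buffer */
    size_t   cache_start;  /* buffer offset that cache[0] corresponds to */
    size_t   cache_len;    /* valid bytes in cache; 0 means empty */
    unsigned disk_reads;   /* counts cache fills, for illustration only */
    uint8_t  cache[FB_CACHE_SIZE];
} fb_buf_t;

static void fb_buf_init(fb_buf_t *b, FILE *fp, long base, size_t length)
{
    assert(b != NULL && fp != NULL);
    b->fp = fp;
    b->base = base;
    b->length = length;
    b->cache_start = 0;
    b->cache_len = 0;
    b->disk_reads = 0;
}

/* Fetch one byte. Returns 0 on success, -1 if the offset is out of
 * range or the read fails. */
static int fb_buf_get_u8(fb_buf_t *b, size_t offset, uint8_t *out)
{
    if (offset >= b->length)
        return -1;

    if (b->cache_len == 0 ||
        offset < b->cache_start ||
        offset >= b->cache_start + b->cache_len) {
        /* Cache miss: refill the window starting at the requested offset. */
        size_t want = b->length - offset;
        size_t got;

        if (want > FB_CACHE_SIZE)
            want = FB_CACHE_SIZE;
        if (fseek(b->fp, b->base + (long)offset, SEEK_SET) != 0)
            return -1;
        got = fread(b->cache, 1, want, b->fp);
        if (got == 0) {
            b->cache_len = 0;
            return -1;
        }
        b->cache_start = offset;
        b->cache_len = got;
        b->disk_reads++;
    }

    *out = b->cache[offset - b->cache_start];
    return 0;
}
```

A single window fits the bursty access pattern Jeff describes (many reads close together, then nothing for a while). Whether one window is enough, or whether something like several windows with eviction would be needed, depends on how dissectors actually walk the data, which this sketch does not try to answer.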
>>
>> ___________________________________________________________________________
>> Sent via:    Wireshark-dev mailing list <wireshark-dev@xxxxxxxxxxxxx>
>> Archives:    http://www.wireshark.org/lists/wireshark-dev
>> Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev
>>              mailto:wireshark-dev-request@xxxxxxxxxxxxx?subject=unsubscribe