Wireshark-dev: Re: [Wireshark-dev] RFD: The Future of Memory Management in Wireshark

From: Evan Huus <eapache@xxxxxxxxx>
Date: Sat, 27 Oct 2012 15:09:32 -0400
On Sat, Oct 27, 2012 at 2:17 PM, Dirk Jagdmann <doj@xxxxxxxxx> wrote:
>> General thoughts from the list on whether or not this would be a good idea?
>
> some general comments on the whole wmem idea:
> memory allocation is done almost everywhere in Wireshark, also in somewhat
> hidden places. For example all the proto_tree_add_* functions will need to
> allocate memory. It will be a real pain to pass down the current allocator
> object into each of those function invocations, you're looking at changing a
> considerable amount of code.

Yes, it wouldn't be a small job.

> Also when thinking about writing dissector code,
> it's probably not useful to mix different allocators while dissecting a protocol.

Not sure what you mean here. Mixing ep_ and se_ memory? Or mixing
g_alloc and mmap memory? The separation of the allocators (glib, mmap,
etc.) from the front-end isn't driven by a desire to mix and match
allocators during a specific dissection (I agree that would be odd).
It was more driven by the fact that the current allocators are all
jumbled together and thus hard to maintain. It also makes it easy to
add new scopes where necessary without writing hundreds of new wrapper
functions.

> An alternative idea (although a little more ugly) is to use a thread local
> global variable which contains pointers to the current ep and se allocators. The
> se_* and ep_* functions retrieve the allocator object from that thread local
> global variable and do their allocations. If a function wants to change the
> allocators for its own function calls, it can make a copy of the global variable
> pointers to local variables, set it's desired new allocators and restore the
> global pointers at the end, like:
>
> __thread void* se_allocator;
> __thread void* ep_allocator;
>
> void my_special_func()
> {
>   void* old_se = se_allocator;
>   void* old_ep = ep_allocator;
>
>   se_allocator = create_special_se();
>   ep_allocator = create_special_ep();
>   dissect_packet();
>   garbage_collect(se_allocator);
>   garbage_collect(ep_allocator);
>
>   se_allocator = old_se;
>   pe_allocator = old_ep;
> }

This would certainly require fewer changes to existing code. It is,
however, complicated by the lack of nice cross-platform API for
thread-local storage. I just took a quick look at glib's gestures in
this direction and they seem rather limited (a max of 128 thread-local
objects, and they can't be destroyed).

One of the things I was hoping to do by removing the allocators from
the global scope was make it very obvious what the exact scopes of the
various allocators are. Right now ep_ and se_ memory is used in a lot
of places that aren't correct because the code has simply been able to
grab the global. We really don't want anyone using ep_ memory unless
there is actually a packet being dissected, and scoping the ep_ pool
would enforce that nicely.

> Second point:
>
> as you've noticed especially in dissector we pass large number of arguments into
> almost every function (tvb, pinfo, tree). Bundling all of them into a struct
> like dissect_context would make adding things like custom allocators way easier
> to support, since we could add the required pointers to that context struct and
> only pass that struct into all of the functions. However now you would need to
> modify every line of code which was using the pointers for tvb, pinfo, tree and
> in the dissectors that is a very large number of lines.

Yes, it is a lot of work, but it doesn't have to happen all at once. I
was thinking it might be worthwhile to change the dissector_handle
structure so that the is_new boolean becomes an enum, allowing us to
add a third dissector prototype to the union. The price of
backwards-compatibility.

> Also it will be a
> difficult decision how the API on the new allocation functions should look like?
> Would we want to hand them the dissect_context struct or the allocator pointer?
>
> struct dissect_context {
>   tvbuff_t* tvb;
>   packet_info* pinfo;
>   void* se_allocator;
>   void* ep_allocator;
> };
>
> void* se_alloc(dissect_context* ctx); // like this?
> void* se_alloc(void* se_allocator); // or like this?

I'm pretty sure we want to pass the allocator pointer - otherwise
we're basically preventing use outside of dissectors, which we don't
want to do.

Thanks for your comments, you brought up a lot of good points.
Evan