Yes I think in Ians first gprof data it was called an ungodly number of
times, like
700 milion times!
Let us check under what circumstances compare_proto_id() is used and for
what purpose.
Is it used to search for a value in the global list of hf values or is it
used to scan hf fields that are part
of the edt tree?
Is it only used to find hf values that are Protocols or is it used to find
any tope of hf field.
When we know how this function is used and where it is used we can look at
optimizing it.
If it is only used to find a pecific hf field matching a protocol
then, since the list of protocols is not that big (around 500 entries)
we should create a separate tree that we manage ourself using optimized
handcrafted routines to walk the tree
Since the list of hf_values are unlikely to change dynamically at runtime
and should in the normal case only
be created during startup, we could add code that after startup we create
a binary tree for all protocol hf entries.
spend some additional time to rearrange the tree so it is perfectly
balanced.
A perfectly balanced tree with <500 items should at most take 9 comparisions
to find the value we search for.
If we combine it with a hash, and make sure we use an intelligent algorithm
to automatically allocate hash buckets so
the hash is balanced we could use 256 hash buckets.
That would give us 256 trees of ~2 items each. Overkill for the case with
only 500 entries.
That would make the search complete within one hash calculation (to find the
tree) and 2 comparasions (to search the appropriate tree)
The setup to calculate buckets to get a perfect hash would be expensive
though but would only be needed once, during startup.
If we look at the entire list of all hf fields (10.000 items?) the savings
would be more significant.
A perfect hash with 256 buckets would be 40 items in each tree.
Thus only one hash calculation required and then 6 comparasions to find the
entry.
Since these things are central to ethereal and used a lot, I think we should
implement our own hash and binary tree implementation
instead of using something generic from say glib, so that we can optimize it
for maximum performance on our specific needs.
Your gprof data also indicates that a very significant amount of time is
spent in add_packet_to_pcket_list() which we might
look at optimizing at a later stage (after taking care of compare_proto_id()
I will look at compare_proto_id() how it is used and what it is used for
tomorrow.
(we are making real progress, we start to understand now where in the code
the slowness arise)
----- Original Message -----
From: "Richard Sharpe"
Sent: Sunday, November 16, 2003 10:00 AM
Subject: [Ethereal-dev] compare_proto_id taking up 20% of Ethereal's time
> Hi,
>
> Well, compare_proto_id is taking up something like 20% of the time when
> Ethereal loads a capture file or scans a capture file, it seems.
>
> However, compare_proto_id is very simple:
>
> static gint
> compare_proto_id(gconstpointer proto_arg, gconstpointer id_arg)
> {
> const protocol_t *protocol = proto_arg;
> const int *id_ptr = id_arg;
>
> return (protocol->proto_id == *id_ptr) ? 0 : 1;
> }
>
> So, it has to be the number of times it is called. It is passed to
> g_list_find_custom as a function for doing comparisons. This is probably
> being kept as a simple linked list that is scanned linearly and maybe we
> need a hash or some better function.
>
> Regards
> -----
> Richard Sharpe, rsharpe[at]ns.aus.com, rsharpe[at]samba.org,
> sharpe[at]ethereal.com, http://www.richardsharpe.com
>
> _______________________________________________
> Ethereal-dev mailing list
> Ethereal-dev@xxxxxxxxxxxx
> http://www.ethereal.com/mailman/listinfo/ethereal-dev
>