https://bugs.wireshark.org/bugzilla/show_bug.cgi?id=6755
Summary: slow loading/processing of conversations with over
500k frames.
Product: Wireshark
Version: SVN
Platform: x86
OS/Version: All
Status: NEW
Severity: Enhancement
Priority: Medium
Component: Wireshark
AssignedTo: bugzilla-admin@xxxxxxxxxxxxx
ReportedBy: const.crist@xxxxxxxxxxxxxx
Cristian Constantin <const.crist@xxxxxxxxxxxxxx> changed:
What |Removed |Added
----------------------------------------------------------------------------
Attachment #7718| |review_for_checkin?
Flags| |
Created attachment 7718
--> https://bugs.wireshark.org/bugzilla/attachment.cgi?id=7718
patch
Build Information:
wireshark 1.7.1 (SVN Rev 40560 from /trunk)
Copyright 1998-2012 Gerald Combs <gerald@xxxxxxxxxxxxx> and contributors.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Compiled (64-bit) with GTK+ 2.24.4, with Cairo 1.10.2, with Pango 1.28.4, with
GLib 2.28.6, with libpcap (version unknown), with libz 1.2.3.4, with POSIX
capabilities (Linux), with SMI 0.4.8, with c-ares 1.7.5, with Lua 5.1, without
Python, with GnuTLS 2.12.10, with Gcrypt 1.5.0, with MIT Kerberos, with GeoIP,
with PortAudio V19-devel (built Jul 20 2011 00:01:38), without AirPcap.
Running on Linux 3.1.0-1-amd64, with locale en_US.UTF-8, with libpcap version
1.1.1, with libz 1.2.3.4, GnuTLS 2.12.10, Gcrypt 1.5.0.
Built using gcc 4.6.1.
--
wireshark dumping and analysis of the packets uses
the concept of "conversations"; in case of the
{udp,tcp,sctp}/ip packets the conversations are identified
using the pair (ip_address, port) per protocol type.
for (sip) proxy to proxy, (sigtran) server to server aso traffic
the conversations can become quite large pretty quickly.
however it seems that wireshark does not do very well at
handling these large conversations at all. for example, when
using the svn version on the following architecture:
4xAMD Phenom(tm) II X4 940 Processor
4GB RAM
it takes around 15 minutes to load and display a pcap file
of about 600MB containing conversations having over 500k packets.
curiosity made me to do some profiling (using oprofile).
this is what I get:
CPU: AMD64 family10, speed 3e+06 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask
of 0x00 (No unit mask) count 750000
samples % image name symbol name
342405 49.2224 libwireshark.so.0.0.1 conversation_lookup_hashtable
327297 47.0506 libwireshark.so.0.0.1 conversation_new
1036 0.1489 libglib-2.0.so.0.2800.6 g_hash_table_lookup
982 0.1412 libwireshark.so.0.0.1 guint8_pbrk
900 0.1294 libwireshark.so.0.0.1 serv_name_lookup
so, the suspects are:
* conversation_new()
* conversation_lookup_hashtable()
I had a look at both of them.
1. conversation_new()
conversations are stored in a hash table implemented using GHashTable from
glib.
however elements having _exactly_ the same key (in this case the ip:port,
protocol)
are kept in separate lists.
what I find really strange here is the folowing loop:
if (conversation) {
for (; conversation->next; conversation = conversation->next)
;
conversation->next = se_alloc(sizeof(conversation_t));
conversation = conversation->next;
} else {
conversation = se_alloc(sizeof(conversation_t));
}
if it is needed to add the new conversation always at the
end of the list than why not _storing_ a pointer to the
end of the list?
2. conversation_lookup_hashtable()
now, this loop looks suspicious to me:
if (match) {
for (conversation = match->next; conversation; conversation =
conversation->next) {
if ((conversation->setup_frame <= frame_num)
&& (conversation->setup_frame > match->setup_frame))
match = conversation;
}
if (match->setup_frame > frame_num)
match = NULL;
}
if it is needed to find a conversation based on some frame number,
why not having the conversation list sorted on frame numbers?
I attach a patch against svn trunk that tries to improve the problems described
above.
with the patch applied, on the same machine, the same file took approx. 25
seconds
to be decoded and displayed. here is how the profiling looks like after
applying the patch:
CPU: AMD64 family10, speed 3e+06 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask
of 0x00 (No unit mask) count 750000
samples % image name symbol name
4601 4.6589 libwireshark.so.0.0.1 guint8_pbrk
4545 4.6022 libglib-2.0.so.0.2800.6 g_atomic_pointer_get
4484 4.5404 libglib-2.0.so.0.2800.6 g_hash_table_lookup
3252 3.2929 libwireshark.so.0.0.1 compute_offset_length
2824 2.8595 libc-2.13.so __memcpy_sse2
2771 2.8059 libwireshark.so.0.0.1 emem_free_all
2364 2.3938 libglib-2.0.so.0.2800.6 g_slice_free1
2116 2.1426 libwireshark.so.0.0.1 dissect_sip_common
2069 2.0950 libwireshark.so.0.0.1 emem_alloc_chunk
1931 1.9553 libwireshark.so.0.0.1 check_offset_length_no_exception
I have tested the patch with: udp traffic (sip), tcp traffic (http, ftp).
I have run the fuzz tests for the udp packets as well (6 passes, all O.K.)
pls. let me know what you think.
--
Configure bugmail: https://bugs.wireshark.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are watching all bug changes.