Wireshark-dev: Re: [Wireshark-dev] slow sip voip flow for large captures

From: Anders Broman <a.broman@xxxxxxxxxxxx>
Date: Sat, 04 Feb 2012 12:43:24 +0100
Please add your patch as it is so we can have a look at it.
Regards
Anders

Cristian Constantin skrev 2012-02-03 15:19:
On Fri, Feb 3, 2012 at 3:02 PM, Anders Broman
<anders.broman@xxxxxxxxxxxx>  wrote:
Hi,
Forget about my comment about the sip hash table.

Please add the patch(es) to bugzilla, if possible break it up to something we can commit straight away and stuff that needs to be worked on to work
For all taps as that can't be committed until it's done for all tap. Two bug reports?
cristian: but this means really 2 different patches which I have to
test separately, right?
otoh:

* the append operation on lists with over 10k elements is really cpu
intensive (imo it
does NOT actually make sense to iterate over 10k pointers just to get
to the end of
some linked list)

* if one does NOT apply that patch (step 3),
one cannot actually see the actual improvement.

I will open one bug report and post the results/patches there.

Regards
Anders
-----Original Message-----
From: wireshark-dev-bounces@xxxxxxxxxxxxx [mailto:wireshark-dev-bounces@xxxxxxxxxxxxx] On Behalf Of Cristian Constantin
Sent: den 3 februari 2012 14:54
To: Developer support list for Wireshark
Subject: Re: [Wireshark-dev] slow sip voip flow for large captures

On Fri, Feb 3, 2012 at 1:44 PM, Anders Broman<anders.broman@xxxxxxxxxxxx>  wrote:
Hi,
Please put the patches in bugzilla, if possible split them in two
parts one that's ok for all types(Step 2,3,4?) of calls and one which Implements the stuff for SIP but not for the other call types.
cristian: step 1 is dealing only with sip packets.

step 2's improvement could be used all voip taps.

step 3 is tricky since the lists have to be reversed (trivial operation for double linked lists; glib offers an api for this). however currently the prepend and reversing is performed only for sip calls. extra-work to be done here for other voip taps to use more efficient linked lists operations (prepend and eventl. reverse). also, this step modifies the callback for the button "Flow" of the "VoIP" window  - on_graph_bt_clicked() - to reverse (once) the lists:

* voip_calls_get_info()->callsinfo_list
* voip_calls_get_info()->graph_analysis->list

so, in order to work correctly with the other taps, they have to use the "prepend" operation/convention as well.

step 4 could be used for all taps

let me see what I can do about the patches.

I think the SIP dissector keeps a table over the calls as well could that table be shared?
cristian: I do not know which table is that.

Regards
Anders

-----Original Message-----
From: wireshark-dev-bounces@xxxxxxxxxxxxx
[mailto:wireshark-dev-bounces@xxxxxxxxxxxxx] On Behalf Of Cristian
Constantin
Sent: den 3 februari 2012 13:01
To: Developer support list for Wireshark
Subject: [Wireshark-dev] slow sip voip flow for large captures

hi!

wireshark can draw call flows for sip voip calls (accessible through the menu Telephony/VoIP Calls).

however, when the capture is large, containing tens of thousands of sip voip calls, wireshark becomes very slow at producing the list of calls and the call flows.

here are my experiences with a capture file of 500 MB containing approx. 40K calls on a debian machine having:

4xAMD Phenom(tm) II X4 940 Processor
4GB RAM

with the svn stock it takes about 10 hours to do it; the report of the profiling done with oprofile looks like this:

CPU: AMD64 family10, speed 3e+06 MHz (estimated) Counted
CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit
mask of 0x00 (No unit mask) count 750000 samples  %        image name
app name                 symbol name 8146520  55.2011  lt-wireshark
lt-wireshark
append_to_frame_graph.isra.7
3698478  25.0610  libglib-2.0.so.0.2800.6  lt-wireshark
g_list_last
1566933  10.6176  lt-wireshark             lt-wireshark
SIPcalls_packet
1162377   7.8763  libc-2.13.so             lt-wireshark
__strcmp_sse2
8734      0.0592  libwireshark.so.0.0.1    lt-wireshark
serv_name_lookup 8130      0.0551  libwireshark.so.0.0.1
lt-wireshark dissect_sip_common
6504      0.0441  libwireshark.so.0.0.1    lt-wireshark
format_text
5749      0.0390  libwireshark.so.0.0.1    lt-wireshark
tap_push_tapped_queue
4753      0.0322  libglib-2.0.so.0.2800.6  lt-wireshark
g_unichar_isprint
4627      0.0314  libglib-2.0.so.0.2800.6  lt-wireshark
g_hash_table_lookup
3699      0.0251  libwireshark.so.0.0.1    lt-wireshark
guint8_pbrk

step 1
------

by changing SIPcalls_packet() to use a hash table for the lookup of the calls (on sip call-id) the runtime decreases to an (estimated) 3-4 hours.
SIPcalls_packet()'s cpu usage has decreased from 10.61% to 0.0115%
(1000 times faster)

CPU: AMD64 family10, speed 3e+06 MHz (estimated) Counted
CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit
mask of 0x00 (No unit mask) count 750000 samples  %        image name
symbol name
411305   69.4523  lt-wireshark
append_to_frame_graph.isra.7
171549   28.9675  libglib-2.0.so.0.2800.6  g_list_last
475       0.0802  libwireshark.so.0.0.1    serv_name_lookup
439       0.0741  libwireshark.so.0.0.1    dissect_sip_common
375       0.0633  libwireshark.so.0.0.1    format_text 360
0.0608  libwireshark.so.0.0.1    tap_push_tapped_queue
303       0.0512  libglib-2.0.so.0.2800.6  g_hash_table_lookup
283       0.0478  libglib-2.0.so.0.2800.6  g_unichar_isprint
202       0.0341  libwireshark.so.0.0.1    guint8_pbrk
172       0.0290  libwireshark.so.0.0.1    proto_tree_add_item
149       0.0252  libwireshark.so.0.0.1    compute_offset_length
143       0.0241  libc-2.13.so             __memcpy_sse2
141       0.0238  libwireshark.so.0.0.1    call_dissector_work
141       0.0238  libwireshark.so.0.0.1    emem_free_all
115       0.0194  libc-2.13.so             __strncmp_sse2
112       0.0189  libc-2.13.so             vfprintf
105       0.0177  libwireshark.so.0.0.1    proto_item_add_subtree
104       0.0176  libglib-2.0.so.0.2800.6
g_slice_free_chain_with_offset
99        0.0167  libwireshark.so.0.0.1    dissect_sdp
91        0.0154  libc-2.13.so             free
91        0.0154  libwireshark.so.0.0.1    dissect_ip 90        0.0152
libwireshark.so.0.0.1    proto_field_is_referenced 90        0.0152
libwireshark.so.0.0.1    tvb_length_remaining 90        0.0152
lt-wireshark process_specified_packets.constprop.10
89        0.0150  libglib-2.0.so.0.2800.6  g_slice_free1
89        0.0150  libwiretap.so.0.0.1      file_seek
88        0.0149  libwireshark.so.0.0.1    dissect
86        0.0145  libwireshark.so.0.0.1    dissect_frame
82        0.0138  libwireshark.so.0.0.1    proto_is_protocol_enabled
79        0.0133  libc-2.13.so             __strcmp_sse2
79        0.0133  libc-2.13.so             _int_malloc
78        0.0132  libglib-2.0.so.0.2800.6  g_slice_alloc
75        0.0127  libpthread-2.13.so       pthread_getspecific
73        0.0123  libwireshark.so.0.0.1    tvb_find_guint8
72        0.0122  libc-2.13.so             malloc
71        0.0120  libwireshark.so.0.0.1    fast_ensure_contiguous
68        0.0115  lt-wireshark             SIPcalls_packet

step 2
------

the next improvment was to change append_to_frame_graph() to use a hash table for graph analysis (still slow, over 1 hour):

CPU: AMD64 family10, speed 3e+06 MHz (estimated) Counted
CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit
mask of 0x00 (No unit mask) count 750000 samples  %        image name
symbol name
565268   94.6185  libglib-2.0.so.0.2800.6  g_list_last
1863      0.3118  libwireshark.so.0.0.1    serv_name_lookup 1620
0.2712  libwireshark.so.0.0.1    dissect_sip_common 1340      0.2243
libwireshark.so.0.0.1    format_text
987       0.1652  libglib-2.0.so.0.2800.6  g_unichar_isprint
982       0.1644  libglib-2.0.so.0.2800.6  g_hash_table_lookup
819       0.1371  libwireshark.so.0.0.1    tap_push_tapped_queue
731       0.1224  libwireshark.so.0.0.1    guint8_pbrk
628       0.1051  libwireshark.so.0.0.1    proto_tree_add_item
587       0.0983  libglib-2.0.so.0.2800.6
g_hash_table_insert_internal
574       0.0961  libwireshark.so.0.0.1    compute_offset_length
405       0.0678  libwireshark.so.0.0.1    proto_is_protocol_enabled
401       0.0671  libwireshark.so.0.0.1    emem_free_all
395       0.0661  libwireshark.so.0.0.1    dissect_ip
391       0.0654  libc-2.13.so             __strncmp_sse2
367       0.0614  libwireshark.so.0.0.1    dissect ...
20        0.0017  lt-wireshark
append_to_frame_graph.isra.8

however, append_to_frame_graph() got from 69.45% to 0.0017% (40852
times better)

step 3
------

I have changed the graph analysis list and the callsinfo list to use prepend instead of append:

CPU: AMD64 family10, speed 3e+06 MHz (estimated) Counted
CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit
mask of 0x00 (No unit mask) count 750000 samples  %        image name
symbol name
130285   18.6875  libc-2.13.so             msort_with_tmp
66464     9.5333  libglib-2.0.so.0.2800.6  node_get_pos 53790
7.7154  libglib-2.0.so.0.2800.6  g_hash_table_insert_internal
51146     7.3361  libc-2.13.so             __memcpy_sse2
44419     6.3712  libglib-2.0.so.0.2800.6  node_get_next
42461     6.0904  libgtk-x11-2.0.so.0.2400.4 gtk_rbtree_reorder_fixup
41265     5.9189  libglib-2.0.so.0.2800.6  is_end
40819     5.8549  libglib-2.0.so.0.2800.6  g_hash_table_resize
38905     5.5803  libgtk-x11-2.0.so.0.2400.4 _gtk_rbtree_next
36069     5.1736  libgtk-x11-2.0.so.0.2400.4 _gtk_rbtree_reorder
35828     5.1390  libglib-2.0.so.0.2800.6  g_hash_table_lookup
29469     4.2269  libgtk-x11-2.0.so.0.2400.4 save_positions
26858     3.8524  libgtk-x11-2.0.so.0.2400.4 generate_order
7892      1.1320  libc-2.13.so             __memset_sse2
7752      1.1119  libgtk-x11-2.0.so.0.2400.4
gtk_rbtree_reorder_sort_func 6860      0.9840
libgtk-x11-2.0.so.0.2400.4 gtk_rbtree_reorder_invert_func
6835      0.9804  libglib-2.0.so.0.2800.6
g_hash_table_remove_all_nodes
4085      0.5859  libglib-2.0.so.0.2800.6  g_array_append_vals
3864      0.5542  libglib-2.0.so.0.2800.6  g_sequence_iter_is_end
3591      0.5151  libc-2.13.so             __memset_x86_64
2057      0.2950  libglib-2.0.so.0.2800.6  g_array_maybe_expand

step 4
------

reordering is performed _continously_ while adding elements to the voip calls list.
(by _gtk_rbtree_reorder which is calling qsort from libc which is
calling msort_with_tmp)

after disabling continous reordering, by using in voip_calls_dlg_update():

        /* disable the re-ordering */
        gtk_tree_sortable_set_sort_column_id(sortable,
GTK_TREE_SORTABLE_UNSORTED_SORT_COLUMN_ID, GTK_SORT_ASCENDING);

        listx = g_list_first(listx);
        while (listx) {
            add_to_list_store((voip_calls_info_t*)(listx->data));
            listx = g_list_next(listx);
        }

        /* enable the re-ordering */
        gtk_tree_sortable_set_sort_column_id(sortable,
CALL_COL_START_TIME, GTK_SORT_ASCENDING);

CPU: AMD64 family10, speed 3e+06 MHz (estimated) Counted
CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit
mask of 0x00 (No unit mask) count 750000 samples  %        image name
symbol name
13495     8.8003  libwireshark.so.0.0.1    format_text
10698     6.9763  libglib-2.0.so.0.2800.6  g_unichar_isprint
7421      4.8394  libwireshark.so.0.0.1    guint8_pbrk
4973      3.2430  libwireshark.so.0.0.1    compute_offset_length
4776      3.1145  libglib-2.0.so.0.2800.6  g_hash_table_lookup
4029      2.6274  libwireshark.so.0.0.1    dissect_sip_common
3851      2.5113  libglib-2.0.so.0.2800.6  g_atomic_pointer_get
3186      2.0776  libc-2.13.so             __memcpy_sse2
3064      1.9981  libglib-2.0.so.0.2800.6
g_hash_table_insert_internal
3008      1.9616  libwireshark.so.0.0.1    emem_free_all
2964      1.9329  libwireshark.so.0.0.1    fast_ensure_contiguous
2803      1.8279  libwireshark.so.0.0.1    serv_name_lookup
2793      1.8214  libwireshark.so.0.0.1
check_offset_length_no_exception
2114      1.3786  libc-2.13.so             _int_malloc
2063      1.3453  libglib-2.0.so.0.2800.6  g_slice_free1
2046      1.3342  libwireshark.so.0.0.1    tap_push_tapped_queue
1841      1.2005  libwireshark.so.0.0.1    tvb_get_guint8
1713      1.1171  libc-2.13.so             memchr
1668      1.0877  libwireshark.so.0.0.1    tvb_length_remaining
1625      1.0597  libglib-2.0.so.0.2800.6  g_atomic_int_get 1550
1.0108  libwireshark.so.0.0.1    proto_tree_add_item
1524      0.9938  libc-2.13.so             vfprintf
1475      0.9619  libwireshark.so.0.0.1    emem_alloc_chunk
1439      0.9384  libc-2.13.so             __strncmp_sse2
1395      0.9097  libglib-2.0.so.0.2800.6  g_slice_alloc
1312      0.8556  libwireshark.so.0.0.1    tvb_find_guint8
1279      0.8341  libwireshark.so.0.0.1
ensure_contiguous_no_exception
1052      0.6860  libc-2.13.so             malloc
1028      0.6704  libwireshark.so.0.0.1    dissect_sdp
1022      0.6665  libwireshark.so.0.0.1    tvb_pbrk_guint8

the runtime decreased now to approx. 1 minute (pretty reasonable when
compared to 10 hours)

I can publish the patches on bugzilla; however, the changes/tests have been explicitly targeted at sip voip calls.
some of the data structures are global and used by _all_ voip protocol taps; it won't work correctly with dumps having also non-sip voip calls.

pls. let me know your opinion.

thanks.
cristian
______________________________________________________________________
_____ Sent via:    Wireshark-dev mailing list
<wireshark-dev@xxxxxxxxxxxxx>
Archives:    http://www.wireshark.org/lists/wireshark-dev
Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev

mailto:wireshark-dev-request@xxxxxxxxxxxxx?subject=unsubscribe
______________________________________________________________________
_____ Sent via:    Wireshark-dev mailing list
<wireshark-dev@xxxxxxxxxxxxx>
Archives:    http://www.wireshark.org/lists/wireshark-dev
Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev

mailto:wireshark-dev-request@xxxxxxxxxxxxx?subject=unsubscribe
___________________________________________________________________________
Sent via:    Wireshark-dev mailing list<wireshark-dev@xxxxxxxxxxxxx>
Archives:    http://www.wireshark.org/lists/wireshark-dev
Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev
             mailto:wireshark-dev-request@xxxxxxxxxxxxx?subject=unsubscribe
___________________________________________________________________________
Sent via:    Wireshark-dev mailing list<wireshark-dev@xxxxxxxxxxxxx>
Archives:    http://www.wireshark.org/lists/wireshark-dev
Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev
             mailto:wireshark-dev-request@xxxxxxxxxxxxx?subject=unsubscribe
___________________________________________________________________________
Sent via:    Wireshark-dev mailing list<wireshark-dev@xxxxxxxxxxxxx>
Archives:    http://www.wireshark.org/lists/wireshark-dev
Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev
              mailto:wireshark-dev-request@xxxxxxxxxxxxx?subject=unsubscribe