Wireshark-dev: Re: [Wireshark-dev] QtShark performance

From: Guy Harris <guy@xxxxxxxxxxxx>
Date: Wed, 25 Jan 2012 04:51:30 -0800
On Jan 25, 2012, at 4:34 AM, Guy Harris wrote:

> 
> On Jan 25, 2012, at 12:57 AM, Anders Broman wrote:
> 
>> I have had a quick look at QtShark; loading a 150 MB file is slightly faster than with the GTK version (~6s vs 8s)
> 
> Note that the Qt version doesn't yet show a progress bar.  I'm experimenting with a different API for the progress bar, where you always call the "create" routine before the loop and call the "update" routine on every trip through the loop, and the "update" routine decides when to pop up the progress dialog.  I need to do more work on it, but it might be that the extra procedure call is sufficient to slow down the file-loading process.  If so, some of the difference between the Qt and GTK+ versions might be due to the Qt version doing less work in deferred_create_progress_bar(), and possibly even less work in update_progress_bar().

...or perhaps it's just that if you call update_progress_bar() once per iteration of the loop, you have to generate the "XXX of YYY" progress line once per iteration of the loop, so that might have to be pushed into update_progress_bar().
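To make the idea concrete, here is a minimal sketch of that kind of API, under stated assumptions: the names progress_t, progress_create() and progress_update() are hypothetical and are not the existing Wireshark routines, times are passed in as plain millisecond counts, and "popping up the dialog" is reduced to setting a flag. The point is only that the caller hands over raw counts on every iteration, and the "XXX of YYY" string is built inside the update routine, and only when something will actually be displayed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical progress state; NOT Wireshark's actual progress-bar API. */
typedef struct {
    unsigned long start_ms;     /* when the operation began */
    unsigned long delay_ms;     /* how long to wait before showing anything */
    unsigned long interval_ms;  /* minimum time between refreshes */
    unsigned long last_ms;      /* time of the last refresh */
    int shown;                  /* has the "dialog" been popped up? */
    unsigned long formats;      /* how often the status line was built */
    char status[64];            /* the "XXX of YYY" line */
} progress_t;

/* Always called before the loop; cheap, creates no UI. */
void progress_create(progress_t *p, unsigned long now_ms,
                     unsigned long delay_ms, unsigned long interval_ms)
{
    assert(p != NULL);
    memset(p, 0, sizeof *p);
    p->start_ms = now_ms;
    p->delay_ms = delay_ms;
    p->interval_ms = interval_ms;
}

/* Called on every trip through the loop.  Returns 1 if the display was
 * refreshed, 0 if the call was a cheap no-op.  Assumes now_ms never
 * goes backwards. */
int progress_update(progress_t *p, unsigned long now_ms,
                    unsigned long done, unsigned long total)
{
    assert(p != NULL);
    if (!p->shown) {
        if (now_ms - p->start_ms < p->delay_ms)
            return 0;           /* operation still short: show nothing */
        p->shown = 1;           /* a real UI would create the dialog here */
    } else if (now_ms - p->last_ms < p->interval_ms) {
        return 0;               /* refreshed recently: skip the formatting */
    }
    /* The status line is generated here, not by the caller. */
    snprintf(p->status, sizeof p->status, "%lu of %lu", done, total);
    p->formats++;
    p->last_ms = now_ms;
    return 1;
}
```

With this shape, a loop that calls progress_update() a thousand times only pays for the string formatting on the handful of calls that refresh the display; every other call costs a couple of comparisons.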

Speaking of performance when loading, it spins a fair while on my 253 MB gzipped file (so it's a lot more than 253 MB when uncompressed) *after* it says that the file is 100% read in; when I run "sample" from inside Activity Monitor, the main thread is doing

    1939 Thread_10896974   DispatchQueue_1: com.apple.main-thread  (serial)
      1939 start
        1939 main
          1939 cf_read
            1939 new_packet_list_thaw
              1939 gtk_tree_view_set_model
                1939 gtk_tree_view_build_tree
                  1913 gtk_tree_model_iter_next
                    1912 packet_list_iter_next
                      1911 packet_list_iter_next
                      1 g_type_check_instance_cast
                    1 g_type_check_instance_is_a
                  22 _gtk_rbtree_insert_after
                    16 _gtk_rbnode_new
                      16 g_slice_alloc
                        15 thread_memory_magazine1_reload
                          15 magazine_cache_pop_magazine
                            14 slab_allocator_alloc_chunk
                              13 allocator_add_slab
                                10 allocator_memalign
                                  10 posix_memalign
                                    10 malloc_zone_memalign
                                      10 szone_memalign
                                        7 szone_free
                                          4 szone_free
                                          3 tiny_free_list_add_ptr
                                        2 szone_malloc_should_clear
                                          1 szone_malloc_should_clear
                                          1 tiny_malloc_from_free_list
                                        1 szone_memalign
                                3 allocator_add_slab
                              1 slab_allocator_alloc_chunk
                            1 __spin_lock
                        1 thread_memory_magazine1_alloc
                    3 _gtk_rbtree_insert_after
                    3 _gtk_rbtree_insert_fixup
                      2 _gtk_rbnode_rotate_left
                        2 _fixup_parity
                      1 _gtk_rbtree_insert_fixup
                  3 gtk_tree_model_ref_node
                    2 g_type_check_instance_is_a
                      2 type_node_conforms_to_U
                        2 type_node_check_conformities_UorL
                          2 type_lookup_iface_vtable_I
                    1 g_type_interface_peek
                      1 type_lookup_iface_vtable_I
                  1 gtk_tree_view_build_tree

so I wonder whether, if we *don't* freeze and thaw the packet list, the time to build the view's data structure could be spread out over the load process, so that it takes the same amount of time overall but doesn't have a big chunk of time at the end where it just seems to be spinning its wheels.  (I'm also curious whether any of the view's data structure could be eliminated, perhaps in favor of using the tree of frame_data structures, to save both time and memory.)
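As a rough illustration of that trade-off, here is a toy counting model, not GTK+ or Wireshark code. All names (toy_list_t, toy_append(), toy_freeze(), toy_thaw()) are made up for the sketch, and "building a view node" is reduced to incrementing a counter. It assumes that building one node costs the same whether it happens during the load or at thaw time; in a real tree view, per-row insertion may carry overhead of its own, so this shows only the shape of the idea.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a packet list and its view; hypothetical, not GTK+. */
typedef struct {
    size_t model_rows;  /* rows the model knows about */
    size_t view_nodes;  /* rows the view has built a node for */
    size_t max_burst;   /* most nodes built in any single call */
    int frozen;         /* nonzero while the view is detached */
} toy_list_t;

/* Bring the view up to date with the model, recording how much work
 * this one call had to do. */
static void build_missing_nodes(toy_list_t *l)
{
    size_t burst = l->model_rows - l->view_nodes;
    l->view_nodes = l->model_rows;
    if (burst > l->max_burst)
        l->max_burst = burst;
}

void toy_freeze(toy_list_t *l)
{
    assert(l != NULL);
    l->frozen = 1;
}

/* Re-attaching the view walks every row added while frozen. */
void toy_thaw(toy_list_t *l)
{
    assert(l != NULL);
    l->frozen = 0;
    build_missing_nodes(l);
}

/* Add one row; an unfrozen view builds its node immediately. */
void toy_append(toy_list_t *l)
{
    assert(l != NULL);
    l->model_rows++;
    if (!l->frozen)
        build_missing_nodes(l);
}
```

In both modes the view ends up with one node per row, so the total work is the same. With freeze and thaw, max_burst equals the whole row count, which corresponds to the pause after "100% read in"; without them, max_burst is 1 and the cost is spread across the load.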