Wireshark-dev: Re: [Wireshark-dev] libtshark + scripting language support

From: Mark Landriscina <mlandri1@xxxxxxx>
Date: Thu, 19 Aug 2010 09:02:37 -0400
Hmmm... I think that you, Guy, and Eloy have given me good cause to go back and rethink my approach. I really appreciate the detailed feedback. 

I'll take a closer look at your suggestion to pull together dissection capabilities as an independent lib. If you happen to have any notes/thoughts/past work on this sitting around and wouldn't mind sharing, I'd certainly love to see it.  

Thanks Emmanuel!

Regards,
Mark

----- Original Message -----
From: Thierry Emmanuel <Emmanuel.Thierry@xxxxxxxxxxxxxxx>
Date: Thursday, August 19, 2010 4:29 am
Subject: Re: [Wireshark-dev] libtshark + scripting language support
To: Developer support list for Wireshark <wireshark-dev@xxxxxxxxxxxxx>

> Hello,
> 
> I share Guy's point of view.
> For my current project, I have integrated Wireshark into a monitoring 
> probe, which gives me dissection without additional work, along with 
> fully detailed packet captures.
> 
> I took a very different approach from yours, treating Wireshark itself 
> as a library. If you take a look at the epan directory, you'll see 
> that you have all the tools you need to decode any kind of packet.
> 
> You can:
> - Initialize the library with the "epan_init" and "init_dissection" 
> functions
> - Find a dissector with the "dissector_table_foreach_handle" and 
> "dissector_handle_get_protocol_index" functions
> - Ask the library to process your data against the protocol you 
> want with "call_dissector_only" (from layer 2 to layer 7; for example, 
> I also decode HTTP and ICMP packets)
> - Access any part of the dissected packet through the ptree and finfo 
> structures
> 
> I have used this successfully with packets read from plain SOCK_STREAM, 
> SOCK_DGRAM, and SOCK_RAW sockets, and from a libpcap binding (which is 
> also accessible from Wireshark). So the library by itself provides a 
> fully usable interface: you just have to compile against the header 
> files in the epan/ directory, plus a few in the root directory, and 
> link against libwireshark.so.
> So I don't think that wrapping tshark in a new library has simplified 
> your work.
> 
> But the debate is very interesting. When I joined this list to 
> ask how I could use the dissection code independently, I was told 
> that nobody would be able to help me because it wasn't a common use of 
> Wireshark. I think that providing the dissection code as an 
> independent library would be a great plus. Wireshark's dissection 
> abilities are huge, and making them independent would multiply 
> what can be done with them. I have worked a bit 
> on this kind of use, so I would be glad to offer help and comments if 
> you started such a project.
> 
> Best regards.
> 
> 
> -----Original Message-----
> From: wireshark-dev-bounces@xxxxxxxxxxxxx On Behalf Of Mark Landriscina
> Sent: Wednesday, August 18, 2010 21:37
> To: wireshark-dev@xxxxxxxxxxxxx
> Subject: Re: [Wireshark-dev] libtshark + scripting language support
> 
> Guy,
> 
> You only need to link to libtshark.a. There is no need to link to 
> libwireshark, etc. tshark.c does actually contain a fair amount of 
> other useful code that ties all the dissection stuff nicely together. 
> My original approach was to just draw on libwireshark and libwiretap 
> code directly, but I found that I was simply rewriting a basic version 
> of tshark.
> 
> The reason for the named pipe was that I wanted to launch several 
> instances of tshark from within Python, have them do different 
> things, and then collect their dissections via separate data streams. 
> Writing the dissection data over a named pipe seemed like a clean, 
> painless way to do this. Additionally, I wanted a flexible way to 
> export the dissection data in case I decided to do something 
> else with this code, such as compiling libtshark as an executable 
> (tshark) instead of a lib. I'd still be able to have the tshark 
> executable export its dissection data to other applications in binary 
> form (as opposed to printing it out in PDML format and parsing text). 
> 
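A minimal Python 3 sketch of the "export dissection data in binary form over a stream" idea described above. Everything in it is an assumption made for illustration: the 4-byte length prefix, the JSON payload, and the helper names write_record, read_exact, and read_record are hypothetical and are not libtshark's actual wire format or API, which this thread does not describe.

```python
import io
import json
import struct

# Hypothetical framing: a 4-byte big-endian length followed by a JSON payload.
HEADER = struct.Struct("!I")


def write_record(stream, record):
    """Serialize one record and write it as a length-prefixed frame."""
    payload = json.dumps(record).encode("utf-8")
    stream.write(HEADER.pack(len(payload)))
    stream.write(payload)


def read_exact(stream, size):
    """Read exactly `size` bytes; return None on EOF before the first byte."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            if not chunks:
                return None
            raise EOFError("stream closed in the middle of a frame")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_record(stream):
    """Return the next record, or None once the writer has closed the stream."""
    header = read_exact(stream, HEADER.size)
    if header is None:
        return None
    (length,) = HEADER.unpack(header)
    payload = read_exact(stream, length)
    if payload is None:
        raise EOFError("stream closed after a frame header")
    return json.loads(payload.decode("utf-8"))


if __name__ == "__main__":
    # An in-memory buffer stands in for the named pipe in this demo.
    pipe = io.BytesIO()
    write_record(pipe, {"field_name": "ip", "children": []})
    pipe.seek(0)
    print(read_record(pipe))
```

The same functions would work on a FIFO created with os.mkfifo(path) and opened with open(path, "rb") on the reading side, with one FIFO per tshark instance so that each stream stays separate.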
> 
> I'm still playing around with the code and different ideas, so please 
> feel free to share any ideas for better approaches.
> 
> ----- Original Message -----
> From: wireshark-dev-request@xxxxxxxxxxxxx
> Date: Wednesday, August 18, 2010 3:00 pm
> Subject: Wireshark-dev Digest, Vol 51, Issue 22
> To: wireshark-dev@xxxxxxxxxxxxx
> 
> 
> > 
> > Today's Topics:
> > 
> >    1. Wiki weirdness? (Jeff Morriss)
> >    2. Re: Wiki weirdness? (Bill Meier)
> >    3. Re: Wiki weirdness? (Gerald Combs)
> >    4. libtshark + scripting language support (Mark Landriscina)
> >    5. Re: libtshark + scripting language support (Guy Harris)
> >    6. Re: libtshark + scripting language support (Eloy Paris)
> > 
> > 
> > ----------------------------------------------------------------------
> > 
> > Message: 1
> > Date: Wed, 18 Aug 2010 11:29:11 -0400
> > From: Jeff Morriss <jeff.morriss.ws@xxxxxxxxx>
> > Subject: [Wireshark-dev] Wiki weirdness?
> > To: Developer support list for Wireshark <wireshark-dev@xxxxxxxxxxxxx>
> > Message-ID: <4C6BFC47.1060207@xxxxxxxxx>
> > Content-Type: text/plain; charset=ISO-8859-1; format=flowed
> > 
> > 
> > The top part of the Wiki (that has a kind of tool bar with links to the 
> > page's Info, etc.) has gotten "weird" for me: instead of lining up 
> > nicely the links are in a vertical list.
> > 
> > It looks the same on Firefox and IE and doesn't change if I'm logged in 
> > or not.  Anyone else seeing this?
> > 
> > 
> > ------------------------------
> > 
> > Message: 2
> > Date: Wed, 18 Aug 2010 11:58:06 -0400
> > From: Bill Meier <wmeier@xxxxxxxxxxx>
> > Subject: Re: [Wireshark-dev] Wiki weirdness?
> > To: Developer support list for Wireshark <wireshark-dev@xxxxxxxxxxxxx>
> > Message-ID: <4C6C030E.8000806@xxxxxxxxxxx>
> > Content-Type: text/plain; charset=ISO-8859-1; format=flowed
> > 
> > Jeff Morriss wrote:
> > > The top part of the Wiki (that has a kind of tool bar with links to the 
> > > page's Info, etc.) has gotten "weird" for me: instead of lining up 
> > > nicely the links are in a vertical list.
> > > 
> > > It looks the same on Firefox and IE and doesn't change if I'm logged in 
> > > or not.  Anyone else seeing this?
> > 
> > Yep ....
> > 
> > 
> > 
> > ------------------------------
> > 
> > Message: 3
> > Date: Wed, 18 Aug 2010 09:01:21 -0700
> > From: Gerald Combs <gerald@xxxxxxxxxxxxx>
> > Subject: Re: [Wireshark-dev] Wiki weirdness?
> > To: Developer support list for Wireshark <wireshark-dev@xxxxxxxxxxxxx>
> > Message-ID: <4C6C03D1.3070406@xxxxxxxxxxxxx>
> > Content-Type: text/plain; charset=UTF-8
> > 
> > Bill Meier wrote:
> > > Jeff Morriss wrote:
> > >> The top part of the Wiki (that has a kind of tool bar with links to the 
> > >> page's Info, etc.) has gotten "weird" for me: instead of lining up 
> > >> nicely the links are in a vertical list.
> > >>
> > >> It looks the same on Firefox and IE and doesn't change if I'm logged in 
> > >> or not.  Anyone else seeing this?
> > > 
> > > Yep ....
> > 
> > It should be fixed now. I was experimenting with caching yesterday, and
> > left a bad configuration in place.
> > 
> > 
> > ------------------------------
> > 
> > Message: 4
> > Date: Wed, 18 Aug 2010 13:34:55 -0400
> > From: Mark Landriscina <mlandri1@xxxxxxx>
> > Subject: [Wireshark-dev] libtshark + scripting language support
> > To: wireshark-dev@xxxxxxxxxxxxx
> > Message-ID: <7310dfcb3a2d.4c6be17f@xxxxxxxxxxxxxxxx>
> > Content-Type: text/plain; CHARSET=US-ASCII
> > 
> > Hi,
> > 
> > I'd like to contribute some work that I've done to the wireshark 
> > community and need some advice on the best way to do this, assuming 
> > there is interest. If not, that would be good to know as well. I 
> > suspect that it might be best to fork this off as a separate project 
> > vs. incorporating it directly into ongoing SVN builds.
> > 
> > My initial goal was to modify tshark (command line wireshark) and 
> > wrap it as a Python module. I wanted to expose tshark dissections as 
> > Python objects during packet capture or capture file processing. In 
> > addition to this, I found that it was quite easy to extend this idea a 
> > bit more, so that other scripting languages (in addition to Python) 
> > could leverage the same code base. See below for details.
> > 
> > My motivation was that I wanted to do some work with Scapy and needed 
> > to access application layer protocol dissections within Python without 
> > re-writing all the dissection code already available in 
> > tshark/wireshark. 
> > 
> > This is what I have done to date (all Linux for now, but am porting to 
> > Windows):
> > 
> > a. Modified tshark code base and compiled it as a library, 
> > libtshark.a. This is the original tshark executable, more or less, 
> > with some notable additions. In particular, after packet dissection, 
> > the epan dissection tree data is copied off into another tree 
> > structure that I've defined. This t_dissect_node tree is then 
> > serialized and written out over a named-pipe. The name of the 
> > named-pipe is defined by the user at run-time. The code to unserialize 
> > the t_dissect_node tree is also part of libtshark.a. Also, I have 
> > incorporated some additional helper code that makes tree navigation 
> > easier. A function named 'run' is called to start tshark and accepts 
> > as parameters tshark command line args. 
> > 
> > b. A compiled Python shared library, _tsharkPY.so. I used SWIG to 
> > generate the Python bindings. Hence one could take the SWIG interface 
> > file that I wrote for Python (tsharkPY.i) and modify it for use with 
> > other SWIG supported languages: Ruby, Java, etc.
> > 
> > c. tsharkPY.py is the Python module file created by SWIG from my 
> > tsharkPY.i SWIG interface file.
> > 
> > All the above is based off of the most recent SVN builds and 
> > generation of the two lib files above has been incorporated into the 
> > existing Wireshark build process. Hence, all you have to do is run 
> > 'make' and you get libtshark.a and _tsharkPY.so. 'make install' puts 
> > these files into your Python lib path as defined by libtool. I do need 
> > some help tweaking this, however. Right now, libtool wants to put 
> > these in /usr/local/lib/python2.6/site-packages/. However, they need 
> > to be placed in /usr/lib/python2.6/site-packages/. Any thoughts (other 
> > than hard coding the correct path)?
> > 
> > Some basic Python code to use the Python module is as follows.
> > 
> > import tsharkPY
> > 
> > # Fork tshark. tshark will publish its dissections to the 'tshark_pipe'
> > # FIFO. It will read and dissect 3 packets from mycapfile.
> > tsharkPY.run(["python", "-W", "tshark_pipe", "-c", "3", "-r", "mycapfile"])
> > 
> > # Subscribe to the 'tshark_pipe' FIFO.
> > tsharkPY.subscribe("tshark_pipe")
> > 
> > packets = []
> > 
> > # Grab packets one at a time from tshark and save them in the 'packets' list.
> > while True:
> >     # Get the next packet from the 'tshark_pipe' FIFO.
> >     p = tsharkPY.get_next_packet("tshark_pipe")
> > 
> >     # Check for closed pipe/EOF and break out of the loop when detected.
> >     if p is None:
> >         # Unsubscribe from the 'tshark_pipe' FIFO. This cleans up the FIFO
> >         # file and does some other housekeeping.
> >         tsharkPY.unsubscribe("tshark_pipe")
> >         break
> > 
> >     # Create the protocol set, array, and dictionary objects and make them
> >     # part of the t_dissect_node object.
> >     p.create_protocol_containers()
> > 
> >     # Create a dictionary containing the field names of all the nodes in
> >     # the packet tree that has 'p' as its root.
> >     p.create_node_dict()
> > 
> >     # Append the t_dissect_node object to the 'packets' list.
> >     packets.append(p)
> > 
> > 
> > print "Protocol sets: unordered list of protocols found in packet."
> > # Iterate over the list of t_dissect_node trees. Each tree is one packet's
> > # worth of data.
> > for packet in packets:
> >     # Iterate over each protocol name (string) in the packet's protocol set.
> >     for proto in packet.protocol_set:
> >         print proto,
> >     print  # end the line for this packet
> > 
> > print "\nProtocol array: ordered array of protocol-level t_dissect_node references."
> > for packet in packets:
> >     # Iterate over the t_dissect_node references in the packet's protocol array.
> >     for node in packet.protocol_array:
> >         # Print field_name only if it exists (is not NULL).
> >         if node.field_name is not None:
> >             print node.field_name,
> >     print
> >     
> > print ("\nProtocol dictionary: hash table indexed by protocol name. "
> >        "Provides access to t_dissect_node references for protocol level "
> >        "nodes in dissection tree.")
> > for packet in packets:
> >     # Get the key list for the packet's protocol_dict object.
> >     d_keys = packet.protocol_dict.keys()
> >     # Iterate over the key values.
> >     for k in d_keys:
> >         # Get a reference to each protocol-level node in turn.
> >         node = packet.protocol_dict[k]
> >         # If the lookup with the current key succeeded, print the node's
> >         # field_name.
> >         if node is not None and node.field_name is not None:
> >             print node.field_name,
> >     print
> > 
> > print "\nPacket debug print"
> > for packet in packets:
> >     # Print the t_dissect_node tree info for the current packet.
> >     packet.print_tree()
> > 
> > print "\nPacket data as Python char list."
> > for packet in packets:
> >     try:
> >         # Find a node in the tree that probably has data.
> >         p = packet.first_child.next.last_child
> >         # Get the node data as a list of chars and print it.
> >         data_list = p.binary_blob
> >         print data_list
> >     except Exception:
> >         # Ignore any exceptions thrown by the code above.
> >         pass
> >     
> > print ("\nNode dictionary: dictionary that hashes all nodes in node "
> >        "tree by their field names (if defined). If duplicate field_names "
> >        "exist, only the first one encountered is added.")
> > for packet in packets:
> >     d_keys = packet.node_dict.keys()   # get the key list
> >     for k in d_keys:
> >         print k,                       # print each key
> >     print "\n"
> > 
> > print "\nFind node by its field name. Looking for 'ip.dst_host' in second packet"
> > # Find the node in the second packet whose field_name is 'ip.dst_host'.
> > node = packets[1].node_dict['ip.dst_host']
> > # If found, print some fields from the t_dissect_node structure.
> > if node is not None:
> >     print node.field_name + " found! Showname is '" + node.showname + "'"
> > print
> > 
> > 
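The node dictionary rule described in the message above (duplicate field names keep only the first node encountered) can be illustrated with a small Python 3 sketch. It is only a stand-in: the DissectNode class, its attributes, and build_node_dict are hypothetical names, and the depth-first pre-order traversal is an assumption, since the thread does not say in which order libtshark visits nodes.

```python
class DissectNode:
    """Hypothetical stand-in for a t_dissect_node; not the real SWIG object."""

    def __init__(self, field_name=None, showname=None):
        self.field_name = field_name
        self.showname = showname
        self.children = []

    def add(self, child):
        """Attach a child node and return it, so calls can be chained."""
        self.children.append(child)
        return child

    def walk(self):
        """Yield this node and all descendants, depth-first, parents first."""
        yield self
        for child in self.children:
            yield from child.walk()


def build_node_dict(root):
    """Map each field name to the first node that carries it."""
    node_dict = {}
    for node in root.walk():
        if node.field_name is not None:
            # setdefault leaves an existing entry alone: first one wins.
            node_dict.setdefault(node.field_name, node)
    return node_dict


if __name__ == "__main__":
    root = DissectNode()
    ip = root.add(DissectNode("ip", "Internet Protocol"))
    ip.add(DissectNode("ip.addr", "Source: 10.0.0.1"))
    ip.add(DissectNode("ip.addr", "Destination: 10.0.0.2"))
    nodes = build_node_dict(root)
    print(nodes["ip.addr"].showname)
    # dict.get returns None for a missing field instead of raising KeyError.
    print(nodes.get("ip.dst_host"))
```

With a plain Python dict, looking a field up through get() keeps a later "is not None" check meaningful, because a direct index on a missing key raises KeyError first.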
> > 
> > ------------------------------
> > 
> > Message: 5
> > Date: Wed, 18 Aug 2010 11:05:37 -0700
> > From: Guy Harris <guy@xxxxxxxxxxxx>
> > Subject: Re: [Wireshark-dev] libtshark + scripting language support
> > To: Developer support list for Wireshark <wireshark-dev@xxxxxxxxxxxxx>
> > Message-ID: <B766DD58-4AA7-42FE-8CF9-5B36656FFAF9@xxxxxxxxxxxx>
> > Content-Type: text/plain; charset=us-ascii
> > 
> > 
> > On Aug 18, 2010, at 10:34 AM, Mark Landriscina wrote:
> > 
> > > I'd like to contribute some work that I've done to the wireshark 
> > > community and need some advice on the best way to do this, assuming 
> > > there is interest. If not, that would be good to know as well. I 
> > > suspect that it might be best to fork this off as a separate project 
> > > vs. incorporating it directly into ongoing SVN builds.
> > > 
> > > My initial goal was to modify tshark (command line wireshark) 
> > > and wrap it as a Python module. I wanted to expose tshark dissections 
> > > as Python objects during packet capture or capture file processing. In 
> > > addition to this, I found that it was quite easy to extend this idea a 
> > > bit more, so that other scripting languages (in addition to Python) 
> > > could leverage the same code base. See below for details.
> > > 
> > > My motivation was that I wanted to do some work with Scapy and 
> > > needed to access application layer protocol dissections within Python 
> > > without re-writing all the dissection code already available in 
> > > tshark/wireshark. 
> > > 
> > > This is what I have done to date (all Linux for now,
> > 
> > ...which hopefully really means "all UN*X for now", so that it largely 
> > Just Works on Solaris, *BSD, Mac OS X, HP-UX, etc.
> > 
> > > but am porting to Windows):
> > > 
> > > a. Modified tshark code base and compiled it as a library, 
> > > libtshark.a. This is the original tshark executable, more or less, 
> > > with some notable additions. In particular, after packet dissection, 
> > > the epan dissection tree data is copied off into another tree 
> > > structure that I've defined.
> > 
> > The tshark executable image, by default, actually contains no code to 
> > parse packets or to read capture files; it's linked with two 
> > dynamically linked libraries, libwireshark (which contains all the 
> > dissection code) and libwiretap (which contains all the capture-file 
> > reading code).
> > 
> > What code other than that code is in your libtshark.a?  Or does 
> > anything linked with libtshark.a also have to be linked with 
> > libwireshark and libwiretap?
> > 
> > > This t_dissect_node tree is then serialized and written out over a 
> > > named-pipe. The name of the named-pipe is defined by the user at 
> > > run-time. The code to unserialize the t_dissect_node tree is also part 
> > > of libtshark.a.
> > 
> > So what's the reason for the named pipe?
> > 
> > 
> > ------------------------------
> > 
> > Message: 6
> > Date: Wed, 18 Aug 2010 14:22:22 -0400
> > From: Eloy Paris <peloy@xxxxxxxxxx>
> > Subject: Re: [Wireshark-dev] libtshark + scripting language support
> > To: Developer support list for Wireshark <wireshark-dev@xxxxxxxxxxxxx>
> > Message-ID: <4C6C24DE.3090309@xxxxxxxxxx>
> > Content-Type: text/plain; charset=UTF-8; format=flowed
> > 
> > Hi Mark,
> > 
> > On 08/18/2010 01:34 PM, Mark Landriscina wrote:
> > 
> > [...]
> > 
> > > My motivation was that I wanted to do some work with Scapy and needed
> > > to access application layer protocol dissections within Python
> > > without re-writing all the dissection code already available in
> > > tshark/wireshark.
> > 
> > I am not a Python guy but my understanding is that there is Python 
> > support in Wireshark trunk (perhaps in 1.4.x). Did you look into that 
> > and determine that it wasn't good enough for what you need? Just curious.
> > 
> > > a. Modified tshark code base and compiled it as a library,
> > > libtshark.a. This is the original tshark executable, more or less,
> > > with some notable additions. In particular, after packet dissection,
> > > the epan dissection tree data is copied off into another tree
> > > structure that I've defined. This t_dissect_node tree is then
> > > serialized and written out over a named-pipe. The name of the
> > > named-pipe is defined by the user at run-time. The code to
> > > unserialize the t_dissect_node tree is also part of libtshark.a.
> > > Also, I have incorporated some additional helper code that makes tree
> > > navigation easier. A function named 'run' is called to start tshark
> > > and accepts as parameters tshark command line args.
> > 
> > Any reason you chose to integrate tshark instead of libwireshark, which 
> > is what does all the dissection work, as Guy mentioned? I would guess 
> > that it is because it is easier to execute tshark than to fully 
> > integrate libwireshark, but then I don't understand why you need to 
> > make tshark a library instead of just executing it from within Python.
> > 
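For comparison, the "just executing it from within Python" alternative raised above can be sketched in Python 3 by running the stock tshark binary with PDML output and parsing the XML. The helper names parse_pdml and dissect_file are hypothetical, the sketch assumes tshark is on the PATH, and keeping only the first occurrence of each field name is a simplification chosen here, not something tshark does.

```python
import subprocess
import xml.etree.ElementTree as ET


def parse_pdml(pdml_text):
    """Return one dict per packet, mapping field name to its 'show' value."""
    root = ET.fromstring(pdml_text)
    packets = []
    for packet in root.iter("packet"):
        fields = {}
        for field in packet.iter("field"):
            name = field.get("name")
            if name:
                # Keep only the first occurrence of a repeated field name.
                fields.setdefault(name, field.get("show"))
        packets.append(fields)
    return packets


def dissect_file(capture_path, count=3):
    """Run tshark on a capture file and parse its PDML output."""
    result = subprocess.run(
        ["tshark", "-r", capture_path, "-c", str(count), "-T", "pdml"],
        stdout=subprocess.PIPE,
        check=True,
    )
    return parse_pdml(result.stdout)


if __name__ == "__main__":
    # A hand-written PDML fragment, so the demo does not need tshark installed.
    sample = (
        '<pdml><packet><proto name="ip">'
        '<field name="ip.dst_host" showname="Destination Host: 10.0.0.2" '
        'show="10.0.0.2"/>'
        "</proto></packet></pdml>"
    )
    print(parse_pdml(sample))
```

This route needs no changes to the Wireshark build, at the cost of producing and parsing text, which is the overhead the binary export over a named pipe was meant to avoid.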
> > I actually had a similar need and my approach was to interface with 
> > libwireshark. You can check out my work at 
> > 
> > Cheers,
> > 
> > Eloy Paris.-
> > netexpect.org
> > 
> > 
> > ------------------------------
> > 
> > _______________________________________________
> > Wireshark-dev mailing list
> > Wireshark-dev@xxxxxxxxxxxxx
> > 
> > 
> > 
> > End of Wireshark-dev Digest, Vol 51, Issue 22
> > *********************************************
> ___________________________________________________________________________
> Sent via:    Wireshark-dev mailing list <wireshark-dev@xxxxxxxxxxxxx>
> Archives:    
> Unsubscribe: 
>              