Wireshark-dev: Re: [Wireshark-dev] libtshark + scripting language support

From: Mark Landriscina <mlandri1@xxxxxxx>
Date: Wed, 18 Aug 2010 15:36:44 -0400
Guy,

You only need to link to libtshark.a; there is no need to link to libwireshark, etc. tshark.c actually contains a fair amount of other useful code that ties all the dissection stuff together nicely. My original approach was to draw on the libwireshark and libwiretap code directly, but I found that I was simply rewriting a basic version of tshark.

The reason for the named pipe was that I wanted to launch several instances of tshark from within Python, have them do different things, and then collect their dissections via separate data streams. Writing the dissection data over a named pipe seemed like a clean, painless way to do this. Additionally, I wanted a flexible way to export the dissection data in case I decide to do something else with this code, such as compiling libtshark as an executable (tshark) instead of a lib. I'd still be able to have the tshark executable export its dissection data to other applications in binary form (as opposed to printing it out in PDML format and parsing text).
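
To make the "one stream per instance" idea concrete, here is a minimal sketch in plain Python 3 (Unix only, since it relies on os.mkfifo). It does not use libtshark or tsharkPY at all: the writer threads merely stand in for tshark instances, and the 4-byte length prefix is an assumed framing chosen for illustration, not the actual t_dissect_node serialization. All function names here are hypothetical.

```python
import os
import struct
import tempfile
import threading


def write_records(path, records):
    """Stand-in for one tshark instance: write length-prefixed blobs to a FIFO."""
    # Opening a FIFO for writing blocks until a reader opens the other end.
    with open(path, "wb") as fifo:
        for blob in records:
            fifo.write(struct.pack("!I", len(blob)) + blob)


def read_exact(fifo, n):
    """Read exactly n bytes; return None on EOF or a truncated record."""
    chunks = []
    remaining = n
    while remaining:
        chunk = fifo.read(remaining)
        if not chunk:
            return None
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_records(path):
    """Yield each blob from the FIFO until the writer closes it."""
    with open(path, "rb") as fifo:
        while True:
            header = read_exact(fifo, 4)
            if header is None:
                return
            (length,) = struct.unpack("!I", header)
            blob = read_exact(fifo, length)
            if blob is None:
                return
            yield blob


def collect(streams):
    """Start one writer per named pipe and collect each stream separately."""
    results = {}
    with tempfile.TemporaryDirectory() as tmpdir:
        writers = []
        for name, records in streams.items():
            path = os.path.join(tmpdir, name)
            os.mkfifo(path)
            thread = threading.Thread(target=write_records, args=(path, records))
            thread.start()
            writers.append((name, path, thread))
        # Drain the pipes one after another; each writer waits for its reader.
        for name, path, thread in writers:
            results[name] = list(read_records(path))
            thread.join()
    return results
```

Calling collect({"pipe_a": [b"pkt1", b"pkt2"], "pipe_b": [b"xyz"]}) returns a dictionary with one list of blobs per pipe name. Because every producer has its own FIFO, the consumer never has to demultiplex interleaved output, which is the property that makes the named-pipe approach attractive here.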

I'm still playing around with the code and different ideas, so please feel free to share any ideas for better approaches.

----- Original Message -----
From: wireshark-dev-request@xxxxxxxxxxxxx
Date: Wednesday, August 18, 2010 3:00 pm
Subject: Wireshark-dev Digest, Vol 51, Issue 22
To: wireshark-dev@xxxxxxxxxxxxx


> Today's Topics:
> 
>    1. Wiki weirdness? (Jeff Morriss)
>    2. Re: Wiki weirdness? (Bill Meier)
>    3. Re: Wiki weirdness? (Gerald Combs)
>    4. libtshark + scripting language support (Mark Landriscina)
>    5. Re: libtshark + scripting language support (Guy Harris)
>    6. Re: libtshark + scripting language support (Eloy Paris)
> 
> 
> ----------------------------------------------------------------------
> 
> Message: 1
> Date: Wed, 18 Aug 2010 11:29:11 -0400
> From: Jeff Morriss <jeff.morriss.ws@xxxxxxxxx>
> Subject: [Wireshark-dev] Wiki weirdness?
> To: Developer support list for Wireshark <wireshark-dev@xxxxxxxxxxxxx>
> Message-ID: <4C6BFC47.1060207@xxxxxxxxx>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
> 
> 
> The top part of the Wiki (that has a kind of tool bar with links to the
> page's Info, etc.) has gotten "weird" for me: instead of lining up
> nicely the links are in a vertical list.
>
> It looks the same on Firefox and IE and doesn't change if I'm logged in
> or not.  Anyone else seeing this?
> 
> 
> ------------------------------
> 
> Message: 2
> Date: Wed, 18 Aug 2010 11:58:06 -0400
> From: Bill Meier <wmeier@xxxxxxxxxxx>
> Subject: Re: [Wireshark-dev] Wiki weirdness?
> To: Developer support list for Wireshark <wireshark-dev@xxxxxxxxxxxxx>
> Message-ID: <4C6C030E.8000806@xxxxxxxxxxx>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
> 
> Jeff Morriss wrote:
> > The top part of the Wiki (that has a kind of tool bar with links to the
> > page's Info, etc.) has gotten "weird" for me: instead of lining up
> > nicely the links are in a vertical list.
> >
> > It looks the same on Firefox and IE and doesn't change if I'm logged in
> > or not.  Anyone else seeing this?
> 
> Yep ....
> 
> 
> 
> ------------------------------
> 
> Message: 3
> Date: Wed, 18 Aug 2010 09:01:21 -0700
> From: Gerald Combs <gerald@xxxxxxxxxxxxx>
> Subject: Re: [Wireshark-dev] Wiki weirdness?
> To: Developer support list for Wireshark <wireshark-dev@xxxxxxxxxxxxx>
> Message-ID: <4C6C03D1.3070406@xxxxxxxxxxxxx>
> Content-Type: text/plain; charset=UTF-8
> 
> Bill Meier wrote:
> > Jeff Morriss wrote:
> >> The top part of the Wiki (that has a kind of tool bar with links to the
> >> page's Info, etc.) has gotten "weird" for me: instead of lining up
> >> nicely the links are in a vertical list.
> >>
> >> It looks the same on Firefox and IE and doesn't change if I'm logged in
> >> or not.  Anyone else seeing this?
> > 
> > Yep ....
> 
> It should be fixed now. I was experimenting with caching yesterday, and
> left a bad configuration in place.
> 
> 
> ------------------------------
> 
> Message: 4
> Date: Wed, 18 Aug 2010 13:34:55 -0400
> From: Mark Landriscina <mlandri1@xxxxxxx>
> Subject: [Wireshark-dev] libtshark + scripting language support
> To: wireshark-dev@xxxxxxxxxxxxx
> Message-ID: <7310dfcb3a2d.4c6be17f@xxxxxxxxxxxxxxxx>
> Content-Type: text/plain; CHARSET=US-ASCII
> 
> Hi,
> 
> I'd like to contribute some work that I've done to the wireshark 
> community and need some advice on the best way to do this, assuming 
> there is interest. If not, that would be good to know as well. I 
> suspect that it might be best to fork this off as a separate project 
> vs. incorporating it directly into ongoing SVN builds.
> 
> My initial goal was to modify tshark (command line wireshark) and
> wrap it as a Python module. I wanted to expose tshark dissections as
> Python objects during packet capture or capture file processing. In
> addition to this, I found that it was quite easy to extend this idea a
> bit more, so that other scripting languages (in addition to Python)
> could leverage the same code base. See below for details.
> 
> My motivation was that I wanted to do some work with Scapy and needed 
> to access application layer protocol dissections within Python without 
> re-writing all the dissection code already available in 
> tshark/wireshark. 
> 
> This is what I have done to date (all Linux for now, but am porting to 
> Windows):
> 
> a. Modified tshark code base and compiled it as a library, 
> libtshark.a. This is the original tshark executable, more or less, 
> with some notable additions. In particular, after packet dissection, 
> the epan dissection tree data is copied off into another tree 
> structure that I've defined. This t_dissect_node tree is then 
> serialized and written out over a named-pipe. The name of the 
> named-pipe is defined by the user at run-time. The code to unserialize 
> the t_dissect_node tree is also part of libtshark.a. Also, I have 
> incorporated some additional helper code that makes tree navigation 
> easier. A function named 'run' is called to start tshark and accepts 
> as parameters tshark command line args. 
> 
> b. A compiled Python shared library, _tsharkPY.so. I used SWIG to 
> generate the Python bindings. Hence one could take the SWIG interface 
> file that I wrote for Python (tsharkPY.i) and modify for use with 
> other SWIG supported languages: Ruby, Java, etc.
> 
> c. tsharkPY.py, the Python module file created by SWIG from my
> tsharkPY.i SWIG interface file.
> 
> All the above is based off of the most recent SVN builds and 
> generation of the two lib files above has been incorporated into the 
> existing Wireshark build process. Hence, all you have to do is run 
> 'make' and you get libtshark.a and _tsharkPY.so. 'make install' puts 
> these files into your Python lib path as defined by libtool. I do need 
> some help tweaking this, however. Right now, libtool wants to put 
> these in /usr/local/lib/python2.6/site-packages/. However, they need 
> to be placed in /usr/lib/python2.6/site-packages/. Any thoughts (other 
> than hard coding the correct path)?
> 
> Some basic Python code to use the Python module is as follows.
> 
> import tsharkPY
>
> # Fork tshark. tshark will publish its dissections to the 'tshark_pipe'
> # FIFO and will read and dissect 3 packets from mycapfile.
> tsharkPY.run(["python", "-W", "tshark_pipe", "-c", "3", "-r", "mycapfile"])
>
> # Subscribe to the 'tshark_pipe' FIFO.
> tsharkPY.subscribe("tshark_pipe")
>
> packets = []
>
> # Grab packets one at a time from tshark and save them in the 'packets' list.
> while True:
>     # Get the next packet from the 'tshark_pipe' FIFO.
>     p = tsharkPY.get_next_packet("tshark_pipe")
>
>     # Check for closed pipe/EOF and break out of the loop when detected.
>     if p is None:
>         # Unsubscribe from the 'tshark_pipe' FIFO. This cleans up the FIFO
>         # file and does some other housekeeping.
>         tsharkPY.unsubscribe("tshark_pipe")
>         break
>
>     # Create the protocol set, array, and dictionary objects and make them
>     # part of the t_dissect_node object.
>     p.create_protocol_containers()
>
>     # Create a dictionary containing the field names of all the nodes in
>     # the packet tree that has 'p' as its root.
>     p.create_node_dict()
>
>     # Append the t_dissect_node object to the 'packets' list.
>     packets.append(p)
> 
> 
> print "Protocol sets: unordered list of protocols found in packet."
> # Iterate over the list of t_dissect_node trees. Each tree is one packet's
> # worth of data.
> for packet in packets:
>     # Iterate over each protocol name (string) in the packet's protocol set.
>     for proto in packet.protocol_set:
>         print proto,
>     print
>
> print "\nProtocol array: ordered array of protocol-level t_dissect_node references."
> for packet in packets:
>     # Iterate over the t_dissect_node references in the packet's protocol array.
>     for node in packet.protocol_array:
>         # Print field_name only if it exists (is not NULL).
>         if node.field_name is not None:
>             print node.field_name,
>     print
>
> print ("\nProtocol dictionary: hash table indexed by protocol name. "
>        "Provides access to t_dissect_node references for protocol-level "
>        "nodes in the dissection tree.")
> for packet in packets:
>     # Dump the key list of the packet's protocol_dict object.
>     d_keys = packet.protocol_dict.keys()
>     for k in d_keys:
>         # Get a reference to each protocol-level node in turn.
>         node = packet.protocol_dict[k]
>         # If a node was retrieved for this key, print its field_name.
>         if node is not None and node.field_name is not None:
>             print node.field_name,
>     print
> 
> print "\nPacket debug print"
> for packet in packets:
>     # Print t_dissect_node tree info for the current packet.
>     packet.print_tree()
>
> print "\nPacket data as Python char list."
> for packet in packets:
>     try:
>         # Find a node in the tree that probably has data.
>         p = packet.first_child.next.last_child
>         # Get the node data as a list of chars and print it.
>         data_list = p.binary_blob
>         print data_list
>     except Exception:
>         # Ignore any exceptions thrown from the code above.
>         pass
>
> print ("\nNode dictionary: dictionary that hashes all nodes in node tree "
>        "by their field names (if defined). If duplicate field_names "
>        "exist, only the first one encountered is added.")
> for packet in packets:
>     # Dump the key list and print each key.
>     d_keys = packet.node_dict.keys()
>     for k in d_keys:
>         print k,
>     print "\n"
>
> print "\nFind node by its field name. Looking for 'ip.dst_host' in second packet"
> # Find the node in the second packet whose field_name is 'ip.dst_host'.
> node = packets[1].node_dict['ip.dst_host']
> if node is not None:
>     # If found, print some fields from the t_dissect_node structure.
>     print node.field_name + " found! Showname is '" + node.showname + "'"
> print
> 
> 
> 
> ------------------------------
> 
> Message: 5
> Date: Wed, 18 Aug 2010 11:05:37 -0700
> From: Guy Harris <guy@xxxxxxxxxxxx>
> Subject: Re: [Wireshark-dev] libtshark + scripting language support
> To: Developer support list for Wireshark <wireshark-dev@xxxxxxxxxxxxx>
> Message-ID: <B766DD58-4AA7-42FE-8CF9-5B36656FFAF9@xxxxxxxxxxxx>
> Content-Type: text/plain; charset=us-ascii
> 
> 
> On Aug 18, 2010, at 10:34 AM, Mark Landriscina wrote:
> 
> > I'd like to contribute some work that I've done to the wireshark
> > community and need some advice on the best way to do this, assuming
> > there is interest. If not, that would be good to know as well. I
> > suspect that it might be best to fork this off as a separate project
> > vs. incorporating it directly into ongoing SVN builds.
> >
> > My initial goal was to modify tshark (command line wireshark)
> > and wrap it as a Python module. I wanted to expose tshark dissections
> > as Python objects during packet capture or capture file processing. In
> > addition to this, I found that it was quite easy to extend this idea a
> > bit more, so that other scripting languages (in addition to Python)
> > could leverage the same code base. See below for details.
> >
> > My motivation was that I wanted to do some work with Scapy and
> > needed to access application layer protocol dissections within Python
> > without re-writing all the dissection code already available in
> > tshark/wireshark.
> >
> > This is what I have done to date (all Linux for now,
> 
> ...which hopefully really means "all UN*X for now", so that it largely 
> Just Works on Solaris, *BSD, Mac OS X, HP-UX, etc.
> 
> > but am porting to Windows):
> > 
> > a. Modified tshark code base and compiled it as a library,
> > libtshark.a. This is the original tshark executable, more or less,
> > with some notable additions. In particular, after packet dissection,
> > the epan dissection tree data is copied off into another tree
> > structure that I've defined.
> 
> The tshark executable image, by default, actually contains no code to 
> parse packets or to read capture files; it's linked with two 
> dynamically linked libraries, libwireshark (which contains all the 
> dissection code) and libwiretap (which contains all the capture-file 
> reading code).
> 
> What code other than that code is in your libtshark.a?  Or does 
> anything linked with libtshark.a also have to be linked with 
> libwireshark and libwiretap?
> 
> > This t_dissect_node tree is then serialized and written out over a
> > named-pipe. The name of the named-pipe is defined by the user at
> > run-time. The code to unserialize the t_dissect_node tree is also part
> > of libtshark.a.
> 
> So what's the reason for the named pipe?
> 
> 
> ------------------------------
> 
> Message: 6
> Date: Wed, 18 Aug 2010 14:22:22 -0400
> From: Eloy Paris <peloy@xxxxxxxxxx>
> Subject: Re: [Wireshark-dev] libtshark + scripting language support
> To: Developer support list for Wireshark <wireshark-dev@xxxxxxxxxxxxx>
> Message-ID: <4C6C24DE.3090309@xxxxxxxxxx>
> Content-Type: text/plain; charset=UTF-8; format=flowed
> 
> Hi Mark,
> 
> On 08/18/2010 01:34 PM, Mark Landriscina wrote:
> 
> [...]
> 
> > My motivation was that I wanted to do some work with Scapy and needed
> > to access application layer protocol dissections within Python
> > without re-writing all the dissection code already available in
> > tshark/wireshark.
> 
> I am not a Python guy but my understanding is that there is Python
> support in Wireshark trunk (perhaps in 1.4.x). Did you look into that
> and determine that it wasn't good enough for what you need? Just curious.
> 
> > a. Modified tshark code base and compiled it as a library,
> > libtshark.a. This is the original tshark executable, more or less,
> > with some notable additions. In particular, after packet dissection,
> > the epan dissection tree data is copied off into another tree
> > structure that I've defined. This t_dissect_node tree is then
> > serialized and written out over a named-pipe. The name of the
> > named-pipe is defined by the user at run-time. The code to
> > unserialize the t_dissect_node tree is also part of libtshark.a.
> > Also, I have incorporated some additional helper code that makes tree
> > navigation easier. A function named 'run' is called to start tshark
> > and accepts as parameters tshark command line args.
> 
> Any reason you chose to integrate tshark instead of libwireshark, which
> is what does all the dissection work, as Guy mentioned? I would guess
> that it is because it is easier to execute tshark than to fully
> integrate libwireshark, but then I don't understand why you need to make
> tshark a library instead of just executing it from within Python.
> 
> I actually had a similar need and my approach was to interface with 
> libwireshark. You can check out my work at 
> 
> Cheers,
> 
> Eloy Paris.-
> netexpect.org
> 
> 
> ------------------------------
> 
> 
> End of Wireshark-dev Digest, Vol 51, Issue 22
> *********************************************