Wireshark-dev: Re: [Wireshark-dev] [Wireshark-commits] rev 40877: /trunk/epan/dissectors/ /trun

From: Jeff Morriss <jeff.morriss.ws@xxxxxxxxx>
Date: Thu, 09 Feb 2012 11:20:50 -0500
Joerg Mayer wrote:
On Wed, Feb 08, 2012 at 09:16:48AM -0500, Jeff Morriss wrote:
Joerg Mayer wrote:
[...]
So more than half of all the stuff is added by using proto_tree_add_text.
As long as the ratio is that way, people are likely to continue using it
inside this dissector.
Any volunteer(s) to get this down to some sane level by replacing it by
proto_tree_add_item and adding hf_ entries where possible to make these
Elements filterable?

Should something like the above check be added to one of the check scripts
to complain if the add_text percentage is above 10% or so?
Done in r40930

Good!

though there's a lot of dissectors with the problem so I chose 50% as the warning level for now.

Makes sense.

And the code isn't super fast either.

Why three loops: Just have one for proto_tree_add_[a-z]+. Use this for
totals and if there is a match, check whether is is proto_tree_add_text
and if it is, check for for var = proto_tree_add_text. That should be
"good enough" (or do we loose lots of multiline machtes by this).

I (have now) tried a one-loop approach (see attached) but it is, interestingly, many times slower than the 3-loop approach (I've seen from ~3-20 *times* slower). My regexps book gives some hints, when explaining non-capturing parenthesis, that capturing parenthesis can be expensive, but wow... (Then again, I guess there _are_ lots of variable assignments that lead to false matches of the first half of that expression; switching that to non-capturing and just capturing the whole assignment doesn't help though.)

I guess switching to a line-by-line search (like checkhf does) would help, even if it means we miss some multi-line matches; of course then it really would make sense to put it in checkhf like Alexis suggested.
--- tools/checkAPIs.pl	2012-02-08 21:12:36.000000000 -0500
+++ tools/checkAPIs.pl.new	2012-02-09 10:33:49.000000000 -0500
@@ -1307,26 +1307,25 @@
 {
 	my ($fileContentsRef, $filename) = @_;
 	my $add_text_count = 0;
-	my $okay_add_text_count = 0;
 	my $add_xxx_count = 0;
 
-	# First count how many proto_tree_add_text() calls there are in total
-	while (${$fileContentsRef} =~ m/ \W* proto_tree_add_text \W* \( /gox) {
-		$add_text_count++;
-	}
-	# Then count how many of them are "okay" by virtue of their generate proto_item
-	# being used (e.g., to hang a subtree off of)
-	while (${$fileContentsRef} =~ m/ \W* [a-zA-Z0-9]+ \W* = \W* proto_tree_add_text \W* \( /gox) {
-		$okay_add_text_count++;
-	}
-	# Then count how many proto_tree_add_*() calls there are
-	while (${$fileContentsRef} =~ m/ \W proto_tree_add_[a-z0-9]+ \W* \( /gox) {
+	my @matches = ${$fileContentsRef} =~ m/ \W* ([A-Za-z0-9_-]+ \W* =)? \W* (proto_tree_add_[a-z0-9_]+) \W* \( /gox;
+
+	while (@matches) {
+		my ($var, $func) = @matches;
+		shift @matches; shift @matches;
+		
+		if ($func eq "proto_tree_add_text") {
+			if (!$var) {
+				$add_text_count++;
+				next;
+			}
+		}
+
 		$add_xxx_count++;
 	}
 
-	#printf "add_text_count %d, okay_add_text_count %d\n", $add_text_count, $okay_add_text_count;
-	$add_xxx_count -= $add_text_count;
-	$add_text_count -= $okay_add_text_count;
+	#printf "add_text_count %d, add_xxx_count %d\n", $add_text_count, $add_xxx_count;
 
 	# Don't bother with files with small counts
 	if ($add_xxx_count < 10 || $add_text_count < 10) {