Wireshark-dev: Re: [Wireshark-dev] display filter scanner.l possible weirdness

From: João Valverde <j@xxxxxx>
Date: Tue, 23 Aug 2022 10:30:01 +0100


On 8/22/22 14:42, Richard Sharpe wrote:
> Hi folks,
>
> In trying to introduce my contexts approach for display filters to
> handle embedded/recursive structures in 802.11 Information Elements
> (TLVs) I came across this in epan/dfilter/scanner.l:
>
> -----------------------------
> -               ([.][-+[:alnum:]_:]+)+[.]{0,2} |
> -[-+[:alnum:]_:]+([.][-+[:alnum:]_:]+)*[.]{0,2} {
> +              ([.][-+[:alnum:]_]+)+[.]{0,2} |
> +[-+[:alnum:]_]+([.][-+[:alnum:]_]+)*[.]{0,2} {
> ------------------------------
>
> Basically, the original scanner allowed colons (:) in field names. I
> had to change that since I needed to parse out colons separately in
> the grammar. It almost looks like someone made a mistake and assumed
> they needed ':]' in contexts where that was not necessary.
>
> I do not believe anyone uses colons in filter strings and did not
> think it was possible.
>
> Does anyone think this will be a problem?

It is a problem because that regex also has to match things other than fields: byte sequences, MAC addresses, and IPv6 addresses all use colons.
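
To illustrate the point, here is a rough sketch that approximates the two versions of the rule from the diff as Python regexes and tries them on a few literals. The translation is an assumption ([:alnum:] is taken as ASCII letters and digits), the helper name build() and the sample strings are made up for this example, and it looks at this one rule in isolation rather than the whole of scanner.l.

```python
import re

# Python approximations of the two flex character classes from the diff
# (assumption: [:alnum:] is treated as ASCII letters and digits).
WITH_COLON = r"[-+A-Za-z0-9_:]"
WITHOUT_COLON = r"[-+A-Za-z0-9_]"


def build(cls):
    """Translate the two-alternative scanner rule into one Python regex."""
    return re.compile(
        rf"(?:\.{cls}+)+\.{{0,2}}"          # alternative starting with a dot
        rf"|{cls}+(?:\.{cls}+)*\.{{0,2}}"   # alternative starting with a word
    )


old_rule = build(WITH_COLON)      # rule before the proposed change
new_rule = build(WITHOUT_COLON)   # rule after the proposed change

# Illustrative literals: a field name, a MAC address, an IPv6 address, bytes.
samples = ["wlan.fc.type", "00:11:22:33:44:55", "fe80::1", "01:02:03"]

for text in samples:
    old_ok = old_rule.fullmatch(text) is not None
    new_ok = new_rule.fullmatch(text) is not None
    print(f"{text:20} old={old_ok} new={new_ok}")
```

In this approximation only the field name is still matched as a single token once the colon is dropped from the character class; the three colon-separated literals no longer are.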

Is there a reason why you are not developing this on the master branch? That is odd.

And I urge you to come up with a design first that can garner some support. Maybe you could explain what "protocol contexts" are. I fail to see what makes contexts recursive.

> Also, are there automated tests for the dfilter stuff? I have been
> using dftest to test my changes but it would be good to see if I have
> disturbed anything.


Read README.test and try running "pytest -k dfilter" in the build directory.