Wireshark-dev: Re: [Wireshark-dev] display filter scanner.l possible weirdness

From: João Valverde <j@xxxxxx>
Date: Tue, 23 Aug 2022 15:43:40 +0100


On 8/23/22 15:12, Richard Sharpe wrote:
On Tue, Aug 23, 2022 at 6:56 AM João Valverde <j@xxxxxx> wrote:
On 8/23/22 14:29, Richard Sharpe wrote:
On Tue, Aug 23, 2022 at 2:30 AM João Valverde <j@xxxxxx> wrote:
On 8/22/22 14:42, Richard Sharpe wrote:
Hi folks,

In trying to introduce my contexts approach for display filters to
handle embedded/recursive structures in 802.11 Information Elements
(TLVs) I came across this in epan/dfilter/scanner.l:

-----------------------------
-               ([.][-+[:alnum:]_:]+)+[.]{0,2} |
-[-+[:alnum:]_:]+([.][-+[:alnum:]_:]+)*[.]{0,2} {
+              ([.][-+[:alnum:]_]+)+[.]{0,2} |
+[-+[:alnum:]_]+([.][-+[:alnum:]_]+)*[.]{0,2} {
------------------------------

Basically, the original scanner allowed colons (:) in field names. I
had to change that since I needed to parse out colons separately in
the grammar. It almost looks like someone made a mistake and assumed
they needed ':]' in contexts where that was not necessary.

I do not believe anyone uses colons in filter strings and did not
think it was possible.

Does anyone think this will be a problem?
It is a problem because that regex also has to match things other than
fields. Bytes, MAC addresses, IPv6 addresses, those use colons.
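
The effect can be checked with a rough sketch outside of flex. Everything
here is an assumption made for illustration: Python's re module has no
POSIX bracket classes, so [:alnum:] is approximated as A-Za-z0-9, and
fullmatch() stands in for "the scanner consumes the whole literal as one
token". The helper names are hypothetical and are not part of scanner.l.

```python
import re

# [:alnum:] is approximated as A-Za-z0-9 (assumption: ASCII input only).
OLD_CLASS = r"[-+A-Za-z0-9_:]"  # character class with the colon, as before the patch
NEW_CLASS = r"[-+A-Za-z0-9_]"   # character class without the colon, as after the patch


def build(cls: str) -> re.Pattern:
    """Build a Python approximation of the two-alternative scanner rule."""
    return re.compile(
        rf"(?:[.]{cls}+)+[.]{{0,2}}|{cls}+(?:[.]{cls}+)*[.]{{0,2}}"
    )


OLD = build(OLD_CLASS)
NEW = build(NEW_CLASS)


def is_single_token(pattern: re.Pattern, text: str) -> bool:
    """True if the whole text is matched by the rule as one token."""
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    samples = [
        "wlan.fc.type",       # field name, no colons
        "00:11:22:33:44:55",  # MAC address
        "2001:db8::1",        # IPv6 address
        "de:ad:be:ef",        # byte string
    ]
    for text in samples:
        print(f"{text:20} old={is_single_token(OLD, text)} "
              f"new={is_single_token(NEW, text)}")
```

Under these assumptions the field name is accepted by both variants, while
the colon-separated literals are accepted only by the variant that keeps
the colon in the character class.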
Hmmm, I had not noticed that so I will have to find another way to do
what I want to do.

Is there a reason why you are not developing this on the master branch?
That is odd.
I am doing it in a private branch but why does that matter?
I know it is a private branch, but that branch is based on version 3.6.
That matters if you plan to contribute your code to the project.

And I urge you to come up with a design first that can garner some
support. Maybe you could explain what "protocol contexts" are. I fail to
see what makes contexts recursive.
In ieee802.11 now (or soon) there will be IEs that are contained in
other IEs. With IE defragmentation there can be quite a lot of them.

Some people want to be able to say: Find me all the packets that have
IE xyz but only when it is within IE abc.

Making that possible the simple way (by having different filter
strings for each possible context) is an enormous amount of work and
would bloat the code inordinately.

An alternative is for the small number of places where it is
possible to have IEs inside IEs (and it will only be newer IEs
introduced in 802.11be) to add a context string that can be used to do
that filtering.
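
The kind of query described above ("IE xyz, but only when it is within IE
abc") can be sketched on a toy tree. This is only an illustration of the
idea, not the proposed implementation: the IE class, the has_ie_within()
helper and the names "abc" and "xyz" are all hypothetical, and nothing
here reflects how Wireshark stores dissected fields.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IE:
    """Hypothetical stand-in for a dissected element that may contain others."""
    name: str
    children: List["IE"] = field(default_factory=list)


def has_ie_within(node: IE, target: str, parent: str, inside: bool = False) -> bool:
    """Return True if an element named `target` occurs beneath one named `parent`."""
    # Once we are at or below a `parent` element, every descendant is "in context".
    inside = inside or node.name == parent
    for child in node.children:
        if inside and child.name == target:
            return True
        if has_ie_within(child, target, parent, inside):
            return True
    return False


# "xyz" nested inside "abc", plus another "xyz" at the top level.
nested = IE("frame", [IE("abc", [IE("xyz")]), IE("xyz")])
# "xyz" present, but only next to "abc", never inside it.
flat = IE("frame", [IE("abc"), IE("xyz")])

if __name__ == "__main__":
    print(has_ie_within(nested, "xyz", "abc"))
    print(has_ie_within(flat, "xyz", "abc"))
```

A plain "contains xyz" test would accept both toy frames; only the
context-qualified test tells them apart, which is the distinction the
context string is meant to expose to the filter.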

I get that but you can't expect to add new syntax for display filters
that only works with an 802.11 IE. A language extension needs to be more
generic than that. It needs to solve recursion for all dissectors, or at
least be widely useful.
Hmmm, this seems to be a different objection to your earlier
objection. In any event the same approach can be used for any
protocol. The dissector would add the context as needed.

I will write some more about this later.


OK. The objection is the same. If your language extension proposal is
generic it should be relatively straightforward to define a "context"
without mentioning 802.11 or IEs. That's a relevant use case but the
scope of the discussion should be wider.