Wireshark-dev: Re: [Wireshark-dev] display filter scanner.l possible weirdness

From: Richard Sharpe <realrichardsharpe@xxxxxxxxx>
Date: Tue, 23 Aug 2022 07:12:11 -0700
On Tue, Aug 23, 2022 at 6:56 AM João Valverde <j@xxxxxx> wrote:
>
> On 8/23/22 14:29, Richard Sharpe wrote:
> > On Tue, Aug 23, 2022 at 2:30 AM João Valverde <j@xxxxxx> wrote:
> >> On 8/22/22 14:42, Richard Sharpe wrote:
> >>> Hi folks,
> >>>
> >>> In trying to introduce my contexts approach for display filters to
> >>> handle embedded/recursive structures in 802.11 Information Elements
> >>> (TLVs) I came across this in epan/dfilter/scanner.l:
> >>>
> >>> -----------------------------
> >>> -               ([.][-+[:alnum:]_:]+)+[.]{0,2} |
> >>> -[-+[:alnum:]_:]+([.][-+[:alnum:]_:]+)*[.]{0,2} {
> >>> +              ([.][-+[:alnum:]_]+)+[.]{0,2} |
> >>> +[-+[:alnum:]_]+([.][-+[:alnum:]_]+)*[.]{0,2} {
> >>> ------------------------------
> >>>
> >>> Basically, the original scanner allowed colons (:) in field names. I
> >>> had to change that since I needed to parse out colons separately in
> >>> the grammar. It almost looks like someone made a mistake and assumed
> >>> they needed ':]' in contexts where that was not necessary.
> >>>
> >>> I do not believe anyone uses colons in filter strings and did not
> >>> think it was possible.
> >>>
> >>> Does anyone think this will be a problem?
> >> It is a problem because that regex also has to match things other than
> >> fields. Bytes, MAC addresses, IPv6 addresses, those use colons.
> > Hmmm, I had not noticed that so I will have to find another way to do
> > what I want to do.
> >
> >> Is there a reason why you are not developing this on the master branch?
> >> That is odd.
> > I am doing it in a private branch but why does that matter?
>
> I know it is a private branch, but that branch is based on version 3.6.
> That matters if you plan to contribute your code to the project.
>
> >> And I urge you to come up with a design first that can garner some
> >> support. Maybe you could explain what "protocol contexts" are. I fail to
> >> see what makes contexts recursive.
> > In ieee802.11 now (or soon) there will be IEs that are contained in
> > other IEs. With IE defragmentation there can be quite a lot of them.
> >
> > Some people want to be able to say: Find me all the packets that have
> > IE xyz but only when it is within IE abc.
> >
> > Making that possible the simple way (by having different filter
> > strings for each possible context) is an enormous amount of work and
> > would bloat the code inordinately.
> >
> > An alternative is for the small number of places where it is
> > possible to have IEs inside IEs (and it will only be newer IEs
> > introduced in 802.11be) to add a context string that can be used to do
> > that filtering.
> >
>
> I get that but you can't expect to add new syntax for display filters
> that only works with an 802.11 IE. A language extension needs to be more
> generic than that. It needs to solve recursion for all dissectors, or at
> least be widely useful.

Hmmm, this seems to be a different objection from your earlier
one. In any event, the same approach can be used for any
protocol. The dissector would add the context as needed.
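
As a rough illustration of the kind of query described above ("IE xyz
but only when it is within IE abc"), here is a minimal sketch in Python
rather than Wireshark's C. Everything in it is hypothetical: the
Element class, the find_in_context function and the tree layout are
assumptions made for the example, not the proposed design and not any
existing Wireshark API.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Element:
    """Hypothetical stand-in for a dissected IE (not a Wireshark type)."""
    name: str
    children: list[Element] = field(default_factory=list)


def find_in_context(root: Element, target: str, context: str):
    """Yield each element named `target` that has an ancestor named `context`."""
    def walk(node: Element, inside: bool):
        if inside and node.name == target:
            yield node
        for child in node.children:
            # Once we descend below a `context` element, stay "inside".
            yield from walk(child, inside or node.name == context)

    yield from walk(root, False)


# One "xyz" nested in "abc", and one "xyz" at the top level of the frame.
frame = Element("frame", [
    Element("abc", [Element("xyz")]),
    Element("xyz"),
])

matches = list(find_in_context(frame, "xyz", "abc"))
print(len(matches))
```

Only the nested xyz is reported; the top-level one is skipped because
it has no abc ancestor. The point of the sketch is that a single
context check covers every nesting depth, instead of needing one
filter string per possible parent.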

I will write some more about this later.
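
For reference, João's point about colons quoted above can be checked
with a small Python approximation of the two scanner.l patterns. This
is a sketch under assumptions: Python's re module has no POSIX
[:alnum:] class, so ASCII A-Za-z0-9 stands in for it, and fullmatch()
is used to ask whether a whole token would be consumed by the rule.
The helper name build_token_re is made up for the example.

```python
import re

ALNUM = "A-Za-z0-9"  # ASCII stand-in for POSIX [:alnum:] (an assumption)


def build_token_re(extra: str = "") -> re.Pattern:
    """Approximate the scanner.l rule; `extra` adds characters to the class."""
    cls = f"[-+{ALNUM}_{extra}]"
    return re.compile(
        rf"(?:(?:[.]{cls}+)+|{cls}+(?:[.]{cls}+)*)[.]{{0,2}}"
    )


with_colon = build_token_re(":")   # original rule
without_colon = build_token_re()   # rule after removing ':'

tokens = ["wlan.fc.type", "00:11:22:33:44:55", "fe80::1"]
for tok in tokens:
    old = with_colon.fullmatch(tok) is not None
    new = without_colon.fullmatch(tok) is not None
    print(f"{tok!r}: with colon={old}, without colon={new}")
```

With the colon in the character class, all three tokens are consumed
whole. Without it, the field name still matches, but the MAC and IPv6
literals no longer match as a single token.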

-- 
Regards,
Richard Sharpe
(What can dispel sorrow? Only Du Kang. --Cao Cao) (Legend has it that Du Kang was the inventor of wine)