Ethereal-dev: Re: [Ethereal-dev] a modest proposal (range filtering RFC)

Note: This archive is from the project's previous web site, ethereal.com. This list is no longer active.

From: Gilbert Ramirez <gram@xxxxxxxxxx>
Date: Mon, 18 Dec 2000 21:25:36 -0600
On Sun, 17 Dec 2000 18:37:32 -0500 (EST)
Ed Warnicke <hagbard@xxxxxxxxxxxxxxxxxxxx> wrote:

> I've been looking through the dfilter code a bit, and would like to 
> propose some fairly substantial changes.
> 

I agree that it does need a lot of work.

<snip>

> 2)	[i ] 
> 	Where i is the offset and the length is implied to be to 
> 	the end of the RHS value given (or the length of the field 
> 	types being compared).  
> 
> 
> The problem with 2 is that the range operator [] doesn't 
> actually bind to the LHS variable to which it appears to be 
> attached, but rather binds to both the LHS (through the offset) and 
> to the RHS (through the implied length of the RHS).
> I find this quite counter intuitive.  Second it would 
> be nice to use the range operator [] to be able to specify a 
> single particular element in a sequence for use in a relation.
> For example it would be nice if 
> 
> bootp.hw.addr[3] 
> 
> actually referred to the element at offset 3 in the sequence of bytes
> making up the variable bootp.hw.addr; instead it appears to compare whatever the 
> RHS value in the relation is to the bootp.hw.addr variable, starting at 
> the third element and continuing out to the length of the RHS.  This
> seems highly counterintuitive to me.

Part of my thinking on implementing that was the fact that the length (or
final offset - 1) in the slice operator on the LHS does not have to be specified
when there are a countable number of values on the RHS. You're right that it
did come out somewhat counterintuitive.

When testing a slice of an ethernet address, e.g., I could say:

bootp.hw.addr[0:3] == 00:00:f6

But since the RHS is countable by the computer, I could just say:

bootp.hw.addr[0] == 00:00:f6

I do agree that it is counterintuitive w/ regard to other programming languages,
but we need to come up with a way to code this. This is different than:

bootp.hw.addr[0:] == 00:00:f6

since [0:] would mean "from 0 to the end of the field", which would produce
6 bytes, whereas "00:00:f6" is only 3 bytes long. See the subtle difference?
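
To make the difference concrete, here is a minimal Python sketch of the three
forms. It is only an illustration, not the dfilter implementation: the helper
names and the sample hardware address are hypothetical.

```python
# Illustration only; these helpers are hypothetical, not Ethereal's dfilter code.
HW_ADDR = bytes.fromhex("0000f6a1b2c3")  # made-up 6-byte hardware address

def parse_rhs(text):
    """Turn a colon-separated hex string such as '00:00:f6' into bytes."""
    return bytes(int(part, 16) for part in text.split(":"))

def match_implied_length(field, offset, rhs_text):
    """field[offset] == rhs, with the slice length taken from the RHS."""
    rhs = parse_rhs(rhs_text)
    return field[offset:offset + len(rhs)] == rhs

def match_offset_length(field, offset, length, rhs_text):
    """field[offset:length] == rhs, with an explicit length."""
    return field[offset:offset + length] == parse_rhs(rhs_text)

def match_to_end(field, offset, rhs_text):
    """field[offset:] == rhs, slicing to the end of the field."""
    return field[offset:] == parse_rhs(rhs_text)

implied = match_implied_length(HW_ADDR, 0, "00:00:f6")     # 3 bytes vs 3 bytes
explicit = match_offset_length(HW_ADDR, 0, 3, "00:00:f6")  # 3 bytes vs 3 bytes
to_end = match_to_end(HW_ADDR, 0, "00:00:f6")              # 6 bytes vs 3 bytes
```

With this sample address the first two comparisons match and the third does
not, because [0:] yields all 6 bytes of the field.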


> I would propose a move towards a python like standard for ranges through 
> the following:
> 1)	Ranges of the form [i] denote the element of the 
> 	sequence at offset i, so 
> 	bootp.hw.addr[3] 
> 	would refer to the byte at offset 3 in the bootp.hw.addr variable.

agreed.

> 
> 2)	Ranges of the form [i:] denote all elements in the sequence  from
>  	the offset i to the end of the sequence, so 
> 	bootp.hw.addr[3:]
> 	would refer to all bytes in the bootp.hw.addr variable 
> 	from the offset 3 to the last byte (inclusive) in the bootp.hw.addr 
> 	variable. 

agreed
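
For reference, this is roughly how forms 1 and 2 behave in Python itself. It
is a sketch on a made-up address, and whether a filter language should treat
[i] as a one-byte sequence or as a number is an open assumption here.

```python
# Python's own behavior on a made-up 6-byte address (illustration only).
ADDR = bytes.fromhex("0000f6a1b2c3")

element = ADDR[3]    # indexing a bytes object yields an int
single = ADDR[3:4]   # a one-byte sequence holding the same element
tail = ADDR[3:]      # everything from offset 3 to the end of the sequence
```

Note that Python offsets are zero-based, so offset 3 is the fourth byte.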

> 
> 3)	Ranges of the form [i:j] denote the elements of the sequence from 
> 	the offset i to the offset j-1 (if j is positive).  So 
> 	bootp.hw.addr[3:6] 
> 	would denote the byte sequence from the byte at offset 3 of
> 	bootp.hw.addr to the byte at offset 5 (inclusive) of the bootp.hw.addr
> 	variable.

This might be a point of contention; this might be a religious point.
Should j refer to length or final offset? What are the advantages/
disadvantages of both? 

Most of the time, the RHS of a byte slice comparison will be countable
by the computer, so explicitly specifying a j argument wouldn't
be necessary (if we can come up with a good syntax for that).

Perhaps it's just me, but I find the [offset:length] slice easier to
comprehend, at least in the context of packet analysis, than
[start_offset:final_offset-1]. But this is subjective.
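
The two readings of [i:j] can be compared side by side with a small Python
sketch. The function names and the sample address are hypothetical, chosen
only to show that the same pair of numbers selects different bytes under each
convention.

```python
# Illustration only; function names and sample data are hypothetical.
ADDR = bytes.fromhex("0000f6a1b2c3")

def slice_start_end(data, start, end):
    """Python-style reading: bytes at offsets start .. end-1."""
    return data[start:end]

def slice_offset_length(data, offset, length):
    """Offset:length reading: `length` bytes starting at `offset`."""
    return data[offset:offset + length]

# The same written slice, [1:3], under each reading:
as_start_end = slice_start_end(ADDR, 1, 3)          # offsets 1 and 2
as_offset_length = slice_offset_length(ADDR, 1, 3)  # offsets 1, 2 and 3
```

Under the start:end reading [1:3] selects 2 bytes; under the offset:length
reading it selects 3, so the slice length equals j directly.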

> 
> 6)	Ranges will be bound to the variable directly to their left.
> 	There will no longer be binding partially to the variable 
> 	in a relation and partially to the value in a relation. 

(see my note above about letting the computer do the counting)

--gilbert