On Sun, 17 Dec 2000 18:37:32 -0500 (EST)
Ed Warnicke <hagbard@xxxxxxxxxxxxxxxxxxxx> wrote:
> I've been looking through the dfilter code a bit, and would like to
> propose some fairly substantial changes.
>
I agree that it does need a lot of work.
<snip>
> 2) [i ]
> Where i is the offset and the length is implied to be to
> the end of the RHS value given (or the length of the field
> types being compared).
>
>
> The problem with 2 is that the range operator [] doesn't
> actually bind to the LHS variable to which it appears to be
> attached, but rather binds to both the LHS (through the offset) and
> to the RHS (through the implied length of the RHS).
> I find this quite counterintuitive. Second, it would
> be nice to use the range operator [] to be able to specify a
> single particular element in a sequence for use in a relation.
> For example it would be nice if
>
> bootp.hw.addr[3]
>
> actually referred to the element at offset 3 in the sequence of bytes
> making up the variable bootp.hw.addr; instead it appears to compare whatever the
> RHS value in the relation is to the bootp.hw.addr variable, starting at
> offset 3 and continuing out to the length of the RHS. This
> seems highly counterintuitive to me.
Part of my thinking on implementing that was the fact that the length (or
final offset - 1) in the slice operator on the LHS does not have to be specified
when there are a countable number of values on the RHS. You're right that it
did come out somewhat counterintuitive.
When testing a slice of an ethernet address, e.g., I could say:
bootp.hw.addr[0:3] == 00:00:f6
But since the RHS is countable by the computer, I could just say:
bootp.hw.addr[0] == 00:00:f6
I do agree that it is counterintuitive w/ regard to other programming languages,
but we need to come up with a way to code this. This is different than:
bootp.hw.addr[0:] == 00:00:f6
since [0:] would mean "from 0 to the end of the field", which would produce
6 bytes, whereas "00:00:f6" is only 3 bytes long. See the subtle difference?
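To make the difference concrete, here is a rough Python sketch of the three
forms. It is not dfilter code: the 6-byte address value and the helper name
implied_length_match are made up for illustration, and the helper only models
my description above of how [0] currently takes its length from the RHS.

```python
# Hypothetical 6-byte hardware address; the value is invented for illustration.
hw_addr = bytes.fromhex("0000f6123456")
rhs = bytes.fromhex("0000f6")  # the RHS "00:00:f6", which is 3 bytes long


def implied_length_match(field, offset, rhs):
    """Model of field[offset] == rhs where the length is implied by the RHS."""
    return field[offset:offset + len(rhs)] == rhs


explicit = hw_addr[0:3] == rhs                   # bootp.hw.addr[0:3] == 00:00:f6
implied = implied_length_match(hw_addr, 0, rhs)  # bootp.hw.addr[0]   == 00:00:f6
to_end = hw_addr[0:] == rhs                      # bootp.hw.addr[0:]  == 00:00:f6

print(explicit, implied, to_end)
```

The first two compare 3 bytes against 3 bytes; the last compares all 6 bytes
of the field against a 3-byte value, so it cannot match.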
> I would propose a move towards a python like standard for ranges through
> the following:
> 1) Ranges of the form [i] denote the element of the
> sequence at offset i, so
> bootp.hw.addr[3]
> would refer to the byte at offset 3 in the bootp.hw.addr variable.
agreed.
>
> 2) Ranges of the form [i:] denote all elements in the sequence from
> the offset i to the end of the sequence, so
> bootp.hw.addr[3:]
> would refer to all the bytes in the bootp.hw.addr variable
> from the offset 3 to the last byte (inclusive) in the bootp.hw.addr
> variable.
agreed
>
> 3) Ranges of the form [i:j] denote the elements of the sequence from
> the offset i to the offset j-1 (if j is positive). So
> bootp.hw.addr[3:6]
> would denote the byte sequence from the byte at offset 3 of
> bootp.hw.addr to the byte at offset 5 (inclusive) of the bootp.hw.addr
> variable.
This might be a point of contention; this might be a religious point.
Should j refer to length or final offset? What are the advantages/
disadvantages of both?
Most of the time, the RHS of a byte slice comparison will be countable
by the computer, so explicitly specifying a j argument wouldn't
be necessary (if we can come up with a good syntax for that).
Perhaps it's just me, but I find the [offset:length] slice easier to
comprehend, at least in the context of packet analysis, than
[start_offset:final_offset-1]. But this is subjective.
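For comparison, a small Python sketch of the two readings of the second
argument. The function names slice_by_length and slice_by_end and the address
value are hypothetical, chosen only to illustrate the question; neither is an
existing dfilter routine.

```python
def slice_by_length(field, offset, length):
    """[offset:length] reading: take `length` bytes starting at `offset`."""
    return field[offset:offset + length]


def slice_by_end(field, start, end):
    """Python-style [start:end] reading: bytes from `start` up to `end` - 1."""
    return field[start:end]


# Hypothetical 6-byte hardware address; the value is invented for illustration.
hw_addr = bytes.fromhex("0000f6123456")

by_length = slice_by_length(hw_addr, 3, 3)  # last three bytes, written as 3:3
by_end = slice_by_end(hw_addr, 3, 6)        # same three bytes, written as 3:6

# The text "3:6" read as offset:length asks for 6 bytes from offset 3, which
# runs past the end of a 6-byte field. Python truncates silently here; a
# filter engine would have to decide whether to truncate or reject.
overrun = slice_by_length(hw_addr, 3, 6)

print(by_length.hex(), by_end.hex(), overrun.hex())
```

Both readings can express the same bytes; they differ in what the user has to
compute by hand (a length versus an end offset).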
>
> 6) Ranges will be bound to the variable directly to their left.
> There will no longer be binding partially to the variable
> in a relation and partially to the value in a relation.
(see my note above about letting the computer do the counting)
--gilbert