On Thursday, July 10, 2003, at 1:36 AM, Biot Olivier wrote:
From: Guy Harris
A scheme wherein the scanner did as little work as possible, and the
interpretation of tokens was done by, for example, code associated with
a given FT_ type, might be more easily extensible.
But then you lose the flexibility of doing, e.g., byte pattern
searches in an FT_STRING.
What do you mean by "byte pattern searches"? Are you talking about
exact matches, or "contains", or regular-expression matches or
something such as that?
If you mean exact matches, and by that you mean that
xxxp.name == 77:72:6f:6e:67:0d:72:69:67:68:74
would be interpreted as comparing "xxxp.name" with the text string
"77:72:6f:6e:67:0d:72:69:67:68:74" rather than as comparing it with the
string "wrong\rright", then
1) That raises the question of whether that might, in fact, be the
*right* interpretation - should
xxxp.name == 77:72:6f:6e:67:0d:72:69:67:68:74
be interpreted as comparing with "wrong\rright" but
xxxp.name == 77:72:6f:6e:67:0d:72:69:67:68:7x
be interpreted as comparing with "77:72:6f:6e:67:0d:72:69:67:68:7x",
merely because the first right-hand side is a valid byte string and the
second right-hand side isn't?
2) That *is* the current interpretation - if you specify, in a
comparison with an FT_STRING, something that could be interpreted as a
colon-separated list of hex byte values, it's interpreted as a string,
not a colon-separated list of hex byte values, at least according to a
test I just did.
3) If we *want* to interpret it as such a colon-separated list, the
code associated with FT_STRING could do so.
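
To make point 1 concrete, here is a rough Python sketch of the two right-hand sides. It is not Ethereal's parser: `parse_byte_string` is a hypothetical helper, and it assumes a byte string is strictly two hex digits per byte, separated by colons.

```python
import re

# Assumed syntax: two hex digits per byte, colon-separated. The real
# FT_BYTES parser may accept other forms; this is only an illustration.
_BYTE_STRING_RE = re.compile(r"[0-9A-Fa-f]{2}(?::[0-9A-Fa-f]{2})*")


def parse_byte_string(text):
    """Return the decoded bytes, or None if text is not a valid byte string."""
    if _BYTE_STRING_RE.fullmatch(text) is None:
        return None
    return bytes(int(pair, 16) for pair in text.split(":"))


valid = parse_byte_string("77:72:6f:6e:67:0d:72:69:67:68:74")
invalid = parse_byte_string("77:72:6f:6e:67:0d:72:69:67:68:7x")

print(valid)    # b'wrong\rright'
print(invalid)  # None, because "7x" is not a hex byte
```

A single trailing character decides whether the value decodes at all, which is the asymmetry point 1 is asking about.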
Why not use implicit typing of a search pattern if possible, and
require explicit typing otherwise? Something enclosed with <"> or <'>
means string lookup, require byte patterns also to start with a colon,
etc.?
That'd be a change from the current behavior, again, in that, at
present, a leading colon makes no difference. If it *did* make such a
difference, you would then have
xxxp.name == 77:72:6f:6e:67:0d:72:69:67:68:74
comparing with "77:72:6f:6e:67:0d:72:69:67:68:74" and
xxxp.name == :77:72:6f:6e:67:0d:72:69:67:68:74
comparing with "wrong\rright"; it bothers me that one small character -
a character that isn't necessary when comparing an FT_BYTES field -
makes a significant difference.
If we want to *add* the ability to use byte strings when comparing with
FT_STRING values, I'd vote for having anything that looks like a byte
string interpreted as such by the FT_STRING value parser (without
requiring a leading colon) unless it's enclosed in quotes, so if you
want to compare with "77:72:6f:6e:67:0d:72:69:67:68:74" you'd have to
do
xxxp.name == "77:72:6f:6e:67:0d:72:69:67:68:74"
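
The rule I'm voting for could be sketched roughly as follows, in Python rather than the actual dfilter code. `ft_string_value` is a hypothetical name, the byte-string syntax (two hex digits per byte, colon-separated) is an assumption, and so is treating single quotes the same as double quotes.

```python
import re

# Assumed byte-string syntax; not taken from the Ethereal sources.
_BYTE_STRING_RE = re.compile(r"[0-9A-Fa-f]{2}(?::[0-9A-Fa-f]{2})*")


def ft_string_value(token):
    """Interpret the right-hand side of a comparison with an FT_STRING field."""
    # Quoted tokens are always taken literally, minus the quotes.
    if len(token) >= 2 and token[0] == token[-1] and token[0] in "\"'":
        return token[1:-1]
    # Unquoted tokens that look like a byte string are decoded as bytes.
    # latin-1 maps every byte value to exactly one character.
    if _BYTE_STRING_RE.fullmatch(token):
        raw = bytes(int(pair, 16) for pair in token.split(":"))
        return raw.decode("latin-1")
    # Anything else is an ordinary unquoted string.
    return token


print(repr(ft_string_value('77:72:6f:6e:67:0d:72:69:67:68:74')))
# 'wrong\rright'
print(repr(ft_string_value('"77:72:6f:6e:67:0d:72:69:67:68:74"')))
# '77:72:6f:6e:67:0d:72:69:67:68:74'
```

With this rule no leading colon is needed, and the quotes are the only thing that switches off the byte-string interpretation.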