Guy Harris wrote:
On Oct 30, 2003, at 7:32 AM, Jason House wrote:
I have thought about SQL dumps in the past... I think that a good
starting point might be to try to consider how to dump a particular
protocol to a database (via one or more linked tables).
By "dump a protocol" do you mean "dump every field for that protocol"?
I suspect people might want to selectively dump particular fields.
I did mean "dump every field for that protocol" ... the harder problem?
I suspect that if the problem is restricted to dumping individual
fields, some of the generality will be lost. I have seen tools that are
based solely on a database representation of packets (but unfortunately
have not seen under the hood).
They might also want some row of the table to contain fields from more
than one protocol; perhaps it's *fields* that should be dumped, not
*protocols*, at least in some cases. Dumping all the fields might also
be useful.
I agree that with a filter such as "ip && udp", dumping ip and udp (or
a handful of fields from each) would benefit from putting everything in
a single row of a table. If the scope is expanded to all protocols
in a capture file, then the number of columns/rows will exceed the
limits of programs like Excel... not to mention that it would be a huge
file filled mostly with NULLs, since not every protocol in a capture is
contained in every packet.
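
To make the single-row case concrete, here is a rough sketch using Python's sqlite3 module. The packet values, the table name "packets" and the underscore column names are made up for illustration; only the field names (ip.src, udp.srcport, etc.) are real filter fields. It is not a proposal for the actual implementation, just a picture of where the NULLs come from.

```python
import sqlite3

# Hypothetical sample data: each packet is a dict of field name -> value.
packets = [
    {"ip.src": "10.0.0.1", "ip.dst": "10.0.0.2", "udp.srcport": 53, "udp.dstport": 1025},
    {"ip.src": "10.0.0.2", "ip.dst": "10.0.0.1"},  # no UDP header in this packet
]

# Only the selected fields become columns; dots are replaced because they
# are not valid in unquoted SQL column names (naming scheme is an assumption).
fields = ["ip.src", "ip.dst", "udp.srcport", "udp.dstport"]
columns = [f.replace(".", "_") for f in fields]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE packets (frame INTEGER PRIMARY KEY, %s)" % ", ".join(columns))

for frame, pkt in enumerate(packets, start=1):
    # A field that is absent from the packet becomes None, which sqlite3 stores as NULL.
    row = [frame] + [pkt.get(f) for f in fields]
    placeholders = ", ".join("?" * len(row))
    conn.execute("INSERT INTO packets VALUES (%s)" % placeholders, row)

rows = conn.execute("SELECT * FROM packets ORDER BY frame").fetchall()
null_count = sum(value is None for row in rows for value in row)
```

With only four columns the second packet already carries two NULLs; with one column per registered field of every protocol, almost every cell would be NULL.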
As is true with everything, there is no easy way to mass process all
protocols... The generation of usable SQL tables requires
customization by protocol.
Hopefully "customization" doesn't involve changing any dissectors. I
think it'd be ideal if SQL dumping could be done *without* protocol
changes, so that dissector writers don't have to anticipate what people
might want to put into databases.
Let me answer with questions ;)
What % of the protocols that Ethereal dissects...
...place every registered field in every dissected packet? (e.g. udp)
...place at most one copy of a field in every dissected packet?
(udp would qualify if it weren't for udp.port)
...contain variable length lists of fields? (e.g. ospf)
...have many fields that are sometimes used? (e.g. ospf's LSAs)
I suspect that if the SQL dump is restricted to handle _fields_
analogous to the second question, and dump them into a single table,
then the implementation is a piece of cake and can be dissector
independent.
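
As a rough way of answering the second question from data rather than from the dissector source, something like the following Python sketch could count how often each field shows up per packet. The packet lists and the helper name max_occurrences are invented for the example; how the (field, value) pairs would actually be pulled out of the protocol tree is left open.

```python
from collections import Counter

# Hypothetical sample data: each packet is a flat list of (field, value) pairs.
# udp.port is listed twice per packet, once for the source and once for the
# destination port.
packets = [
    [("udp.srcport", 53), ("udp.dstport", 1025),
     ("udp.port", 53), ("udp.port", 1025), ("udp.length", 40)],
    [("udp.srcport", 1025), ("udp.dstport", 53),
     ("udp.port", 1025), ("udp.port", 53), ("udp.length", 36)],
]

def max_occurrences(packets):
    """For each field, return the largest count seen in any single packet."""
    worst = Counter()
    for pkt in packets:
        counts = Counter(name for name, _ in pkt)
        for name, n in counts.items():
            worst[name] = max(worst[name], n)
    return worst

worst = max_occurrences(packets)
# Fields seen at most once per packet fit into one column of a single table.
single_valued = sorted(f for f, n in worst.items() if n <= 1)
# Fields seen more than once per packet need some other representation.
repeated = sorted(f for f, n in worst.items() if n > 1)
```

Here single_valued ends up holding udp.dstport, udp.length and udp.srcport, while udp.port lands in repeated, which is the exception noted above.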
Of course, most databases are not a single table, and that makes a big
difference. If the implementation should cover all fields of all
protocols, then I suspect that there are certain protocols (such as
OSPF) and odd scenarios (such as the same protocol occurring more than
once in the tree) where any non-invasive implementation will fail. I
am not ruling out that there is a better way to encode information in a
dissector such that all SQL information can be obtained automatically...
I am just saying that the current approach does not provide a lot of
critical information.
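
For what it's worth, here is a sketch of what I mean by linked tables for something like OSPF, again in Python with sqlite3. The table and column names (ospf_packet, ospf_lsa, ls_type, adv_router) and the sample values are all hypothetical, and the sketch simply assumes that the grouping of fields into individual LSAs is already known, which is exactly the information I doubt can be obtained automatically today.

```python
import sqlite3

# Hypothetical sample data: OSPF packets carrying a variable-length list of LSAs.
packets = [
    {"frame": 1, "ospf_router": "1.1.1.1",
     "lsas": [{"ls_type": 1, "adv_router": "1.1.1.1"},
              {"ls_type": 2, "adv_router": "2.2.2.2"}]},
    {"frame": 2, "ospf_router": "2.2.2.2", "lsas": []},
]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ospf_packet (frame INTEGER PRIMARY KEY, ospf_router TEXT);
    CREATE TABLE ospf_lsa (
        frame INTEGER REFERENCES ospf_packet(frame),  -- link back to the packet
        idx INTEGER,                                   -- position within the packet
        ls_type INTEGER,
        adv_router TEXT,
        PRIMARY KEY (frame, idx)
    );
""")

for pkt in packets:
    conn.execute("INSERT INTO ospf_packet VALUES (?, ?)",
                 (pkt["frame"], pkt["ospf_router"]))
    for idx, lsa in enumerate(pkt["lsas"]):
        conn.execute("INSERT INTO ospf_lsa VALUES (?, ?, ?, ?)",
                     (pkt["frame"], idx, lsa["ls_type"], lsa["adv_router"]))

# One row per LSA, joined back to its packet; no NULL padding is needed.
lsa_rows = conn.execute("""
    SELECT p.frame, p.ospf_router, l.ls_type, l.adv_router
    FROM ospf_packet AS p JOIN ospf_lsa AS l ON l.frame = p.frame
    ORDER BY p.frame, l.idx
""").fetchall()
```

The child table grows by one row per LSA, so a packet with twenty LSAs and a packet with none both fit without changing the schema.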
I am now thinking about SQL stuff in general and I have a question. Is
all this stuff really about dumping data into a database or is it really
about being able to use SQL statements for filtering and summarizing data?
Maybe I am also thinking of too grand an implementation. The filter
expressions, after all, have related limitations. Try, for
instance, to write a filter where all the fields filtered on are within the
same header (such as a single OSPF LSA, or the rare cases where a TCP
session is carried inside another TCP session).