Ethereal-dev: Re: [Ethereal-dev] Help Conversation
Note: This archive is from the project's previous web site, ethereal.com. This list is no longer active.
From: Guy Harris <gharris@xxxxxxxxxxxx>
Date: Sun, 18 Mar 2001 19:29:07 -0800
On Sun, Mar 18, 2001 at 05:50:22PM +0100, Guillaume Le Malet wrote:
> I've tried to understand how "conversation" in ethereal works,
> and I've got a few questions:

Note that the questions you ask don't actually pertain to the conversation mechanism; you don't have to set up a conversation in order to attach per-frame data to a frame, for example.

> -When we do: "frame_data = p_get_proto_data(pinfo->fd, proto_smtp)"
> does it point at the same time on all smtp packets that where
> captured?

I'm not certain what you mean here. The data a dissector would get from a "p_get_proto_data()" call is the data that the dissector attached to the frame in the first pass through the packets - "frame_data" would point to whatever object the dissector attached to the frame with a "p_add_proto_data()" call.

> -What does CRLF, EOM and Hash Table mean?

CRLF refers to a carriage-return character ('\r', octal 15, ASCII CR) followed by a line-feed character ('\n', octal 12, ASCII LF); there are a number of text-oriented protocols that use TCP - SMTP, as documented in RFC 821, is one such protocol, and some others are FTP, NNTP, HTTP, and POP - and those protocols tend to have the client sending commands to the server, where a command is a line of text with a CRLF at the end of the line, and have the server send replies back to the client, where the reply also contains one or more lines.

EOM, in the SMTP dissector, refers to the End Of the Message.
If an SMTP client tells an SMTP server to send a mail message, the sequence of commands and replies might look something like this - "client:" and "server:" indicate who's sending the command or reply, and everything after it is the contents of one line (ending with a CRLF):

    client: MAIL FROM:<gharris@xxxxxxxxxxxx>
    server: 250 OK
    client: RCPT TO:<ethereal-dev@xxxxxxxxxxxx>
    server: 250 OK
    client: DATA
    server: 354 Start mail input; end with <CRLF>.<CRLF>
    client: From: Guy Harris <gharris@xxxxxxxxxxxx>
    client: To: ethereal-dev@xxxxxxxxxxxx
    client: Subject: Rewriting Ethereal in Objective COBOL
    client: Date: Sun, 1 Apr 2001 12:00:00 -0700
    client: Message-ID: <20010401666666.A666@xxxxxxxxxxxxxxxxxxxxxx>
    client: X-Tagline: Poisson d'Avril
    client:
    client: Hey, I just had a really odd idea - what if we rewrote
    client: Ethereal in Objective COBOL?  (I'm not sure that's what it's
    client: called, but there really *is* work being done on an
    client: object-oriented version of COBOL; they really should
    client: have called it "ADD ONE TO COBOL".)
    client: .
    server: 250 OK

The "MAIL" command from the client to the server tells the server who's sending the mail; the server replies to that command with a reply "250 OK", where "250" is the reply code saying that the command was accepted, and the "OK" is, from the point of view of the protocol, just a comment for use by a person reading a transcript of the session.

The "RCPT" command tells the server to whom the mail should be sent.

The "DATA" command tells the server that the client is ready to supply the actual contents of the mail message; the "354 Start mail input..." reply says that the client should now send the contents of the mail message.

The client then sends the headers and the body of the mail message. The way the client tells the server that it's finished sending the body of the mail message is to send a line consisting only of a "." character, i.e. it sends a "." character followed by a CRLF.
That line is called an end-of-message, or an EOM, in the SMTP dissector.

A hash table is a data structure used to speed up the process of searching for data items that have "keys" associated with them. For example, you might have a table of records about people, and the "key" would be the person's name.

One way to find the record for a particular person would be to have a linked list of all those records, and to walk through the records, starting with the first one in the list, comparing the name in each record with the name of the person for whom you're looking.

If you have, say, 5 people for whom you have records, that wouldn't be too bad. If, however, you had 5,000 people, you would, on average (assuming a random distribution of names for which you're trying to find the record), have to look at 2,500 records each time you tried to find a record. That's somewhat expensive.

In a hash table, instead of having one list, you have several lists. You would take the name and "hash" it (from the online Merriam-Webster's Collegiate(R) Dictionary:

    1 Main Entry: hash
    Pronunciation: 'hash
    Function: transitive verb
    Etymology: French "hacher", from Old French "hachier", from "hache" battle-ax, of Germanic origin; akin to Old High German "hAppa" sickle; akin to Greek "koptein" to cut -- more at CAPON
    Date: 1590
    1 a : to chop (as meat and potatoes) into small pieces
    ...

), by taking the characters in the name and, for example, adding them together and taking the result modulo the number of lists you've set up. You would put the record for a given person into the appropriate list, and, when searching for that person's record, you'd use the "hashed" version of their name to choose which of those lists to search.

If you had 100 lists, for example, instead of having to search a list of 5,000 people, you would (assuming your "hash function" was roughly equally likely to pick any value between 0 and 99) only have to search a list of approximately 50 people.
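
As a rough illustration of the scheme just described (add the characters of the name together, take the result modulo the number of lists, then search only that list), here is a minimal sketch in plain C. It is not how GLib or Ethereal implements hash tables; the names "person", "person_table", "hash_name", "table_insert" and so on, and the "age" field, are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define NUM_LISTS 100            /* number of separate lists */

/* One record; records whose names hash to the same value share a list. */
struct person {
    const char *name;            /* the key */
    int age;                     /* example payload (hypothetical field) */
    struct person *next;
};

struct person_table {
    struct person *lists[NUM_LISTS];
};

/* Add the characters of the name together, modulo the number of lists. */
unsigned int hash_name(const char *name)
{
    unsigned int sum = 0;

    assert(name != NULL);
    while (*name != '\0')
        sum += (unsigned char)*name++;
    return sum % NUM_LISTS;
}

/* Start with every list empty. */
void table_init(struct person_table *t)
{
    size_t i;

    for (i = 0; i < NUM_LISTS; i++)
        t->lists[i] = NULL;
}

/* Put a record at the front of the list chosen by its hashed name.
   The caller must keep the name string alive.
   Returns 0 on success, -1 if memory could not be allocated. */
int table_insert(struct person_table *t, const char *name, int age)
{
    unsigned int i = hash_name(name);
    struct person *p = malloc(sizeof *p);

    if (p == NULL)
        return -1;
    p->name = name;
    p->age = age;
    p->next = t->lists[i];
    t->lists[i] = p;
    return 0;
}

/* Search only the one list that the hashed name selects. */
const struct person *table_lookup(const struct person_table *t,
                                  const char *name)
{
    const struct person *p;

    for (p = t->lists[hash_name(name)]; p != NULL; p = p->next)
        if (strcmp(p->name, name) == 0)
            return p;
    return NULL;
}

/* Release every record in every list. */
void table_free(struct person_table *t)
{
    size_t i;

    for (i = 0; i < NUM_LISTS; i++) {
        while (t->lists[i] != NULL) {
            struct person *next = t->lists[i]->next;
            free(t->lists[i]);
            t->lists[i] = next;
        }
    }
}
```

Summing characters is about the simplest hash function there is, and it has an obvious weakness: names that are anagrams of each other always land in the same list. They are still found correctly, because the lookup compares the full name within the list; a better hash function just spreads the records more evenly.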
See

    http://foldoc.doc.ic.ac.uk/foldoc/foldoc.cgi?query=hash+table

The GLib library includes routines to create hash tables, put entries into hash tables, remove entries from hash tables, and look up entries in hash tables; Ethereal uses them in a number of places.

One place it uses them is in the SMTP dissector. In this case, the "key" is the conversation to which the current frame belongs, and the data is state information indicating what the next traffic on the SMTP connection is expected to be. This is necessary because, although some protocols allow a packet to be analyzed without knowing what packets came before it on the network, SMTP doesn't. For example, a line containing

    MAIL FROM:<billg@xxxxxxxxxxxxx>

could either be a "MAIL" command *or* it could be part of a mail message explaining how SMTP works.

> -Is there a CRLF in any "over TCP proto" messages?

TCP provides, to the protocols that run over it, a sequenced byte stream; if the protocol running over TCP needs that byte stream to be considered as a sequence of messages, the protocol in question has to put into the byte stream data to specify when one message ends and the next one begins.

Many protocols that run over TCP are "line-oriented" protocols; I listed some above. In those protocols, a line often corresponds to a message, so most (possibly all) messages would end with a CRLF. (In the case of SMTP, the data supplied after a "DATA" command is an exception - it ends not with a CRLF, but with a "." on a line by itself, i.e. a "." preceded by and followed by a CRLF.)

However, not *all* of the protocols that run over TCP are line-oriented; some of them might, for example, begin a message with a count of the number of bytes in the message. You might, in such a message, have a byte with the value octal 15 followed by a byte with the value octal 12, but it wouldn't be a "CRLF" in the sense that "CRLF" is used in, say, SMTP - it wouldn't indicate the end of a one-line message in the protocol.
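
To make the point about per-conversation state concrete, here is a small sketch in plain C of the kind of state that lets a "MAIL FROM:" line be told apart from the same text inside a message body. It is an illustration only, not the code in the SMTP dissector: the names "smtp_client_state", "classify_client_line" and the enum values are invented, it looks only at lines sent by the client, it assumes the trailing CRLF has already been stripped from each line, and it ignores the server's "354" reply.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* What a line sent by the client turned out to be. */
enum smtp_line_kind {
    SMTP_COMMAND,        /* a command such as MAIL, RCPT or DATA */
    SMTP_MESSAGE_DATA,   /* a header or body line of the mail message */
    SMTP_EOM             /* the "." line that ends the message */
};

/* State remembered for one conversation (hypothetical structure). */
struct smtp_client_state {
    int in_data;         /* nonzero after DATA, until the EOM line */
};

/* Compare two ASCII strings, ignoring case; returns nonzero if equal. */
static int equals_ignore_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (toupper((unsigned char)*a) != toupper((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/* Classify one client line (CRLF already removed) and update the state
   so that the next line is interpreted correctly. */
enum smtp_line_kind classify_client_line(struct smtp_client_state *st,
                                         const char *line)
{
    assert(st != NULL && line != NULL);

    if (st->in_data) {
        if (strcmp(line, ".") == 0) {
            st->in_data = 0;         /* back to expecting commands */
            return SMTP_EOM;
        }
        return SMTP_MESSAGE_DATA;    /* even if it looks like a command */
    }

    if (equals_ignore_case(line, "DATA"))
        st->in_data = 1;             /* message lines follow */
    return SMTP_COMMAND;
}
```

Starting from a zeroed state, feeding the client lines of a session through "classify_client_line()" in order reports "MAIL FROM:<...>" as a command before "DATA" has been seen and as message data after it, until the "." line resets the state. A dissector would keep one such state object per conversation, which is where the hash table keyed by conversation comes in.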