Ethereal-dev: Re: [Ethereal-dev] Proposed change to tethereal hex dump format
Note: This archive is from the project's previous web site, ethereal.com. This list is no longer active.
From: Ashok Narayanan <ashokn@xxxxxxxxx>
Date: Wed, 2 May 2001 17:40:56 -0400
> >    0  0010 7b2c 78c0 0010 7b2c 785d 0800 4500   ..{,x...{,x]..E.
> >   10  0074 248f 0000 ff2e 7ebb 0a01 020f 0a01   .t$.....~.......
> >   20  0201 1002 e3e8 ff00 0060 000c 0101 e600   .........`......
> >   30  0001 1100 000a 000c 0301 0a01 020f 0000   ................
> >
> >
> > 0000  00 10 7b 2c 78 c3 00 10 7b 2c 78 d5 08 00 45 00   ..{,x...{,x...E.
> > 0010  00 74 46 53 00 00 ff 2e 5a e9 0a 01 03 10 0a 01   .tFS....Z.......
> > 0020  03 0e 10 02 e4 67 ff 00 00 60 00 0c 01 01 e6 00   .....g...`......
> > 0030  00 01 11 00 00 0a 00 0c 03 01 0a 01 03 10 00 00   ................
>
> This is more than 79 characters per line (82 if I calculate
> correctly). Please don't make the lines longer than that they
> still fit in a "standard" 80 characters wide window.

This has already been addressed. The dump is only 72 characters wide.

> > My reasons are:
> >
> > 1) It is a more standard hexdump format; we use it internally
> > in Ethereal (GUI) as well.
> >
> > 2) This format is easier to deal with during parsing as well.
>
> I fail to see how the second format should be any easier to
> parse than the first one. Unless you consider endianness...

The problem lies in telling the offset apart from the bytes themselves. In the proposed (standard?) format, you can differentiate between offset and bytes simply by looking at the length of the hex string: two characters is a byte, more than two is an offset. In the earlier format, you quickly run into a situation where the offset and the bytes are indistinguishable, except for the fact that the offset is at the start of the line. But what if you don't have any offsets at all, just bytestrings?

Also, it seems strange that the GUI Ethereal displays hexdumps in one format, and the text dump (or text printout) of the same hexdump appears in another format. A case could be made to unify these two formats at a minimum.

> > It's a very small change to the code; I've tried it out. If
> > this proposed change is made, then text2pcap will be able to
> > read in a trace dumped by tethereal using -V -x, and be able
> > to build a capture file out of the packets (minus the
> > timestamps), a feature which I think is pretty cool.
> >
> > Thoughts?
>
> Preferably you should be able to parse both formats. There is
> no reason to limit yourself to just one format when reading
> in the file.

True. I am trying to make this as flexible as possible. The tradeoff with flexibility is how much context you place on a value depending on its position in a line versus its format (two digits, more than two digits, etc.). I've tried to place less context on its position in the line; this allows for better processing of strange formats. The cost is that today it only works for individual bytes, not pairs of bytes.

For example, text2pcap is able to extract the packet from this email without editing (you don't need to remove the '> '). I've even tried stuff like prefixing eight '> ' forward marks, then sending the text through a word-wrapping email editor; text2pcap handles that as well.

> Actually, you should be able to parse a number
> followed by any number of two-digit hexnumbers (with or
> without separating whitespace).

Yeah, but you want slightly stronger rules in order to a) discard the text at the end of the bytestring, even if it contains hex digits, and b) actually use the offset for counting, which means you need to differentiate the offset from the bytes, which brings us to the above point.

In point of fact, my parser does almost exactly what you mentioned. I recognize a line as optional prefix text, an offset, one or more bytes, and optional suffix text. Prefix and suffix text can include bytes which are ignored. The offset is used for counting as well as to indicate the start of a bytestring (or a new packet, if the offset is 0). In addition, the code will be capable of not using offsets altogether (not yet, though).
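To make the line rule above concrete, here is a minimal sketch in Python. It is not the text2pcap source; the names LINE_RE, parse_line and parse_packets are made up for illustration. It assumes an offset is a token of more than two hex digits, a byte is a whitespace-delimited token of exactly two hex digits, and that "using the offset for counting" means trimming or rejecting data that does not line up with the next offset.

```python
import re

# Hypothetical pattern (not taken from text2pcap): an offset of more than
# two hex digits, followed by one or more whitespace-separated tokens of
# exactly two hex digits. Anything before or after is prefix/suffix text.
LINE_RE = re.compile(
    r"(?<!\S)([0-9A-Fa-f]{3,})"           # offset token
    r"((?:[ \t]+[0-9A-Fa-f]{2}(?!\S))+)"  # one or more byte tokens
)


def parse_line(line):
    """Return (offset, data) for a hexdump line, or None if no dump is found."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    offset = int(m.group(1), 16)
    data = bytes.fromhex("".join(m.group(2).split()))
    return offset, data


def parse_packets(text):
    """Collect packets from free-form text; offset 0 starts a new packet."""
    packets = []
    current = None
    for line in text.splitlines():
        parsed = parse_line(line)
        if parsed is None:
            continue  # ordinary prose, blank line, or unsupported format
        offset, data = parsed
        if offset == 0:
            current = bytearray()
            packets.append(current)
        elif current is None or offset > len(current):
            continue  # offset does not continue the current packet; skip it
        # Assumed counting rule: drop anything collected past this offset,
        # e.g. suffix text that happened to look like hex bytes.
        del current[offset:]
        current.extend(data)
    return [bytes(p) for p in packets]


if __name__ == "__main__":
    quoted = (
        "> > 0000  00 10 7b 2c 78 c3 00 10 7b 2c 78 d5 08 00 45 00   ..{,x...{,x...E.\n"
        "> > 0010  00 74 46 53 00 00 ff 2e 5a e9 0a 01 03 10 0a 01   .tFS....Z.......\n"
    )
    for packet in parse_packets(quoted):
        print(len(packet), packet.hex())
```

With this rule the '> ' quoting is skipped as prefix text and the ASCII column is dropped as suffix text. A line in the older format, where bytes are printed in pairs, yields no match, which is the limitation noted above.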
-Ashok

--- Asok the Intern ----------------------------------------
Ashok Narayanan
IOS Network Protocols, Cisco Systems
250 Apollo Drive, Chelmsford, MA 01824
Ph: 978-244-8387. Fax: 978-244-8126 (Attn: Ashok Narayanan)
- References:
- RE: [Ethereal-dev] Proposed change to tethereal hex dump format
- From: Peter Kjellerstedt
- RE: [Ethereal-dev] Proposed change to tethereal hex dump format