Ethereal-users: Re: [Ethereal-users] ASCII Dump?

Note: This archive is from the project's previous web site, ethereal.com. This list is no longer active.

From: Guy Harris <gharris@xxxxxxxxx>
Date: Fri, 18 Nov 2005 11:51:41 -0800
Luke wrote:

What I'm looking for:
Just the TCP payload of a Kerberos packet, after ASN.1 decoding.

All Kerberos packets are ASN.1 encoded, to my knowledge. I'd rather not require users that will be using my tool to process these packets to have to download another tool that I've written to do the ASN.1 decoding of the packet, especially since Ethereal takes the ASN.1, interprets it correctly, and displays the Kerberos data, byte by byte, correctly, without any of the ASN.1 headers or ASN.1 information. Ethereal will be required anyway, and since it contains the functionality I need, I'm hoping to use it to do this particular type of packet capture. What I want to do is just have the Kerberos packet, without TCP/IP (and lower level) headers, after the ASN.1 has been decoded, dumped to a file.

Perhaps I'm misunderstanding how the ASN.1 encoding/decoding works. I was under the impression that ASN.1 added information to a data stream to support a correct transfer, and then that extra data was removed on the receiving side, leaving you with the data stream that was originally sent from the sender.

The "A" in "ASN.1" stands for "Abstract" - "ASN.1" is "Abstract Syntax Notation One". ITU-T Recommendation X.680, "Information technology – Abstract Syntax Notation One (ASN.1): Specification of basic notation", says:

	This Recommendation | International Standard presents a standard notation for the definition of data types and values. A data type (or type for short) is a category of information (for example, numeric, textual, still image or video information). A data value (or value for short) is an instance of such a type. This Recommendation | International Standard defines several basic types and their corresponding values, and rules for combining them into more complex types and values.

	In some protocol architectures, each message is specified as the binary value of a sequence of octets. However, standards-writers need to define quite complex data types to carry their messages, without concern for their binary representation. In order to specify these data types, they require a notation that does not necessarily determine the representation of each value. ASN.1 is such a notation. This notation is supplemented by the specification of one or more algorithms called encoding rules that determine the value of the octets that carry the application semantics (called the transfer syntax). ITU-T Rec. X.690 | ISO/IEC 8825-1, ITU-T Rec. X.691 | ISO/IEC 8825-2 and ITU-T Rec. X.693 | ISO/IEC 8825-4 specify three families of standardized encoding rules, called Basic Encoding Rules (BER), Packed Encoding Rules (PER), and XML Encoding Rules (XER).

The "Abstract" part refers to the fact that ASN.1 does *NOT* specify how data is represented "on the wire", it just specifies what types of objects are sent "on the wire". The encoding rules specify how particular types of objects are sent over the wire.

Kerberos uses the Basic Encoding Rules (or maybe one of the subsets thereof). Those encoding rules add tags to items specifying their types (so that the data is, if you will, "self-describing"), and also add length information (as, for example, numbers have a variable-length encoding). A data structure (which would be a SEQUENCE type) would also have a tag of its own and a length giving the size of the structure, followed by the encoded values of its members.
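To make the tag/length/value layout concrete, here is a minimal Python sketch that walks a small hand-built BER item. It is only an illustration: the helper names `read_tlv` and `read_all` are made up for this example, the sample bytes are not taken from a real Kerberos packet, and the sketch assumes single-byte tags and definite lengths.

```python
def read_tlv(data, offset=0):
    """Read one BER tag-length-value item starting at offset.

    Returns (tag_byte, value_bytes, next_offset).
    Simplifying assumptions: single-byte tag, definite length.
    """
    tag = data[offset]
    if tag & 0x1F == 0x1F:
        raise ValueError("multi-byte tags are not handled in this sketch")
    first = data[offset + 1]
    offset += 2
    if first < 0x80:
        length = first                    # short form: the byte is the length
    elif first == 0x80:
        raise ValueError("indefinite lengths are not handled in this sketch")
    else:
        count = first & 0x7F              # long form: next `count` bytes hold the length
        length = int.from_bytes(data[offset:offset + count], "big")
        offset += count
    value = data[offset:offset + length]
    if len(value) != length:
        raise ValueError("truncated item")
    return tag, value, offset + length


def read_all(data):
    """Read consecutive TLV items until the data is used up."""
    items = []
    offset = 0
    while offset < len(data):
        tag, value, offset = read_tlv(data, offset)
        items.append((tag, value))
    return items


# Hand-built example: SEQUENCE { INTEGER 5, GeneralString "krb" }
encoded = bytes([0x30, 0x08, 0x02, 0x01, 0x05, 0x1B, 0x03]) + b"krb"

outer_tag, body, _ = read_tlv(encoded)
members = read_all(body)
print(hex(outer_tag), members)
```

The outer item carries the SEQUENCE tag (0x30) and the length of everything inside it; the members follow, each with its own tag and length.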

BER-encoding doesn't just take the data structures handed to the encoder and add in tag/length information; it might also *transform* the data, e.g. a 4-byte signed integral value might be encoded as a tag, a length, and 1 to 4 bytes of data (and a 4-byte *unsigned* integral value might require *5* bytes of data, so that the leading bit, which is the sign bit, is zero). It's not as if stripping out the BER-encoding tags and lengths will give you back the exact data handed to the encoder.
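A small Python sketch of that transformation, for illustration only: the function names are hypothetical, and the encoder assumes the content fits in fewer than 128 bytes so that a short-form length suffices.

```python
def ber_integer_content(n):
    """Return the shortest two's-complement byte string that holds n."""
    length = 1
    while True:
        try:
            return n.to_bytes(length, "big", signed=True)
        except OverflowError:
            length += 1               # does not fit yet, try one more byte


def ber_integer(n):
    """Encode n as a BER INTEGER: tag 0x02, short-form length, content."""
    content = ber_integer_content(n)
    return bytes([0x02, len(content)]) + content


small = ber_integer(5)                         # one content byte, not four
big_unsigned = ber_integer_content(0xFFFFFFFF) # needs a leading zero byte
negative = ber_integer_content(-1)             # a single 0xFF byte
print(small.hex(), big_unsigned.hex(), negative.hex())
```

A sender holding the value 5 in a 4-byte field has `00 00 00 05` in memory, but the content bytes on the wire are just `05`; removing the tag and length does not bring the original four bytes back.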

It's also not a "data stream", it's structured data, and the tag information might be necessary to know what the structure was - for example, a data structure could have optional members, and the tags would be necessary to determine whether the members are present or not.
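The point about optional members can be sketched in Python as follows. The structure here is hypothetical (it is not a real Kerberos type), `strip_tlv` is a made-up helper, and the sketch assumes single-byte tags and short-form lengths.

```python
def strip_tlv(data):
    """Recursively drop tags and lengths, keeping only primitive content bytes.

    Simplifying assumptions: single-byte tags, short-form lengths (< 128).
    """
    out = b""
    offset = 0
    while offset < len(data):
        tag = data[offset]
        length = data[offset + 1]
        value = data[offset + 2:offset + 2 + length]
        # Bit 0x20 marks a constructed item, whose content is more TLV items.
        out += strip_tlv(value) if tag & 0x20 else value
        offset += 2 + length
    return out


# Hypothetical type:
#   SEQUENCE { a [0] INTEGER OPTIONAL, b [1] INTEGER OPTIONAL }
only_a = bytes([0x30, 0x05, 0xA0, 0x03, 0x02, 0x01, 0x07])  # a = 7, b absent
only_b = bytes([0x30, 0x05, 0xA1, 0x03, 0x02, 0x01, 0x07])  # a absent, b = 7

print(strip_tlv(only_a), strip_tlv(only_b))
```

The two encodings differ only in the tag (0xA0 versus 0xA1), so once the tags are gone both collapse to the same single byte and there is no way to tell which member was present.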

So how I'm hoping tethereal will fit into this idea is that I'm hoping tethereal can take the TCP or UDP packet, depending on what Kerberos decides to use, take only the payload, do the ASN.1 decoding, and dump the result to a file. The reason I was even mentioning ASCII before is that usually when I see dumps of this type, I see them in pcap format,

Well, if it's in pcap format, it's just a raw packet, with no decoding of any sort done.

whereas what I'm actually looking for is just a straight dump of bytes to a file. When that happens, some of those bytes should display as ASCII characters (for instance, kerberos packets will contain "krb"). Other characters will not display as nicely.

Note that I do not want anything other than the ASN.1 decoded (if I'm understanding this correctly) Kerberos packet - no Ethernet headers, no ARP address, and no dissection information (i.e., this field is a flag, this field is a principal, etc.).

It would, in principle, be possible to just strip out the BER tagging information; the resulting data would be binary data, with only text string data being meaningful when interpreted as ASCII (and even that only if it's ASCII text, not text in some other character encoding, such as UTF-8, where only the ASCII characters in the string would be meaningful as ASCII).

That's not something Tethereal does, and not something I'd expect it ever to do; the dissector model isn't oriented towards stripping out arbitrary bits of data. And, at the level of Tethereal, it *is* arbitrary - the BER dissector routines, and their callers, know about the BER tags, but there's no higher-level notion of them. Dissectors are only intended to build protocol trees that turn into the detailed packet view (and that are used to evaluate display filters), and, at the level of the top-level Tethereal code, one protocol tree field is just like another.

What is it you're *really* trying to do here? I.e., what sort of processing is your tool doing to Kerberos packets? There might be a better form of input than a Kerberos packet with the BER tags and length bytes stripped out (especially given that the tags are *required* in order to correctly interpret the content bytes!). For example, getting the packets in PDML format, and parsing that (and ignoring in *that* parser the fields you're not interested in) might work better.
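As a rough sketch of the PDML approach, the following Python reads a PDML document and keeps only the Kerberos fields a tool cares about. The sample document is hand-written and heavily trimmed, the field names in it (such as `kerberos.realm`) are assumptions for illustration rather than names checked against real output, and `kerberos_fields` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hand-written, trimmed-down sample; real PDML has more attributes per field.
SAMPLE_PDML = """<pdml>
  <packet>
    <proto name="ip">
      <field name="ip.src" show="10.0.0.1"/>
    </proto>
    <proto name="kerberos">
      <field name="kerberos.msg.type" show="10"/>
      <field name="kerberos.realm" show="EXAMPLE.COM"/>
    </proto>
  </packet>
</pdml>"""


def kerberos_fields(pdml_text, wanted):
    """Return one dict per packet, holding only the wanted Kerberos fields."""
    root = ET.fromstring(pdml_text)
    results = []
    for packet in root.iter("packet"):
        fields = {}
        for proto in packet.iter("proto"):
            if proto.get("name") != "kerberos":
                continue                  # skip IP, TCP/UDP and everything else
            for field in proto.iter("field"):
                name = field.get("name")
                if name in wanted:
                    fields[name] = field.get("show")
        results.append(fields)
    return results


found = kerberos_fields(SAMPLE_PDML, {"kerberos.realm"})
print(found)
```

The parser ignores every protocol and field it was not asked for, so the lower-level headers and the uninteresting Kerberos fields never reach the tool, while the structure of the message is still available when needed.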