Wireshark · Ethereal-dev: Re: Patch in reassemble.c (was Re: [Ethereal-dev] Crash in ethereal 0.10.8, somewhat reproducible)

Note: This archive is from the project's previous web site, ethereal.com. This list is no longer active.

From: Guy Harris <gharris@xxxxxxxxx>

Date: Wed, 13 Jul 2005 00:14:05 -0700

Peter Johansson wrote:

Pilz Rene wrote:
Peter Johansson wrote:
ronnie sahlberg wrote:
OK, I'll hold off for a while.

One idea: the bug may be in the BitTorrent dissector, since there
don't seem to be any major problems with how other dissectors use
reassembly.

Can you check that the BitTorrent dissector calls reassembly ONLY IF
the packet really is fragmented? That is, if the whole PDU is already
in the packet, so that no reassembly is needed, the reassembly
routines should not be called.


The reassembly routines are a bit shaky; when I wrote reassemble.c
they were mostly a quick hack, but I never got the time to rewrite the
code properly :-(


On Sat, 22 Jan 2005 15:41:22 +0100, Peter Johansson
<Peter.Johansson@xxxxxxxxxxxx> wrote:
ronnie sahlberg wrote:
Hi,

Do you have an example capture file you can send me, so I can look at
what is wrong?
I have two capture files, 20MB each (I captured to rotating files). If
you load first one and then the other, you can reproduce the crash.
I am debugging and think I have found a workaround for the crash;
unfortunately I have not yet found the actual cause of the problem.
The information I have gathered is:
1. The crash happens (sooner or later) only if the traffic Ethereal
decodes contains BitTorrent data, so it may be a problem with
packet-bittorrent.c (I have not had time to start looking at that
file); see also point 4.
2. The crash in reassemble.c, which happens in a call to memcpy, occurs
because the source (fd_i->data) does not appear to be allocated memory;
fd_i->flags shows, among other things, that the FD_NOT_MALLOCED bit is set.
3. The crash occurs only when Ethereal decodes data as it is captured,
i.e. only if you capture with "Update list of packets in realtime" turned on.
4. The crash occurs only if the BitTorrent dissector is in the list of
"enabled protocols", which is why I think the cause of the crash
actually lies in packet-bittorrent.c.
My workaround (which should probably be made permanent even if the bug
probably lies in packet-bittorrent.c) is to rewrite a few lines in
reassemble.c (see the attached image); this guarantees that you cannot
crash there by trying to handle data that is not allocated. As far as
I understand, the consequence should be that Ethereal might not decode
this particular PDU correctly. Perhaps some code should therefore be
added that can tell the user that this is the case (maybe an error
popup giving the frame number, etc.). Beyond that, the actual cause of
the crash must of course be dug out.

Furthermore, I think reassemble.c should be rewritten as the attached
image shows. That is, memcpy should never be called if src->flags or
destination->flags has the FD_NOT_MALLOCED bit set.
Unfortunately that does not fix the cause of the problem; it only makes
sure the crash does not happen. The reason I previously wrote that I
did not know what to do about the problem was that I could not then
reproduce it in a controlled way; now I can.
Are you still interested in the files, or would you rather wait?

Regards, Peter
I have not forgotten about this; I have just been a little busier than usual. I finally tracked down the problem, though.
My conclusion is this:
Ethereal crashes in reassemble.c because reassemble.c copies data to a memory area that is not yet allocated (fd_i->flags has the FD_NOT_MALLOCED bit set). I have a solution to this (ensuring that a crash does not occur) which I will post once I have done some cleaning up.

The crash in reassemble.c occurs only as the result of a faulty protocol dissector. In this case it is packet-bittorrent.c that is the reason for the crash.

The Bittorrent dissector registers only a heur-dissector (which should be fine). But once the heur test function detects that this TCP stream is in fact Bittorrent data, it creates a conversation, making sure that all future data in the same TCP stream is decoded by the Bittorrent protocol dissector without the use of the heur test function (this too should be fine, I guess).

The heur test function is capable of telling the calling framework whether the PDU was in fact decoded by this dissector or not, by returning TRUE or FALSE. In packet-bittorrent.c, the function that dissects Bittorrent data based on the fact that it belongs to a conversation does not have the opportunity of telling the calling framework that it in fact cannot decode the supplied PDU if necessary. And this is necessary in the rare event that packet-tcp has marked the current PDU with "[TCP previous segment lost]". In this case some data is missing, but the Bittorrent dissector still assumes that the first 4 bytes of the PDU denote the length of the PDU to be dissected. The problem now is that, since data was lost, the length is read at a random offset into the original Bittorrent packet.

My guess is that, when data has been lost, this could happen for any dissector that is called because the data belongs to a conversation created by that dissector.

Should packet-tcp perhaps not call the higher-level dissector when the PDU is marked with "[TCP previous segment lost]", or at least not perform the try_conversation_dissector(...) call? What would be the better way of ensuring that this does not happen with any of the already existing dissectors? Should perhaps the API at hand for dissectors be changed so that, when decoding PDU data, the dissector would be able to return TRUE or FALSE in a similar way to the heur functions? This way, any dissector would be able to tell the lower-layer dissector that although it should have handled this PDU, it could not.
What is your opinion?

/ Peter

_______________________________________________
Ethereal-dev mailing list
Ethereal-dev@ethereal.
http://www.ethereal.com/mailman/listinfo/ethereal-dev


Unfortunately:

1) this completely breaks the feature wherein a TCP dissector, handed a reassembled chunk of data, can indicate that it needs at least N more bytes of data to be added to the reassembled chunk, so that the reassembly has to be continued

and

2) I don't understand Swedish so I can't easily tell what the technical discussion above says.

That feature is used by several dissectors, such as the HTTP dissector which, when it's reassembling the entity headers of an HTTP request or response, keeps requesting more data until it sees the blank line at the end of the entity headers, at which point it says the reassembly is complete.
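
As a rough illustration of that "keep asking for more until the blank line shows up" pattern (this is not the HTTP dissector's actual code; the helper name `headers_end` and its return convention are made up for the example), the check on the data reassembled so far could look something like this:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of Ethereal: scan the bytes reassembled
 * so far for the blank line (CRLF CRLF) that ends the entity headers.
 * Returns the offset just past the blank line if it was found, or 0 if
 * the caller should ask the reassembly code for more data.
 */
static size_t
headers_end(const char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i + 4 <= len; i++) {
        if (memcmp(buf + i, "\r\n\r\n", 4) == 0)
            return i + 4;
    }
    return 0;   /* blank line not seen yet: keep reassembling */
}
```

A caller would invoke this each time the reassembled chunk grows, and would only declare the reassembly complete once the result is non-zero.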


That feature is now broken because:

The "If it was already defragmented and this new fragment goes beyond data limits" loop at the top of "fragment_add_work()" "undoes" the reassembly by pointing fragments that no longer have data, because it was copied to the reassembled chunk and then freed, at the target of the copy in the reassembled chunk, and sets the FD_NOT_MALLOCED flag on those fragments.

The "we have received an entire packet, defragment it and free all fragments" code in "fragment_add_work()" saves the pointer to the old reassembled chunk, allocates a new chunk to hold the reassembled data, and then falls into the "add all data fragments" loop.

The "add all data fragments" loop in "fragment_add_work()" then used to copy *all* the fragments, regardless of whether FD_NOT_MALLOCED was set on the fragment or not, into the newly-allocated chunk. It now copies only the chunks with FD_NOT_MALLOCED set, and reports the others as being "Reassemble error"s.

This means that, in the reassemblies after the first reassembly, some of the data in the reassembled chunk is whatever just happened to be there at the time of the allocation.

The old code *did* work correctly for some captures I have with HTTP traffic in them - FD_NOT_MALLOCED doesn't mean "fd_i->data isn't valid", it means "it's not the address of a mallocated chunk, it's an address *in* a mallocated chunk".
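
To make that distinction concrete, here is a simplified sketch. The `struct frag`, the `FRAG_NOT_MALLOCED` flag and its value, and both function names are stand-ins invented for this example; they are not the real definitions from reassemble.c, and the real code does the copy, free and re-pointing in separate steps rather than in one function.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define FRAG_NOT_MALLOCED 0x1   /* stand-in for FD_NOT_MALLOCED; value is made up */

struct frag {
    unsigned int   flags;
    size_t         offset;  /* where this fragment sits in the reassembled data */
    size_t         len;
    unsigned char *data;
};

/*
 * Copy a fragment into the reassembled chunk, free its private buffer,
 * and re-point it at its bytes inside the chunk. After this call
 * f->data is still readable, but it must never be passed to free().
 */
static void
fold_into_chunk(struct frag *f, unsigned char *chunk)
{
    assert(!(f->flags & FRAG_NOT_MALLOCED));
    memcpy(chunk + f->offset, f->data, f->len);
    free(f->data);
    f->data = chunk + f->offset;
    f->flags |= FRAG_NOT_MALLOCED;
}

/* Release a fragment's buffer only if the fragment owns it. */
static void
frag_release(struct frag *f)
{
    if (!(f->flags & FRAG_NOT_MALLOCED))
        free(f->data);
    f->data = NULL;
}
```

In this model the flag only answers "who owns the memory?", so a later reassembly can still read from `f->data` as long as the chunk it points into has not been freed.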


What are the details of the cases where the old code *didn't* work?

It might be that the correct fix is to, in the "we have received an entire packet..." code, set "fd_head->data" to "g_realloc(fd_head->data, max)", which means that the data that was already copied there during previous reassemblies will still be there. However, we *still* need to get rid of the printout of the "Reassemble error" message, because it's bogus to print that message every time we, for example, reassemble HTTP entity headers - which means we should really figure out why we're doing that in cases where it *is* an error, and figure out where to fix that.
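
A minimal sketch of the realloc idea, using the standard C `realloc()` as a stand-in for `g_realloc()` (which, like `realloc()`, keeps the existing contents when growing a block). The function name `extend_reassembly` and its parameters are hypothetical and do not come from reassemble.c:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Grow the reassembled chunk and append one more fragment's bytes.
 * realloc() preserves the first old_len bytes, so data copied in by an
 * earlier reassembly does not have to be copied again.
 * Returns the (possibly moved) chunk, or NULL on allocation failure, in
 * which case the old chunk is still valid and still owned by the caller.
 */
static unsigned char *
extend_reassembly(unsigned char *chunk, size_t old_len,
                  const unsigned char *extra, size_t extra_len)
{
    unsigned char *grown;

    assert(extra != NULL && extra_len > 0);
    grown = realloc(chunk, old_len + extra_len);
    if (grown == NULL)
        return NULL;
    memcpy(grown + old_len, extra, extra_len);
    return grown;
}
```

One thing such a change would have to watch for: realloc() may move the block, so any fragment whose data pointer points *into* the old chunk would need to be re-pointed at the new one.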

"tcp_dissect_pdus()" uses the "continue reassembly" feature - it first tries reassembling the fixed-length portion of the PDU, so that the "get the length" routine has enough of that portion to find out how large the packet is, and then tries reassembling the entire packet, so if the 4-byte header of a presumed BitTorrent packet is split across TCP segments, that code path would be used.

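The two-phase logic can be sketched roughly as follows. This is an illustration only, not the code in "tcp_dissect_pdus()": the names `more_bytes_needed`, `get_pdu_len_fn` and `example_get_len` are invented here, the length field is assumed to be a big-endian 32-bit count of the bytes that follow it, and overflow of the length is ignored by using 64-bit arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FIXED_HDR_LEN 4  /* 4-byte length field, as in the BitTorrent case */

/* Hypothetical "get the length" callback: total PDU size, header included. */
typedef uint64_t (*get_pdu_len_fn)(const unsigned char *hdr);

/* Example callback: big-endian 32-bit payload length plus the header. */
static uint64_t
example_get_len(const unsigned char *hdr)
{
    uint32_t payload = ((uint32_t)hdr[0] << 24) | ((uint32_t)hdr[1] << 16) |
                       ((uint32_t)hdr[2] << 8)  |  (uint32_t)hdr[3];
    return (uint64_t)payload + FIXED_HDR_LEN;
}

/*
 * Returns how many more bytes must be reassembled before the PDU can be
 * dissected, or 0 if the whole PDU is already available.
 */
static uint64_t
more_bytes_needed(const unsigned char *buf, uint64_t avail,
                  get_pdu_len_fn get_len)
{
    uint64_t pdu_len;

    assert(buf != NULL && get_len != NULL);

    /* Phase 1: the fixed-length header itself is still incomplete. */
    if (avail < FIXED_HDR_LEN)
        return FIXED_HDR_LEN - avail;

    /* Phase 2: the header is complete, so the length can be read. */
    pdu_len = get_len(buf);
    return pdu_len > avail ? pdu_len - avail : 0;
}
```

With a header split across segments, the first call only asks for the rest of the header; the length routine is not consulted until all four bytes are present.
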
One place where there's *definitely* a risk of problems is a presumed BitTorrent packet where the presumed length field is greater than 2^32-5, so that when 4 is added to it we overflow and get a value *less* than 4. However, going back to at least 0.10.8, if the get_pdu_len routine called by "tcp_dissect_pdus()" returns a value less than the length of the fixed-length portion of the PDU, that's assumed to be an overflow, so it just shows a "Malformed packet" error and quits.
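
The arithmetic behind that check can be shown in a few lines. This is a sketch of the idea only; `checked_pdu_len` and `HDR_LEN` are names made up for the example, not the code in packet-tcp.c:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HDR_LEN 4u  /* size of the fixed-length portion (the length field) */

/*
 * Add the header size to the 32-bit length field using unsigned 32-bit
 * arithmetic, which wraps modulo 2^32. A total smaller than the header
 * itself can only be the result of wraparound, so it is rejected.
 * Returns 1 and stores the total on success, 0 if the PDU is malformed.
 */
static int
checked_pdu_len(uint32_t length_field, uint32_t *total_out)
{
    uint32_t total;

    assert(total_out != NULL);
    total = length_field + HDR_LEN;
    if (total < HDR_LEN)
        return 0;   /* wrapped around: treat as a malformed packet */
    *total_out = total;
    return 1;
}
```

A length field of exactly 2^32-5 still fits (the total is 2^32-1); anything larger wraps to a value between 0 and 3 and is rejected.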
