
Conference noted::decnis

Title: DEC Network Integration Server (DECNIS)
Notice: Please read note 1 to use this conference effectively
Moderator: MARVIN::WELCH
Created: Wed Sep 18 1991
Last Modified: Thu Jun 05 1997
Last Successful Update: Fri Jun 06 1997
Number of topics: 3660
Total number of notes: 15082

3642.0. "Routing reports checksum error on IP packets" by RULLE::KLASSON (Sven-Olof Klasson @GOO) Fri May 16 1997 15:29

Hi,

A customer sees the following event messages reported by DECNIS routers.

%%%%%%%%%%%  OPCOM   5-APR-1997 06:36:05.18  %%%%%%%%%%%
Message from user SYSTEM on GRIPEN
Event: IP Packet Discard from: Node DAGAB:.mal.MALNIS Routing,
        at: 1997-04-05-06:30:31.796+02:00Iinf
        Receiving Entity=Routing Circuit HUV-4-0,
        IP Header='4500002E861A00007F06FFFFC20E9F37C20E925B'H,
        IP Discard Reason=Incorrect Checksum

%%%%%%%%%%%  OPCOM   8-APR-1997 00:23:50.64  %%%%%%%%%%%
Message from user SYSTEM on GRIPEN
Event: IP Packet Discard from: Node DAGAB:.huv.HUVNIS Routing,
        at: 1997-04-08-00:18:29.655+02:00Iinf
        Receiving Entity=Routing Circuit SKE-8-0,
        IP Header='45000034E4EC00001D06FFFFC20E9583C20E9F37'H,
        IP Discard Reason=Incorrect Checksum

%%%%%%%%%%%  OPCOM  19-APR-1997 10:41:32.86  %%%%%%%%%%%
Message from user SYSTEM on GRIPEN
Event: IP Packet Discard from: Node DAGAB:.jor.JORNIS Routing,
        at: 1997-04-19-10:40:10.156+02:00Iinf
        Receiving Entity=Routing Circuit HUV-4-0,
        IP Header='45000029881A00007F06FFFFC20E9F3FC20E9058'H,
        IP Discard Reason=Incorrect Checksum

The events are seen quite seldom (approx. once/week). They are reported
by several DECNIS routers, all of them running v3.1-6.
One thing they have in common is that the packets involved have passed
through one particular DECNIS.

These messages look strange to me. A checksum error is detected by routing.
I expected that packets with checksum errors would have been taken care of
by lower layers, and that packets with checksum errors would never reach
routing.

Is this a known problem?
Could this be caused by a hardware error in a DECNIS where a packet becomes
corrupted while being stored in internal memory?

Sven-Olof Klasson, CSC Sweden

3642.1. "Possible linecard software bug" by MARVIN::MCCLURE (Tony McClure, DECnis Engineering RE02 FD9 7830-3564) Mon May 19 1997 09:38 (161 lines)

Hi Sven-Olof,

	If you pipe the packets through the DTF program, you can see
	what appears to be the problem:


        IP Header='4500002E861A00007F06FFFFC20E9F37C20E925B'H,
        IP Header='45000034E4EC00001D06FFFFC20E9583C20E9F37'H,
        IP Header='45000029881A00007F06FFFFC20E9F3FC20E9058'H,


----------------------------------
IP Packet: IP
-------------
Type of service               0x00
Precedence                    Routine
Total length                  46
Packet identifier             0x861A
Fragment offset               0x0000
Time to live                  127
*** BAD CHECKSUM ***          Packet=0xFFFF, Correct=0x0000
Source Address                194.14.159.55
Destination Address           194.14.146.91
Protocol                      TCP

IP protocol: TCP

----------------------------------
IP Packet: IP
-------------
Type of service               0x00
Precedence                    Routine
Total length                  52
Packet identifier             0xE4EC
Fragment offset               0x0000
Time to live                  29
*** BAD CHECKSUM ***          Packet=0xFFFF, Correct=0x0000
Source Address                194.14.149.131
Destination Address           194.14.159.55
Protocol                      TCP

IP protocol: TCP


----------------------------------
IP Packet: IP
-------------
Type of service               0x00
Precedence                    Routine
Total length                  41
Packet identifier             0x881A
Fragment offset               0x0000
Time to live                  127
*** BAD CHECKSUM ***          Packet=0xFFFF, Correct=0x0000
Source Address                194.14.159.63
Destination Address           194.14.144.88
Protocol                      TCP

	So we can see that all three of these packets should have a checksum
	of 0x0000, but in all cases the checksum is 0xFFFF.
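
	The same result can be cross-checked by hand. What follows is a
	minimal Python sketch, not the DTF program itself; the names
	ones_complement_sum and decode_ipv4_header are made up for this
	illustration. It assumes a 20-byte header without options (all
	three headers start with 0x45) and recomputes the header checksum
	the standard way: the one's-complement sum of the header words with
	the checksum field zeroed, then complemented.

```python
import struct


def ones_complement_sum(data: bytes) -> int:
    """Return the 16-bit one's-complement sum of big-endian 16-bit words."""
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def decode_ipv4_header(hex_header: str) -> dict:
    """Decode a 20-byte IPv4 header (no options) and recompute its checksum."""
    raw = bytes.fromhex(hex_header)[:20]
    (_ver_ihl, tos, length, ident, frag, ttl, proto,
     cksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw)
    # Recompute with the checksum field (bytes 10-11) set to zero.
    zeroed = raw[:10] + b"\x00\x00" + raw[12:]
    correct = ~ones_complement_sum(zeroed) & 0xFFFF
    return {
        "tos": tos,
        "total_length": length,
        "identifier": ident,
        "fragment": frag,
        "ttl": ttl,
        "protocol": proto,
        "checksum": cksum,
        "correct_checksum": correct,
        "source": ".".join(str(b) for b in src),
        "destination": ".".join(str(b) for b in dst),
    }


# The three headers quoted in the OPCOM events.
HEADERS = [
    "4500002E861A00007F06FFFFC20E9F37C20E925B",
    "45000034E4EC00001D06FFFFC20E9583C20E9F37",
    "45000029881A00007F06FFFFC20E9F3FC20E9058",
]

for h in HEADERS:
    f = decode_ipv4_header(h)
    print(f"{f['source']} -> {f['destination']}  "
          f"packet=0x{f['checksum']:04X} correct=0x{f['correct_checksum']:04X}")
```

	For all three headers the recomputed value is 0x0000 while the
	packet carries 0xFFFF, which agrees with the DTF output above.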

	You say that the one thing in common is that all these packets
	had passed through one of the DECnis's. If this is true, which
	device was it? (W622, W618, etc.)

	I think that this must be a bug in the linecard software. I will
	do some experiments here and see if I can reproduce the problem.

>>I expected that packets with checksum errors should have been taken care of
>>by lower layers, and that packet with checksum errors should never reach
>>routing.

	This is an IP header checksum, not a level 2 datalink checksum.
	Note that TCP has its own checksum as well.

>> Could this be caused by a hardware error in a DECNIS where a packet becomes
>> corrupted while beeing stored in internal memory?

	No, I don't think so.

	I will let you know what I discover.

Cheers Tony.

3642.2. "Bug in MPC software" by MARVIN::MCCLURE (Tony McClure, DECnis Engineering RE02 FD9 7830-3564) Mon May 19 1997 14:50 (21 lines)

Hi Sven-Olof,

	I have reproduced the problem in the lab here.

	The problem is only present when the linecards hand the packets to the
	MPC for forwarding (for example, when the ARP cache entry has timed out).

	The MPC code doesn't accept 0xFFFF as a valid checksum for these
	packets. Please can you submit a CLD on this problem.
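
	For background on why 0xFFFF is a legitimate value here: in 16-bit
	one's-complement arithmetic, 0x0000 and 0xFFFF are the two
	representations of zero. A receiver that sums the whole header,
	checksum field included, and expects 0xFFFF accepts these packets.
	A receiver that recomputes the checksum and compares it bit for bit
	with the stored field rejects them. The Python sketch below shows
	both styles. It is not the MPC code, which is not shown in this
	note; the function names are hypothetical, and the strict-compare
	variant is only an assumption about how such a rejection could
	arise.

```python
def ones_complement_sum(data: bytes) -> int:
    """Return the 16-bit one's-complement sum of big-endian 16-bit words."""
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def verify_by_sum(header: bytes) -> bool:
    """Sum every word, checksum included; a good header sums to 0xFFFF."""
    return ones_complement_sum(header) == 0xFFFF


def verify_by_compare(header: bytes) -> bool:
    """Assumed strict style: recompute and require an exact bit match."""
    stored = (header[10] << 8) | header[11]
    zeroed = header[:10] + b"\x00\x00" + header[12:]
    computed = ~ones_complement_sum(zeroed) & 0xFFFF
    return stored == computed


# First header from the reported events (checksum field is 0xFFFF).
hdr = bytes.fromhex("4500002E861A00007F06FFFFC20E9F37C20E925B")
print(verify_by_sum(hdr))      # True
print(verify_by_compare(hdr))  # False
```

	The sum-based check treats 0x0000 and 0xFFFF alike, so it does not
	need a special case for either value.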
	

Cheers Tony.

3642.3. "IPMT CFS.51358/SOO100969" by RULLE::KLASSON (Sven-Olof Klasson @GOO) Thu May 22 1997 11:55 (18 lines)

Hi Tony,

I have submitted an IPMT on this. The IPMT case number is CFS.51358/SOO100969.

I have one more question. I assume these packets are lost in the DECnis. What
happens when the packet is retransmitted?
The IP header should be identical in the retransmitted packet. Could the
retransmitted packet be lost in the same way? Or is the routing information in
the linecards updated, so that the linecards are able to handle this by
themselves without involving the MPC?

The customer only sees one of these OPCOM messages at a time. The next message
of this type may come a week later. I think the retransmitted packets are not
lost. If they were, there would be several OPCOM messages at intervals of a
few seconds.

Thanks,
Sven-Olof

3642.4. "" by MARVIN::MCCLURE (Tony McClure, DECnis Engineering RE02 FD9 7830-3564) Fri May 23 1997 09:35 (31 lines)

Hi Sven-Olof,

>I assume these packets are lost in the DECnis. 
	
	Yes

>What happens when the packet is retransmitted? The IP header should be identical 
>in the retransmitted packet. Could the retransmitted packet be lost in the same way? 

	Yes, this is correct.


>Or is routing information in the linecards updated, so the linecards are able to 
>handle this by themself without involving the MPC.
>The customer only see one of these OPCOM messages at a time. Next message of 
>this type may come a week later. I think the retransmitted packet are not lost.
>If that would be the case, there would be several OPCOM messages with a few 
>seconds interval.


	My guess is that another packet arrives for the same destination,
	and this causes the MPC to refresh the ARP cache, so the
	retransmitted packet no longer has to go through the MPC.

Cheers Tony