
Conference rusure::math

Title:Mathematics at DEC
Moderator:RUSURE::EDP
Created:Mon Feb 03 1986
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:2083
Total number of notes:14613

1024.0. "Checksum size for error correction/detection?" by DELNI::GOLDSTEIN (Room 101, Ministry of Love) Wed Feb 08 1989 13:55

    This is related to 539, 454 and 713, but not quite the same.
    
    Given a message of length N (which in my instant problem is 512
    bits), how can one determine the minimum Hamming distance of a cyclic
    code used for error detection and/or correction?  What procedures
    can be computed rapidly enough for a high-speed application?
    
    Here's the specific case I'm addressing.  A proposed network operates
    at burst speeds of over 100 Mbps, but all data is sent in 64-byte
    cells.  Individual user data streams are probably slower than that,
    but could approach FDDI rates.  
    
    The physical medium will provide low error rates, typically in the
    range of 10^-9 or better.  Errors come in two flavors, bit and long
    burst.  Long burst errors last for milliseconds, and simply drop
    many successive cells, so only the first cell in a burst will have
    a valid address and scrambled data.  
    
    Bit errors affect one or at most (typically) two adjacent bits.
    However, scrambling may cause a given error to be repeated once at
    exactly d bits' distance (where d may be on the order of 40). 
    Hence a single noise event will flip bit x plus bit x+40, which
    may be in that cell or the next.
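
    To make the scrambling effect concrete, here is a rough sketch.  I'm
    assuming a self-synchronous descrambler with a single feedback tap at
    distance 40; the tap distance, the all-zero data, and the one-bit-per-
    byte representation are all just illustrative assumptions.

#include <stdio.h>

#define D 40    /* assumed scrambler tap distance (illustrative) */
#define N 512   /* bits per cell */

/* Self-synchronous descrambler: out[i] = in[i] ^ in[i-D].  A single
 * corrupted line bit at position x therefore corrupts descrambled bits
 * x and x+D -- one noise event becomes two errors D bits apart. */
static void descramble(const unsigned char *in, unsigned char *out)
{
    for (int i = 0; i < N; i++)
        out[i] = in[i] ^ (i >= D ? in[i - D] : 0);
}

int main(void)
{
    unsigned char clean[N] = {0}, noisy[N] = {0}, a[N], b[N];

    noisy[100] ^= 1;                        /* one noise event on the line */
    descramble(clean, a);
    descramble(noisy, b);
    for (int i = 0; i < N; i++)
        if (a[i] != b[i])
            printf("descrambled bit %d differs\n", i);  /* 100, then 140 */
    return 0;
}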
    
    Will a 16-bit CRC be adequate to a) correct for 1-bit errors without
    scrambling; b) correct for 1-bit errors with scrambling (causing
    two 1-bit errors at a large distance); and c) detect all errors up to,
    say, 8 bits?  Or put differently, how big a CRC (or other checksum,
    say BCH code) is needed?  Thanx.
           fred
1024.1. by SSDEVO::LARY (One more thin gypsy thief) Thu Feb 09 1989 00:20

I don't believe a 16-bit CRC will hack it. In a 512-bit message you can
have 512 distinct single-bit errors and about 131,000 double-bit errors
(512 choose 2 = 130,816 -- random pairs, not just two consecutive bits);
since a 16-bit CRC has only 65,536 distinct syndromes, it can't distinguish
all possible double-bit errors. Even if you only used it to correct
single-bit errors (which you could) and detect double-bit errors, it would
still be a code that is very easy to "fool" - since 512 of the 65,536
syndromes decode as single-bit corrections, about one out of 128 big error
bursts would be silently "corrected" into garbage. No good.
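
To be fair, single-bit correction with a 16-bit CRC is easy enough; here's
a rough sketch. The CCITT generator 0x1021 is just an example polynomial
(any 16-bit generator whose single-bit syndromes are distinct over 528 bits
works), and the all-zero codeword loses no generality because CRCs are
linear.

#include <stdio.h>
#include <stdint.h>

#define NBITS   512                /* data bits per cell                  */
#define POLY    0x1021u            /* example generator (CRC-16-CCITT);   */
                                   /* its period 32767 > 528 guarantees   */
                                   /* distinct single-bit syndromes       */
#define CODELEN (NBITS + 16)       /* data bits + check bits              */

/* CRC over a codeword held one bit per byte; returns the syndrome.
 * State recurrence: crc = crc*x + bit*x^16 (mod generator).             */
static uint16_t crc16(const uint8_t *bits, int n)
{
    uint16_t crc = 0;
    for (int i = 0; i < n; i++) {
        crc ^= (uint16_t)((bits[i] & 1) << 15);
        crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ POLY)
                             : (uint16_t)(crc << 1);
    }
    return crc;
}

int main(void)
{
    static int pos_of[65536];      /* syndrome -> error position + 1      */
    uint8_t cw[CODELEN] = {0};     /* all-zero codeword has syndrome 0    */

    /* Table the syndrome of every possible single-bit error. */
    for (int i = 0; i < CODELEN; i++) {
        cw[i] = 1;
        pos_of[crc16(cw, CODELEN)] = i + 1;
        cw[i] = 0;
    }

    /* Corrupt one bit, then locate and correct it from the syndrome. */
    cw[137] ^= 1;
    uint16_t syn = crc16(cw, CODELEN);
    if (syn != 0 && pos_of[syn] != 0)
        cw[pos_of[syn] - 1] ^= 1;               /* flips bit 137 back */
    printf("residual syndrome: %u\n", (unsigned)crc16(cw, CODELEN)); /* 0 */
    return 0;
}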

I can think of three codes offhand that might get you what you want:

1)	A 122-bit Fire Code. This is an extension of the code used on older
	(RM and RP) series disk drives. It will correct up to a 41-bit
	burst (which includes two 1-bit bursts 40 bits apart), and is
	computed by a simple Linear Feedback Shift Register (a rough
	sketch appears at the end of this reply). I believe the
	correction hardware is fairly simple too, but not that fast - you have
	to shift the residue with feedback up to 512 times (the length of your
	message) looking for a certain pattern to appear. The length of the
	code makes it very hard to fool, too.

	This is, of course, fairly expensive, since it's about 24% of the
	cell size...

2)	Just put a 32-bit CRC on each cell, and then put check cells out
	after every N cells. This is similar to a scheme we use on low-end
	tapes (TK50, TK70). For example, every 16 cells you would put out
	two check cells; one would contain the XOR of the even-numbered
	cells (0,2,4,6,8,10,12,14) and one would contain the XOR of the
	odd-numbered cells (1,3,5,7,9,11,13,15). The CRCs tell you which
	cell in the group is bad, so you can XOR the surviving cells of
	its parity set together with the check cell and reconstruct the
	bad cell (a rough sketch follows this list). This adds about 6%
	overhead to each cell (the CRC) and another 12% in redundant
	cells.

3)	A pair of interleaved 32-bit Fire codes on each cell. The first
	code would cover bits 0-39, 80-119, 160-199, ..., 480-511 of
	the cell, the second would cover bits 40-79, 120-159, ..., 440-479
	of the cell. If you get a pair of errors 40 bits apart, they
	will tend to split themselves among the two groups and be treated
	as a pair of single-bit errors. A 32-bit Fire code can correct up
	to an 11-bit burst, so it's got plenty of correction coverage; you
	could even reduce it to a 17-bit code (an N-bit Fire code corrects
	a burst of up to (N+1)/3 bits, but you don't want to go too small
	because of the chance of it being fooled by a large burst). This
	looks like the cheapest solution: only about 7% overhead (34 check
	bits on a 512-bit cell) with two 17-bit codes.
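
	To make scheme 2 concrete, here's a rough sketch of the
	reconstruction step. The 16-cell group is from the example above;
	the data is made up, and I'm assuming the per-cell CRC has
	already fingered which cell is bad.

#include <stdio.h>
#include <string.h>
#include <stdint.h>

#define CELL  64   /* bytes per cell */
#define GROUP 16   /* data cells per check group (from the example above) */

/* Rebuild a known-bad cell from the surviving cells of its parity set
 * (even- or odd-numbered) XORed with that set's check cell. */
static void rebuild(uint8_t cells[GROUP][CELL],
                    const uint8_t check_even[CELL],
                    const uint8_t check_odd[CELL],
                    int bad)
{
    const uint8_t *check = (bad % 2 == 0) ? check_even : check_odd;

    memcpy(cells[bad], check, CELL);
    for (int i = bad % 2; i < GROUP; i += 2)    /* same parity set */
        if (i != bad)
            for (int b = 0; b < CELL; b++)
                cells[bad][b] ^= cells[i][b];
}

int main(void)
{
    uint8_t cells[GROUP][CELL], even[CELL] = {0}, odd[CELL] = {0};
    uint8_t saved[CELL];

    /* Fill the group with arbitrary data and build the two check cells. */
    for (int i = 0; i < GROUP; i++)
        for (int b = 0; b < CELL; b++) {
            cells[i][b] = (uint8_t)(i * 37 + b);
            ((i % 2 == 0) ? even : odd)[b] ^= cells[i][b];
        }

    memcpy(saved, cells[5], CELL);
    memset(cells[5], 0, CELL);          /* cell 5 arrives scrambled */
    rebuild(cells, even, odd, 5);
    printf("recovered: %s\n",
           memcmp(saved, cells[5], CELL) == 0 ? "yes" : "no");  /* yes */
    return 0;
}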

	Fire codes are discussed pretty thoroughly in Peterson & Weldon,
	"Error-Correcting Codes" - you might want to check them out there,
	I'm doing this from memory and I might be screwed up...
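
	For what it's worth, here's a rough encode/check sketch of that
	LFSR (schemes 1 and 3). The generator g(x) = (x^21 + 1)(x^11 +
	x^2 + 1) is my own illustrative pick - x^11 + x^2 + 1 is
	primitive, so this 32-bit code corrects bursts of up to
	min(11, (21+1)/2) = 11 bits - it is not the polynomial from any
	particular controller.

#include <stdio.h>
#include <stdint.h>

#define R     32     /* check bits = degree of g(x) */
#define NDATA 512    /* data bits per cell */

/* g(x) = (x^21 + 1)(x^11 + x^2 + 1)
 *      = x^32 + x^23 + x^21 + x^11 + x^2 + 1; low 32 coefficients: */
static const uint32_t GLOW =
    (1u << 23) | (1u << 21) | (1u << 11) | (1u << 2) | 1u;

/* One clock of the LFSR: state = state*x + bit*x^32 (mod g). */
static uint32_t lfsr_clock(uint32_t reg, int bit)
{
    int fb = (int)((reg >> (R - 1)) & 1u) ^ bit;  /* bit leaving the top */

    reg <<= 1;
    if (fb)
        reg ^= GLOW;                              /* feedback taps = g(x) */
    return reg;
}

int main(void)
{
    uint8_t data[NDATA];
    for (int i = 0; i < NDATA; i++)
        data[i] = (uint8_t)((i * 7) & 1);         /* arbitrary data bits */

    /* Encode: check bits are the remainder of data(x) * x^32 mod g(x). */
    uint32_t check = 0;
    for (int i = 0; i < NDATA; i++)
        check = lfsr_clock(check, data[i]);

    /* Receive: run data then check bits through the same LFSR.  A zero
     * residue means no error detected; a nonzero residue would be
     * shifted further (up to the message length) until the burst is
     * trapped in the low bits - that's the slow part I mentioned. */
    uint32_t syn = 0;
    for (int i = 0; i < NDATA; i++)
        syn = lfsr_clock(syn, data[i]);
    for (int i = R - 1; i >= 0; i--)
        syn = lfsr_clock(syn, (int)((check >> i) & 1u));
    printf("syndrome: %u\n", (unsigned)syn);      /* prints 0 */
    return 0;
}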

							Richie

1024.2. "Detect but don't correct" by POOL::HALLYB (The smart money was on Goliath) Thu Feb 09 1989 15:20

    Given the nature of this product it would seem that the "correct"
    approach is to orient resources towards effective error detection.
    
    The ability to do forward correction is not as important as it is with
    storage media, given that retransmission is pretty simple.
    
      John