[Search for users] [Overall Top Noters] [List of all Conferences] [Download this site]

Conference rusure::math

Title:Mathematics at DEC
Moderator:RUSURE::EDP
Created:Mon Feb 03 1986
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:2083
Total number of notes:14613

205.0. "Cryptography problem" by AURORA::HALLYB () Fri Jan 04 1985 13:53

Newsgroups: net.crypt
Path: decwrl!decvax!mcnc!unc!ulysses!mhuxr!ihnp4!ihuxi!trough
Subject: Crypt uniqueness
Posted: Wed Jan  2 19:25:43 1985

Suppose an encryption algorithm encodes an n byte string as another n byte
string, using an m byte key.  Further, suppose that the 2**(8*m) encodings are
distributed "uniformly" over the n byte output space (uniform as in "random
uniform", since the algorithm is evidently too complicated to analyze).  Now,
how likely is it that two keys yield the same encoded text?  The Birthday
Paradox (you only need about 24 people to have a 50% chance of shared
birthdays) applies here; I don't have the exact relation, but the 50%
probability point is roughly proportional to the square root of the size of the
space being sampled.  Thus, there should be 50% chance of duplicate keys if the
key length (m) is about half of the text length (n).  I'd be interested if
anyone can reproduce the math here accurately. 

				Chris Scussel
				AT&T Bell Labs
T.RTitleUserPersonal
Name
DateLines
205.1SPRITE::OSMANFri Jan 04 1985 16:324
As a side question:

	Is the VAX password encryption algorithm such that someone can
	demonstrate TWO passwords that both work ?
205.2TURTLE::GILBERTFri Jan 04 1985 20:1626
I'm confused about what probability was being requested, so I've decided to
rephrase the question.

Given 2^(8m) random numbers [one per key] uniformly distributed in the range of
1 thru 2^(8n) [the encoded text strings, as numbers], what is the probability
that some pair of random numbers are equal?

That is, what chance is there for the existance of two keys that encrypt a
given input string the same way?

Also, for what value of m is this probability roughly 1/2?

..
Note that the probability of a given pair of keys encrypting the given text the
same way is 1 in 2^(8n).

I believe the VMS password encryption algorithm can give the same encrypted
value for two different passwords (even if the algorithm uses the same 'salt'
in each).

Seeing these questions together makes me think that someone is looking for two
different passwords that both work.  If this is the case, please note that VMS
first compresses a password into an eight-byte string by adding bytes together,
in a wrap-around fashion, and then does massive crunching on these 64 bits.
It's fairly easy to find two passwords that produce the same wrap-around value;
methinks something like "HHHHHHHH" and "$$$$$$$$$$$$$$$$" will do it.
205.3ORPHAN::BRETTSun Jan 06 1985 12:025
Obviously the VAX p/w generator has this property, since there is 2^64 results
of encryption, and 38^31 p/w's (approx. 2^160).

/Bevin
205.4ADVAX::J_ROTHMon Jan 07 1985 02:2931
Let M = (2**8)**m and N = (2**8)**n.

The probability of M things selected from a set of N things all
being distinct is P = (N!/M!)/(N**M)... the standard 'birthday paradox'.

For N and M large Stirling's asymptotic approximation for N! can be used:

	N! = SQRT(2*PI*N)*(N/E)**N,

	LOG(N!) = .5*LOG(2*PI*N) + N*LOG(N) - N

Also, for small X,

	LOG(1-X) = -X - X**2/2 - X**3/3 - ...

where X will be equal to M/N.

Plugging in, dropping terms in LOG(1-M/N) beyond second order, and simplifying
brings

	LOG(P) = -(M**2)/(2*N)

so for a probability of .5,

	M = SQRT(-2*N*LN(.5)) = 1.177*SQRT(N)

... very close to the SQRT(N) estimate mentioned.  More terms could be
included in the series for LOG and N! to give a more accurate answer for
smaller N and M, or a ratio M/N not << 1, but this is the basic idea.

- Jim
205.5TAV02::NITSANMon Jan 07 1985 03:384
Ofcourse, if you use "block encryption" on long text, you can add some feed
back mechanism (e.g. xor previous output with next input or so). This will
prevent similar inputs from producing the same outputs, and is easy to
reconstruct.
205.6ORPHAN::BRETTMon Jan 07 1985 19:114
HHHHHHHH and $$$$$$$$$$$$$$$$ do indeed act as the same p/w on VMS.

/Bevin
205.7SPRITE::OSMANTue Jan 08 1985 15:0822
It turns out, it's very easy to find duplicate VMS passwords.  Your
password is stored encrypted into 64 bits, which is far less than necessary
to uniquely distinguish between all possiblities.  In particular, the
password is first subjected to the following compaction into 8 bytes.
Suppose the password is SWORDFISH_SCALES_FOR_SALE.  Passwords are expanded by
adding the ascii values into a quad_word, likewise:

	S W O R D F I S
	H _ S C A L E S
	_ F O R _ S A L
    +	E
	_______________
	a b c d e f g h

In this example, d is "R" + "C" + "R".  Hence with the above password,
you could, for instance, successfully log in by making the R two letters
less, namely P, and increase the C and R by one, making them D and S.

Therefore with password SWORDFISH_SCALES_FOR_SALE, you can just as "easily"
log in with "SWOPDFISH_SDALES_FOS_SALE".  Said another way, these two passwords
are equivalent, illustrating that there aren't as many passwords for a thief
to try as one might think !
205.8TURTLE::GILBERTTue Jan 08 1985 21:414
Even after the simple reduction of the password to a quadword and the "salt'
added in, I believe the hash function may still cause some different quadwords
to hash to the same value, as there's no apparent reason for the hash function
to just happen to provide a one-to-one mapping between quadwords.
205.9SPRITE::OSMANWed Jan 09 1985 15:2111
If you set your password to PATCHWORK, then you can alternatively type
WATCHWORD and it will still work !

If you set your password to GEOLOGIST, then you can alternatively type
NEOLOGISM and it will still work !

If you set your password to BLUNDERER, then you can alternatively type
PLUNDERED and it will still work !

If you set your password to DENOUNCER, then you can alternatively type
RENOUNCED and it will still work !
205.10AURORA::HALLYBWed Jan 09 1985 21:014
My goodness, Eric, you must have spent the whole morning pecking away!

					:-)
						John
205.11interesting matching passwords ?SIERRA::OSMANand silos to fill before I feep, and silos to fill before I feepFri Apr 11 1986 21:0716
    So far, the only duplicate VMS passwords we've seen are ones that
    come out the same modulo quadword.  Hence they'll necessarily
    come out the same when the quadword is then encoded.
    
    Can anyone find two strings that are DIFFERENT modulo quadword,
    but encode to the same thing ?  I've processed my 80000-word
    on-line dictionary, and no two of those words fit the bill.
    
    However, it may be that longer passwords will prove more fruitful.
    
    /Eric
    
    p.s.	Yeah, right John.  As a matter of fact, I just got
    		finished, so I didn't see your note until now.  Phew,
    		do my fingers *ache* !  :])
    
205.12CLT::GILBERTJuggler of NoterdomSat Apr 12 1986 03:3814
The hashing algorithm gives no guarantee that two different quadwords
will definitely give two different hashed values.  However, it does
spread hash them fairly well.

Suppose we started trying input quadwords in sequence; we get random
quadwords out.  Could someone familiar with the birthday problem
determine how many random numbers in the range 0 to 2^64-1 should
be chosen to have a 50% chance of a collision?

Offhand, I'd guess it's rather large, and that over 2^32 random quadwords
must be considered before there's a 50% chance of a collision.  Note
that it would only take about 32000 megabytes to store 2^32 quadwords.

					- Gilbert
205.13BEING::POSTPISCHILAlways mount a scratch monkey.Mon Apr 14 1986 20:5812
    Re .12:
    
    > Could someone familiar with the birthday problem determine how many
    > random numbers in the range 0 to 2^64-1 should be chosen to have a 50%
    > chance of a collision? 
    
    Solve for n:  gamma(2^64+1)/(gamma(2^64+1-n)*2^(64n)) = .5.  By taking
    the natural log of both sides, it should be fairly simple to solve the
    problem with numerical methods and an approximation for ln(gamma(x)).
     
    
    				-- edp
205.14BEING::POSTPISCHILAlways mount a scratch monkey.Tue Apr 15 1986 13:1749
    Re .12, .13:
    
    Unfortunately, the formula I gave involved 2^64+1-n which is difficult
    to solve numerically.  The most significant digit of solution for n is
    in the position of the tenth digit of 2^64, which makes the numbers
    too far apart to handle on most calculators.
    
    However, another way to write the formula gives a very simple solution.
    The probability of no matches after one guess is (1-0/2^64).  The
    probability of no matches after two guesses is (1-0/2^64)(1-1/2^64).
    After n guesses, the probability of no matches is
    
    	(1-0/2^64)(1-1/2^64)(1-2/2^64) . . . (1-(n-1)/2^64).
    
    We want this probability to equal one-half, so
    
    	i = n-1
    	product (1-i/2^64) = .5.
    	 i = 0                  
    
    Taking the natural logarithm of both sides yields
    
    	i = n-1
    	  sum    ln(1-i/2^64) = ln .5.
    	 i = 0                        
    
    For small values of x, ln 1+x is approximately x.  For the values
    we will be using, the approximation is good to at least ten places,
    according to my HP-41, so we can rewrite the equation as:
    
    	i = n-1
    	  sum   -i/2^64 = ln .5.
    	 i = 0
    
    The summation of integers from 0 to some number is well-known, so
    we have:
    
    	-((n-1)^2+(n-1))/(2*2^64) = ln .5.
    
    With a little algebra, this is equivalent to:
    
    	n^2 + n - 2^65 ln 2 = 0.
    
    (Note the change from ln .5 to ln 2, with a corresponding sign change.)
    The quadratic equation is simple to solve; it tells us n is
    5,056,937,541.
    
    
    				-- edp
205.15ENGINE::ROTHThu Apr 17 1986 02:256
    See the analysis in .4; it reaches the same conclusion.  Directly
    expanding the product as in .14 is a nice idea, for some strange reason
    it didn't occur to me at the time.  Either way rests on using the
    behaviour of log near 1 for these small probabilities.

    - Jim
205.16Used old problem to break in new CPUSQM::HALLYBAre all the good ones taken?Sat Jan 24 1987 19:2531
I thought I'd verify edp's results (.14) since I was a little concerned about
accumulated roundoffs, given the large number of trials required.  After 
6 days of CPU time (using H-floating arithmetic) on an 8700, I obtained 
the following numbers.  The data below answers the question "how many (N)
distinct guesses should I make in order to have probability (p) of guessing
a given hashed password, assuming perfectly flat hashing: 

  p		N
....	..............
.001	   192,124,822   
.01	   608,926,881   
.1	 1,971,577,271   
.2	 2,869,241,009   
.3	 3,627,531,229   
.4	 4,341,214,012   
.5	 5,056,937,540   	edp/J_ROTH estimated: 5,056,937,541
.6	 5,814,220,606   
.7	 6,664,739,783   
.8	 7,705,697,797   
.9	 9,216,853,900   
.95	10,512,992,586  
.99	13,034,599,788  
.999	15,964,059,241  

Probably an off-by-1 in the code.  With this many H-floating calculations
there was the potential for a really gross amount of accumulated roundoff
errors, too.  Interesting that the results of both the H-floating
multiplication and approximation of x ~= ln(1+x) came out so close, if not
exact. 

  John
205.17ENGINE::ROTHSun Jan 25 1987 13:0249
    I compared the other entries in John's table - close indeed!

    - Jim

	.001	   192,124,822 --    192124821.92
	.01	   608,926,881 --    608926881.23
	.1 	 1,971,577,271 --   1971577271.03
	.2 	 2,869,241,009 --   2869241008.63
	.3 	 3,627,531,229 --   3627531228.91
	.4 	 4,341,214,012 --   4341214011.75
	.5 	 5,056,937,540 --   5056937540.69
	.6 	 5,814,220,606 --   5814220606.06
	.7 	 6,664,739,783 --   6664739783.83
	.8 	 7,705,697,797 --   7705697797.50
	.9 	 9,216,853,900 --   9216853901.24
	.95	10,512,992,586 --  10512992586.66
	.99	13,034,599,788 --  13034599789.54
	.999	15,964,059,241 --  15964059242.89


#include stdio
#include math
main()
{
  double p[15] = {
    .001, .01, .1, .2, .3, .4, .5, .6, .7, .8, .9, .95, .99, .999 };
  char *jh[] = {
    ".001	   192,124,822",
    ".01	   608,926,881",
    ".1 	 1,971,577,271",
    ".2 	 2,869,241,009",
    ".3 	 3,627,531,229",
    ".4 	 4,341,214,012",
    ".5 	 5,056,937,540",
    ".6 	 5,814,220,606",
    ".7 	 6,664,739,783",
    ".8 	 7,705,697,797",
    ".9 	 9,216,853,900",
    ".95	10,512,992,586",
    ".99	13,034,599,788",
    ".999	15,964,059,241" };
  FILE *fp;
  int i;

  fp = fopen("crypto.dat", "w");
  for (i=0; i<14; i++)
    fprintf(fp, "%s -- %15.2f\n", jh[i], sqrt(-pow(2.0, 65.0)*log(1.0-p[i])));
  fclose(fp);
}
205.18for accuracy phreaksCLT::GILBERTeager like a childSun Jan 25 1987 14:1516
For additional accuracy, you could take (from note 205.14):

	n-1
	sum ln(1-i/2^64) = ln .5
	i=0                        

And since:

	ln(1+x) = x - x^2/2 + x^3/3 - x^4/4 + x^5/5 - x^6/6 + ...

You could use Eric's technique with the better approximation:

	ln(1+x) = x - x^2/2


Has anybody tried to find two different quadwords that hash to the same result?
205.19ENGINE::ROTHSun Jan 25 1987 17:077
    Re .-1, good point.

    The only reason I didn't use Eric's approx was because 2^-64 is already
    about .5E-19 and small compared to the digits of significance in JH's
    data.

    - Jim
205.20How to generate congruent passwordsDENTON::AMARTINAlan H. MartinFri Jan 30 1987 22:1211
Re .18:

>Has anybody tried to find two different quadwords that hash to the same result?

Are you forgetting that topic 114 of CLOSET::HACKERS (q.v.) tells how
to generate a congruent password from any other password of 16 characters
or less?  (Not to mention containing sufficient information to be able
to solve the problem for longer passwords, no doubt)?

After all, you participated in the discussion.
				/AHM
205.21CLT::GILBERTeager like a childSun Feb 01 1987 02:157
>Has anybody tried to find two different quadwords that hash to the same result?

>Are you forgetting that topic 114 of CLOSET::HACKERS (q.v.) tells how ...

Of course not.  I persuaded VMS to not use a CRC for the password hash
algorithm (as was used in V1 and V2 of VMS), and am quite familiar with
the the code.  The persuasion, by the way, dumped passwords at 9600 baud.
205.22you're confusing two different thingsVIDEO::OSMANand silos to fill before I feep, and silos to fill before I feepMon Feb 02 1987 14:2625
You're confusing the issue here.

The VMS password encryption scheme is done in TWO steps:

1)	Take user's clear-text and fold it into eight bytesm merely adding
	each successive character into next byte, and wrap after every
	eight.

2)	Use CRC to encode the resulting eight-byte value.

It's only step 1) in which I demonstrated that any short password produces
the same folded form as same password appended with UUUUUUUUUUUUUUUUVVVVVVVV.
(16 U's andd 8 V's)

As far as I know, no one has discovered any two DIFFERENT folded forms
that the CRC in step 2 maps to the same thing.

In fact, I vaguely recall reading that the CRC has the mathematical
property that no two folded forms will CRC to the same thing.

I verified (as mentioned either in here or in HACKERS, I forget which),
that from an 80000-word english dictionary, no two words CRC to the
same thing.

/Eric
205.23JON::MORONEYLegalize LibertyMon Feb 02 1987 18:5513
Lets reword the challenge:

Find 2 VMS passwords such that:

1) if you set your password to the first, you can sign on using the second;
   (that is, the encrypted form stored in the UAF file gives the same value)
2) "folding" the passwords (step 1 of .22) gives 2 different values.

This reduces the problem to the following:  Find 2 64 bit values, which when
run through the second part of the VMS password encryption algorithm (part 2 of
.22) give the same 64 bit value as output.

-Mike
205.24"insufficient data"SSDEVO::LARYMon Feb 02 1987 19:2812
Interesting to note that in all this discussion the actual hashing function
has never been given.

I assume that it isn't CRC, since the CRC instruction only produces 32 bits.
I also assume that it isn't in the nature of a closely held secret, since
depending on that kind of stuff for security never works.

My best guess would be that it is the DES 8-byte-munge algorithm, in which
case the compute and memory requirements of the problem are more than I can
deal with.

What is it?
205.25Not CRC (anymore), Not DES (too slow)TLE::BRETTTue Feb 03 1987 01:164
    Its a Purdy polynomial - basically a polynomial of order 2^64 with
    only a few coefficients that are large - computed modulo 2^64.
    
    /Bevin
205.26Background on password encryptionENGINE::ROTHTue Feb 03 1987 12:455
    See the August 1974 issue of Communications of the ACM, particularly

	A High Security Login Procedure, by G. B. Purdy

    - Jim
205.27TLE::BRETTWed Feb 04 1987 16:074
    Thanks Jim, I didn't know such an article existed and VMS's p/w
    hash has always intrigued me!
    
    /Bevin
205.28I've got a copy of the VMS codeVIDEO::OSMANand silos to fill before I feep, and silos to fill before I feepMon Feb 09 1987 13:295
I've got the VMS source code for the password encoding.  It's only a page
or two at the most.  I'll be glad to post it or forward it if folks
think that's "kosher".

/Eric
205.29CLT::GILBERTeager like a childMon Feb 09 1987 17:204
    I'd say it's "kosher" to distribute the sources -- the security
    does not rely on a 'secret algorithm'.  However, I'll ask that you
    not post the source code here (it's a bit largish) -- the hashing
    of the quadword can be described in just a couple equations.