[Search for users] [Overall Top Noters] [List of all Conferences] [Download this site]

Conference rusure::math

Title:Mathematics at DEC
Moderator:RUSURE::EDP
Created:Mon Feb 03 1986
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:2083
Total number of notes:14613

1411.0. "Minimal instruction sequences for register rearrangement" by EAGLE1::BEST (R D Best, sys arch, I/O) Wed Apr 03 1991 17:21

Minimal instruction sequences for register rearrangement on VAX
---------------------------------------------------------------

A common programming trick is to use XOR to exchange the contents of
two registers without another register:

;	Exchange R0 and R1 without auxiliary storage
;
;	Inputs:		R0 = x
;			R1 = y
;
;	Outputs:	R0 = y
;			R1 = x

XCHG:	XORL3	R0, R1, R1	; R1 <- x XOR y
	XORL3	R1, R0, R0	; R0 <- (x XOR y) XOR y = x
	XORL3	R0, R1, R1	; R1 <- x XOR (x XOR y) = y

[ In practice, VAX has two operand instructions that could accomplish
this specific operation using 3 fewer operand specifiers, but the 3 operand
forms are used to set the stage for the following problems, and we're only
counting instructions. ]

(1) Using only XORL3, construct an instruction sequence for permuting 3
registers in the following manner:

Initially:	R0 = x		R1 = y		R2 = z
Finally:	R0 = z		R1 = x		R3 = y

(2) Find the minimum length of such a sequence, and PROVE that it's impossible
to find a shorter sequence (for many, solving (1) will do most of the work of
this proof).

[ I don't have answers to numbers beyond (3) yet; in fact, I don't know if
there are any.  I have a few ideas on how I would proceed. ]

(3) Can you find a sequence that is dramatically shorter (say 2 instructions
shorter than that of 2) by adding a SINGLE use of some instruction other than
XORL3 or XORL2 ? Again, no auxiliary storage is allowed (i.e. permutation
must occur in place, no scratch registers ).

(4) Now relax the condition in (3) on auxiliary storage by allowing ANY
number of scratch registers (but no non-I-stream memory references).
Can you do better than the obvious MOVL sequence using a single temp
register (checkmate in 4 :-)?  You must NOT use preloaded constants
in the scratch registers.

(5) Same problem as (4), but now you may assume constants arbitrarily preloaded
in scratch registers.
T.RTitleUserPersonal
Name
DateLines
1411.1The easy part...CIVAGE::LYNNLynn Yarbrough @WNP DTN 427-5663Wed Apr 03 1991 20:0021
>(1) Using only XORL3, construct an instruction sequence for permuting 3
>registers in the following manner:
>
>Initially:	R0 = x		R1 = y		R2 = z
>Finally:	R0 = z		R1 = x		R3 = y
>
>(2) Find the minimum length of such a sequence, and PROVE that it's impossible
>to find a shorter sequence (for many, solving (1) will do most of the work of
>this proof).
(Let + = XOR)
	1) r0+r1 -> r0	(x+y)
	2) r1+r2 -> r1  (y+z)
	3) r2+r0 -> r2  (x+y+z)
	4) r2+r1 -> r2  (x)
	5) r0+r2 -> r0  (y)
	6) r1+r0 -> r1  (z)
(I think I did that backwards, but the other way is symmetric.)

Proof: it takes one operation per register to transfer its contents to
another register, and one operation to clear the other contents out of the
latter register, so two operations * three registers = 6. 
1411.2proving the easy is hardCLT::TRACE::GILBERTOwnership ObligatesThu Apr 04 1991 11:4313
Sorry, Lynn, but I don't buy that proof:

>Proof: it takes one operation per register to transfer its contents to
>another register, and one operation to clear the other contents out of the
 latter register, so 2 operations per register * 3 registers = 6 operations.
(I tweaked the last line to make the physicists happy)

You say it takes one read per register and one write per register.  Right.
But an operation (or instruction) can do *two* reads and a write.  So the
`proof' only succeeds in showing that at least 3 operations are needed,
since fewer than 3 instructions cannot provide the 3 writes.  The proof
*doesn't* show that 3 instructions are insufficient -- 3 instructions will
provide 6 reads and 3 writes (of the 3 and 3 that the proof says are needed).
1411.3CLT::TRACE::GILBERTOwnership ObligatesFri Apr 05 1991 19:306
Each register is in one of 2^3 states: 0,x,y,z,x+y,y+z,z+x,x+y+z.
Since there are three registers, there are (2^3)^3 = 2^9 = 512 states.
It's straight-forward (with a computer) to find a shortest path thru
these state transitions.  This search shows there are two states `farthest'
from the initial state -- the two `rotations' of the input registers,
and that these are reached in 6 transitions.
1411.4XOR2 vs XOR3CADSYS::COOPERTopher CooperFri Apr 05 1991 20:5714
    NOTE: Three register XOR instructions do not help in solving the
    first problem.

    Let the three registers all be of width w.  The amount of information
    in the three registers is thus 3w bits at the outset.  Since we need
    3w bits of information to remember the initial contents, and we have
    ("between" instructions) 3w bits of information, information can never
    be redundantly stored.  We must never throw away information.  But an
    instruction which places the XOR of two of the registers in a third,
    erases the information previously held in the third register.  Once
    such an operation has been done, therefore, the original contents
    cannot be recreated.

				Topher