[Search for users] [Overall Top Noters] [List of all Conferences] [Download this site]

Conference hydra::amiga_v1

Title:	AMIGA NOTES
Notice:	Join us in the NEW conference - HYDRA::AMIGA_V2
Moderator:	HYDRA::MOORE

Created:	Sat Apr 26 1986
Last Modified:	Wed Feb 05 1992
Last Successful Update:	Fri Jun 06 1997
Number of topics:	5378
Total number of notes:	38326

861.0. "bummed" by VIKING::PRAETORIUS (unineted nucular jewlery relator) Thu Oct 29 1987 13:21

     Saw this in a recent SIGARCH publication, hadta be the first to post it
(right, all you knew about this one already, I'm gonna post it anyway).

	; arg in D0, sgn(arg) comes out in D1 (-1,0,1 for D0 <,=,> 0)

	ADD.L	D0,D0	; Set carry if D0 is negative
	SUBX.L	D1,D1	; D1-D1-carry, i.e., spreads carry through D1 & carry
	NEGX.L	D0	; 0-D0-carry, sets carry if result is nonzero
	ADDX.L	D1,D1	; -1 + -1 + 1 if D0 was negative,
			; 0 + 0 + 1 if D0 was positive
			; 0 + 0 + 0 if D0 was zero

I don't have the rag with me, or I'd post the others.  These code sequences were
all found by a program at CMU called a superoptimizer, which does a (pruned)
tree search of random code segments (up to 11 instructions) to see if they
meet a certain goal.

     Cute, huh?
								RP

T.R	Title	User	Personal Name	Date	Lines
861.1	W O W	HYSTER::DEARBORN	Trouvez Mieux	`Thu Oct 29 1987 13:58`	19
	Amazing... What is it?
861.2	effective usage of the instruction set	16BITS::KRUGER		`Thu Oct 29 1987 19:46`	20
	Very nice -- perhaps we can post a collection of 68000 tricks here? I'll have to go back and look in my black bag.... Here's some minor stuff SUB D0,D0 is faster than CLR D0, by the way. (Aside from it's cute application to affecting the flags.) for (i=9; i>0 && x<>42; i--) ..... BRA 1$ 2$ ..... 1$ CMP x,#42 DBNE i,2$ Note that DBcc is far and away superior to the SOB of DECland.
861.3		STAR::BANKS	In Search of Mediocrity	`Thu Oct 29 1987 19:59`	9
	I'm told (but I forgot from where) that the fastest way to clear a D register is MOVEQ.L #0,D0 Sorta makes you wonder why they even bothered with CLR.L when everything else is faster. (Then again, I fondly remember a machine whose JUMPA was slower than other forms of unconditional branching, so I guess this is the same, right?)
861.4	more from SIGARCH	STAR::BANKS	In Search of Mediocrity	`Fri Oct 30 1987 00:21`	107
	Other bummages from the same source as .0 (he was reading my copy, after all (and don't they always start publishing all this interesting stuff right as I cancel my subscription?)): Not quite as dramatic as the SIGNUM in .0, but copied as close to verbatim as I can get: I.3 Max and Min This program finds the maximum of the unsigned numbers in D0 and D1 and returns the answer in D0. The comments on the right show what's in the various registers during execution and is similar to the boolean expression checker's method of analysis: (D0=X, D1=Y) \|Flag,Reg\|If D1>D0 \|If C1<=D0 SUB.L D1,D0 \|(C,D0) =\|(1, X-Y) \|(0, X-Y) SUBX.L D2,D2 \|(C,D2) =\|(1,11..11) \|(0,0...0) OR.L D2,D0 \|(C,D0) =\|(1,11..11) \|(0,X-Y) ADDX.L D1,D0 \|D0 = \|Y \|X (D0 = max(X, Y)) This program finds the minimum of the unsigned numbers in D0 and D1 and returns the answer in D0. (D0=X, D1=Y) \|Flag,Reg\|If D1>D0 \|If D1<=D0 SUB.L D1,D0 \|(C,D0) =\|(1, X-Y) \|(0, X-Y) SUBX.L D2,D2 \|D2 = \|111...111 \|000...000 AND.L D2,D0 \|D0 = \|X-Y \|0 ADD.L D1,D0 \|D0 = \|X \|Y (D0 = min(X, Y)) Simultaneous min and max. (D0=X, D1=Y) \|Flag,Reg\|If D1>D0 \|If D1<=D0 SUB.L D1,D0 \|(C,D0) =\|(1, X-Y) \|(0, X-Y) SUBX.L D2,D2 \|D2 = \|111...111 \|000...000 AND.L D0,D2 \|D2 = \|X-Y \|0 EOR.L D2,D0 \|D0 = \|0 \|X-Y ADD.L D1,D0 \|D0 = \|Y \|X ADD.L D2,D1 \|D1 = \|X \|Y (D0 = max(X, Y), D1 = min(X, Y)) I.5 Decimal to Binary This piece converts a 8 digit BCD number stored in D0, one digit to a nibble, to binary with the result also in D0. It is the longest sequence generated by superoptimizer, and was actually done in three stages. This idea that this was even possible came while generating sequences to multiply by 10. At first I had superoptimizer compute the 2 digit BCD to binary conversion function '((D0 & 0xF0) >> 4) * 10 + (D0 & 0x0F)'. This came out surprisingly short: (2 digit BCD number in D0) MOVE.B D0,D1 AND.B #$F0,D1 LSR.B #3,D1 SUB.B D1,D0 SUB.B D1,D0 SUB.B D1,D0 (binary equivalent in D0) What is actually being computed is: ans = D0 - 3 * ((D0 & 0xF0) / 8) Representing the contents of D0 as (H:L) where H is the upper nibble and L is the lower nibble, we get: D0 = 16 * H + L, D0 & 0xF0 = 16 * H ans = (16 * H + L) - 3 * (16 * H / 8) = 16 * H + L - 6 * H = 10 * H + L which is the 2 digit BCD to binary function. Encouraged by this result, superoptimizer was put to the task of computing first the 4 digit BCD to binary function and then the 8 digit BCD to binary function. Here is the 8 digit converter: (8 digit BCD number in D0) MOVE.L D0,D1 * AND.L #$F0F0F0F0,D1 * LSR.L #3,D1 * SUB.L D1,D0 * SUB.L D1,D0 * SUB.L D1,D0 * MOVE.L D0,D1 + AND.L #$FF00FF00,D1 + LSR.L #1,D1 + SUB.L D1,D0 + LSR.L #2,D1 + SUB.L D1,D0 + LSR.L #3,D1 + ADD.L D1,D0 + MOVE.L D0,D1 - SWAP D1 - MULU #$D8F0,D1 - SUB.L D1,D0 - (binary equivalent in D0) What is most amazing is the first section (marked by * alongside the program). It looks exactly like the 2 digit BCD to binary function. This section computes 4 simultaneous 2 digit BCD to binary functions on adjacent pairs of nibbles and deposits the answer back into the byte occupied by those nibbles. The second part (marked by +) computes two simultaneous 2 byte base 100 to binary conversion functions. Finally, the third part computes the function 'high word of D0 * 10000 + low word of D0' to complete the conversion.