(MRR May 10, 2000)
Files in the linpack directory

linpackc.c		The original LINPACK source from the Netlib website
			To compile it on a standard unix target:
			

linpack_unix.c	For some system configurations, the original source
			will not compile or link.  This has been modified to
			compile on grinch, which does not have "floor" in its
			standard math library.  To compile:
			gcc -lm -DSP -DUNROLL -O4 linpack_unix.c

linpackev.c		This has been modified to run on Excimer and Maximer.
			The modifications can be highlighted by running:
			diff linpackc.c linpackev.c

makefile		The MetaWare makefile.  It can be modified to choose any
			combination of options: -DSP selects single precision,
			-DDP double precision, -DROLL iterative (rolled) loops, and
			-DUNROLL unrolled loops.  This MetaWare version does not
			support AltiVec.

makefile_gcc	The GNU makefile for AltiVec-enhanced gcc.  It supports the
			previous options as well as -DALTIVEC, for an optimized
			LINPACK execution.
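
A minimal sketch of how compile-time options such as SP/DP and ROLL/UNROLL
typically select a data type and a loop style.  The macro names follow the
options described above; the REAL typedef, the axpy() function, and the
defaults used when no option is given are assumptions for illustration and
are not copied from the LINPACK source.

```c
/* Sketch defaults (assumption) so the fragment compiles with no -D flags. */
#if !defined(SP) && !defined(DP)
#define SP
#endif
#if !defined(ROLL) && !defined(UNROLL)
#define ROLL
#endif

/* Precision selection: SP -> float, otherwise DP -> double. */
#ifdef SP
typedef float REAL;
#else
typedef double REAL;
#endif

/* Hypothetical kernel: y[i] += a * x[i] for i in [0, n). */
static void axpy(int n, REAL a, const REAL *x, REAL *y)
{
    int i = 0;
#ifdef UNROLL
    /* Unrolled form: four elements per loop iteration. */
    for (; i + 4 <= n; i += 4) {
        y[i]     += a * x[i];
        y[i + 1] += a * x[i + 1];
        y[i + 2] += a * x[i + 2];
        y[i + 3] += a * x[i + 3];
    }
#endif
    /* Rolled form; also handles the leftover elements of the unrolled form. */
    for (; i < n; i++)
        y[i] += a * x[i];
}
```

Built with, for example, gcc -DDP -DUNROLL sketch.c (sketch.c is a
hypothetical file name), the same source would use double precision and the
unrolled loop.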

______________________________________________________________________________
AltiVec alignment method in linpackev.c
------------------------------------------------------------------------------
All three LINPACK functions modified to use AltiVec use the following
methodology to align single-precision floating-point data structures.  Each
array is broken into three segments: pre-aligned, vector operations, and
post-aligned.  The corresponding variables (pre_align, vops, post_align)
should add up to the total size (n).  In the worst case the array starts at
0x??4 and ends at 0x??C, which requires at most 6 non-vectored operations
(3 before the first 16-byte boundary and 3 after the last one).  In the best
case every element is handled by AltiVec instructions.  The following is
generic code that leaves out the specific algorithm being implemented:

	xalign = alignment(dx);
	nxaligndiff = n-xalign;
	pre_align = (nxaligndiff < 0) ? n : xalign;
	if ( pre_align != 0 ) {
		for (i = 0; i < pre_align; i++)
			...
			//pre-aligned method similar to non-AltiVec method
			...
	}
	vops = 4*(nxaligndiff/4);
	post_align = nxaligndiff % 4;
	if ( vops > 0 ) {
		...
		//All iterated AltiVec set-up operations
		...
		//same trip count as i < vops, because xalign is always less than 4
		for (i = xalign; i < xalign+vops; i += 4)
			...
			//All AltiVec method operations
			...
	}
	if ( post_align > 0 ) {
		for (i = n-post_align; i < n; i++)
			...
			//post-aligned method similar to non-AltiVec method
			...
	}
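
The split above can be sketched as runnable C.  This is only an illustration
under stated assumptions: alignment() is taken to return the number of floats
that precede the next 16-byte boundary, the struct and function names are
hypothetical, and the middle loop uses four scalar operations as a stand-in
for one AltiVec operation.

```c
#include <stdint.h>

/* Assumed behavior of alignment(): floats before the next 16-byte boundary
   (0 when p is already aligned).  Requires p to be 4-byte aligned. */
static int alignment(const float *p)
{
    uintptr_t misalign = (uintptr_t)p & 15u;          /* byte offset 0..15 */
    return (int)(((16u - misalign) & 15u) / sizeof(float));
}

/* Hypothetical container for the three segment sizes. */
struct split { int pre_align, vops, post_align; };

static struct split split_segments(const float *dx, int n)
{
    struct split s;
    int xalign = alignment(dx);
    int nxaligndiff = n - xalign;

    s.pre_align = (nxaligndiff < 0) ? n : xalign;
    if (nxaligndiff < 0)
        nxaligndiff = 0;              /* array ends before the boundary */
    s.vops = 4 * (nxaligndiff / 4);   /* elements covered by whole vectors */
    s.post_align = nxaligndiff % 4;   /* leftover scalar elements */
    return s;
}

/* Example algorithm: multiply every element by a, one segment at a time. */
static void scale_in_segments(float *dx, int n, float a)
{
    struct split s = split_segments(dx, n);
    int i, k;

    for (i = 0; i < s.pre_align; i++)                 /* pre-aligned */
        dx[i] *= a;
    for (i = s.pre_align; i < s.pre_align + s.vops; i += 4)
        for (k = 0; k < 4; k++)                       /* one "vector" op */
            dx[i + k] *= a;
    for (i = n - s.post_align; i < n; i++)            /* post-aligned */
        dx[i] *= a;
}
```

For an array that starts 4 bytes past a 16-byte boundary with n = 22, this
gives pre_align = 3, vops = 16 and post_align = 3, which is the 6-operation
worst case described above.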
______________________________________________________________________________
LINPACK idamax() method description for AltiVec
------------------------------------------------------------------------------
The LINPACK idamax() routine returns the index of the element with the
largest absolute value in an array of single-precision floats.  The
AltiVec-enhanced version uses the alignment method above.  The following
describes one iteration of the search in the middle (AltiVec) section:
Given: a max value, a test vector
1) Take the absolute value of the vector (vec_abs)
2) Are any elements of this vector greater than the given max? (vec_any_gt)
	a) No - go to next vector
	b) Yes
		1) Are all elements of the vector less than or equal to the zeroth element? (vec_splat_float and vec_all_le)
			a) Yes - this is the max because none of the others are bigger
				set zeroth element as max
				go to next vector
			b) No
				1) Are all elements of the vector less than or equal to the first element? (vec_splat_float and vec_all_le)
					a) Yes - this is the max because none of the others are bigger
						set first element as max
						go to next vector
					b) No
						1) Are all elements of the vector less than or equal to the second element? (vec_splat_float and vec_all_le)
							a) Yes - this is the max because none of the others are bigger
								set second element as max
								go to next vector
							b) No - the third is the max because that's the only option left
								set third element as max
								go to next vector
Best case:    decreasing value data (step 2 rarely finds a new max)
Worst case:   increasing value data (every vector falls through to the third element)
Average case: random data
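
The search steps above can be sketched in plain C, with scalar loops standing
in for the AltiVec operations named in each step.  This is an illustration
only: the function names are hypothetical, the pre-aligned segment is omitted
(the blocks simply start at index 0), and the 0-based return value with -1
for an empty array is an assumption of the sketch.

```c
/* Absolute value without needing the math library. */
static float absf(float x) { return (x < 0.0f) ? -x : x; }

/* Hypothetical scalar model of the blockwise idamax search. */
static int idamax_blocks(const float *dx, int n)
{
    int i, k, lane, best = 0;
    float dmax, a[4];

    if (n < 1)
        return -1;
    dmax = absf(dx[0]);

    for (i = 0; i + 4 <= n; i += 4) {
        int any_gt = 0;
        for (k = 0; k < 4; k++) {
            a[k] = absf(dx[i + k]);         /* step 1: vec_abs */
            if (a[k] > dmax)
                any_gt = 1;                 /* step 2: vec_any_gt */
        }
        if (!any_gt)
            continue;                       /* 2a: go to next vector */

        /* 2b: test lanes 0, 1, 2 in turn; lane 3 wins by elimination. */
        for (lane = 0; lane < 3; lane++) {
            int all_le = 1;                 /* splat lane, then vec_all_le */
            for (k = 0; k < 4; k++)
                if (a[k] > a[lane])
                    all_le = 0;
            if (all_le)
                break;
        }
        dmax = a[lane];
        best = i + lane;
    }

    /* Leftover elements, handled like the post-aligned scalar segment. */
    for (; i < n; i++) {
        if (absf(dx[i]) > dmax) {
            dmax = absf(dx[i]);
            best = i;
        }
    }
    return best;
}
```

Because the lanes are tested in order and the vector is entered only when
some element is strictly greater than the current max, ties resolve to the
lowest index.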