{___________________________________________________________________________ RFFT2.ASM ADSP-21020 Radix-2 DIT Real FFT Does a radix-2 FFT of length 32 or greater on input data x(n). This is accomplished by doing a N/2 (HN) complex FFT of y(n), where Re[y(n)] = x(2n) and Im[y(n)] = x(2n+1) for n=0,1,...,N-1. This is followed by some calculation to derive X(k) from Y(k). This method has been refered to as "Packing". Input data is NOT destroyed during the course of this routine. N real normal-ordered input in DM. N/2+1 real part of fft stored in DM, N/2+1 imag part of fft stored in PM. Two N/4 long Cosine tables stored in PM, Two N/4 long Sine tables in DM. Note that only (N/2+1) elements are necessary to define the real FFT spectra because of its Hermitian (complex conjugate) symmetry. Based on FFTRAD2 by Kapriel Karagozian, Analog Devices DSP Div.(617) 461-3672 Author: 10-APR-90, Ronnin Yee , Analog Devices DSP Div.(617) 461-3672 Calling Information: pm(cosine[N/4]) - cos(2pi*n/HN) table from RFFT2TBL program pm(h_cosine[N/4]) - 0.5*cos(pi*n/HN) table from RFFT2TBL program dm(sine[N/4) - sin(2pi*n/HN) table from RFFT2TBL program dm(h_sine[N/4) - 0.5*sin(pi*n/HN) table from RFFT2TBL program dm(real[N]) - real input array, bitreved to working array (Note: Because the bit reversed address mode is used with the array real, it must start at addresses that are integer multiples of the length (N) of the transform, (i.e. 0,N,2N,3N,...). This is accomplished by specifing the segment starting at that addresses in the architecture file and placing the variable alone in its own segment. This address must also be reflected in the preprocessor variables IRE and IREP1 in bit reversed format.) Results: dm(refft[N/2+1]) - real working array and output pm(imfft[N/2+1]) - imaginary working array and output Benchmarks: Radix-2, real time (ms) FFT Length cycles 20MHz 25MHZ ---------- ------ -------- -------- 64 636 .032 .026 128 1226 .061 .049 256 2504 .125 .100 512 5286 .264 .211 1024 11331 .566 .453 2048 24418 1.22 .977 4096 52608 2.63 2.10 8192 113054 5.65 4.52 First 2 Stages - 8 cycles per 4 butterflies Middle Stages - 4 cycles per butterfly 2nd to Last Stage - 9 cycles per 2 butterflies Last FFT Stage - 5 cycles per butterfly group Conversion Stage - 6 cycles per 2 elements (N/2 elements total) Memory Usage: pm code = 187 words, pm data = N+1 words, dm data = 2*N+1 words ____________________________________________________________________________} { Include for symbolic definition of system register bits } #include "demo.h" {________These constants are independent of the number of points____________} #define BFLY8 4 {Offset between bf branches in a group of 8} .SEGMENT/DM dm_sram; .VAR sine[HN/2] = "ts.dat"; { imag twiddle factors, from RFFT2TBL } .VAR h_sine[HN/2]= "ths.dat"; { end comp. sin coeff. from RFFT2TBL} .GLOBAL sine; .GLOBAL h_sine; .ENDSEG; .SEGMENT/DM dm_algn0; { Segment Addr. = integer multiple of N } .VAR real[N]; { input real array } .GLOBAL real; .ENDSEG; .SEGMENT/DM dm_algn1; { Segment Addr. = integer multiple of N } .VAR refft[HN+1]; { real part of fft} .GLOBAL refft; { Set on N boundry cause used by irfft2} .ENDSEG; .SEGMENT/PM pm_sram; .VAR cosine[HN/2]= "tc.dat"; { real twiddle factors, from RFFT2TBL } .VAR h_cosine[HN/2]= "thc.dat"; { end comp. cos coeff. from RFFT2TBL} .VAR imfft[HN+1]; { imag result } .GLOBAL cosine; .GLOBAL h_cosine; .GLOBAL imfft; .ENDSEG; .GLOBAL rfft2; .SEGMENT/PM pm_sram; {______________________________begin FFT__________________________________} rfft2: bit set MODE1 BR0; { enable bit reverse of i0 } {Do bitrev and packing within first two stages} b0=IREP1; { Points to beginning of odd indices to be read in } { bit reversed order } l0=0; m0=BRMODIFY; b8=imfft; l8=HN; m8=1; {Store bit-reversed odd indices in pm and consider them imaginary} f0=dm(i0,m0); LCNTR=HN-1, DO (PC,1) UNTIL LCE; f0=dm(i0,m0), pm(i8,m8)=f0; pm(i8,m8)=f0; {Now bitrev even indices within first two stages} b0=IRE; { Points to input real array to be read in } { bit reversed order } b2=refft; l2=HN; m1=1; { This loop increments forward +1} i8=imfft; b10=imfft; l10=HN; {Do the first two stages (actually a radix-4 FFT stage)} f0=dm(i0,m0), f1=pm(i8,m8); f2=dm(i0,m0), f3=pm(i8,m8); f0=f0+f2, f2=f0-f2, f4=dm(i0,m0), f5=pm(i8,m8); f1=f1+f3, f3=f1-f3, f6=dm(i0,m0), f7=pm(i8,m8); f4=f6+f4, f6=f6-f4; f5=f5+f7, f7=f5-f7; f8=f0+f4, f9=f0-f4; f10=f1+f5, f11=f1-f5; LCNTR=HN/4, do FSTAGE until LCE; { do N/4 simple radix-4 butterflies } f12=f2+f7, f13=f2-f7, f0=dm(i0,m0), f1=pm(i8,m8); f14=f3+f6, f15=f3-f6, f2=dm(i0,m0), f3=pm(i8,m8); f0=f0+f2, f2=f0-f2, f4=dm(i0,m0), f5=pm(i8,m8); f1=f1+f3, f3=f1-f3, f6=dm(i0,m0), f7=pm(i8,m8); f4=f6+f4, f6=f6-f4, dm(i2,m1)=f8, pm(i10,m8)=f10; f5=f5+f7, f7=f5-f7, dm(i2,m1)=f12, pm(i10,m8)=f14; f8=f0+f4, f9=f0-f4, dm(i2,m1)=f9, pm(i10,m8)=f11; FSTAGE: f10=f1+f5, f11=f1-f5, dm(i2,m1)=f13, pm(i10,m8)=f15; {middle stages loop } bit clr MODE1 BR0; {finished with bitreversal} b0=refft; l0=HN; b1=sine; l1=@sine; b9=cosine; l9=@cosine; b10=imfft; l10=HN; b11=imfft; l11=HN; m0=-BFLY8; m1=-HN/8; m2=-BFLY8-1; m9=-HN/8; m11=-1; r2=2; r3=-BFLY8; {initializes m0,10 - incr for butterf branches } r5=BFLY8; {counts # butterflies per a group } r9=(-2*BFLY8)-1; {initializes m12 - wrap around to next grp + 1 } r10=-2*BFLY8; {initializes m8 - incr between groups } r13=-BFLY8-1; {initializes m2,13 - wrap to bgn of 1st group } r15=HN/8; {# OF GROUPS IN THIRD STAGE} f1=dm(i1,m1), f7=pm(i9,m9); {set pointers to tables to 1st coeff. } LCNTR=STAGES-5, do end_stage until LCE; {# OF STAGES TO BE HANDLED = LOG2(HN)-4} m8=r10; m10=r3; m12=r9; i0=refft+HN-1; i2=refft+HN-1; i8=imfft+HN-1; i10=imfft+HN-1; i11=imfft+HN-1; r15=r15-r2, m13=r13; {CALCULATE # OF CORE } {BFLIES/GROUP IN THIS STAGE} f0=dm(i1,m1), f7=pm(i8,m8); f12=f0*f7, f6=dm(i0,m0), f1=pm(i9,m9); f8=f1*f6, modify(i11,m10); f11=f1*f7, f7=pm(i8,m8); f14=f0*f6, f12=f8+f12, f8=dm(i0,m0); f12=f0*f7, f13=f8+f12, f10=f8-f12, f6=dm(i0,m0); {Each iteration does another set of bttrflys in each group} LCNTR=r5, do end_group until LCE; {# OF BUTTERFLIES/GROUP IN THIS STAGE} {core butterfly loop} LCNTR=r15, do end_bfly until LCE; {Do a butterfly in each group - 2} f8=f1*f6, f14=f11-f14, dm(i2,m0)=f10, f9=pm(i11,m8); f11=f1*f7, f3=f9+f14, f9=f9-f14, dm(i2,m0)=f13, f7=pm(i8,m8); f14=f0*f6, f12=f8+f12, f8=dm(i0,m0), pm(i10,m10)=f9; end_bfly: f12=f0*f7, f13=f8+f12, f10=f8-f12, f6=dm(i0,m0), pm(i10,m10)=f3; {finish up last bttrfly and set up for next stage} f8=f1*f6, f14=f11-f14, dm(i2,m0)=f10, f9=pm(i11,m8); f11=f1*f7, f4=f9+f14, f9=f9-f14, dm(i2,m0)=f13, f14=pm(i8,m11); f14=f0*f6, f12=f8+f12, f8=dm(i0,m2), pm(i10,m10)=f9; f13=f8+f12, f10=f8-f12, f0=dm(i1,m1), f7=pm(i8,m8); {dm:sin} f14=f11-f14, dm(i2,m0)=f10, f9=pm(i11,m12); {start on next butterfly in each group} f12=f0*f7, f3=f9+f14, f9=f9-f14, f6=dm(i0,m0), f1=pm(i9,m9); {pm:cos} f8=f1*f6, dm(i2,m2)=f13, pm(i10,m10)=f4; f11=f1*f7, pm(i10,m10)=f9; f14=f0*f6, f12=f8+f12, f8=dm(i0,m0), f7=pm(i8,m8); end_group: f12=f0*f7, f13=f8+f12, f10=f8-f12, f6=dm(i0,m0), pm(i10,m13)=f3; r4=r15+r2, i1=b1; {PREPARE R4 FOR #OF BFLIES CALC} r15=ashift r4 by -1; {# OF BFLIES/GRP IN NEXT STAGE} r4=-r15, i9=b9; m1=r4; {update inc for sin & cos } m9=r4; r5=ashift r5 by 1, f1=dm(i1,m1); {update # bttrfly in a grp} r3=-r5; { inc for bttrfly branch} r13=r3-1, m0=r3; { wrap to 1st grp } r10=ashift r3 by 1, f7=pm(i9,m9); { inc between grps } end_stage: r9=r10-1, m2=r13; { wrap to grp +1 } {_________ next to last stage__________} m1=-2; {modifier to sine table pntr } m8=r10; {incr between groups } m9=-2; {modifier to cosine table pntr } m10=r3; {incr between bttrfly branches } m12=r9; {wrap around to next grp + 1 } m13=r13; {wrap to bgn of 1st group } i0=refft+HN-1; i1=sine+(HN/2)-2; {pntr to 1st sine coeff } i2=refft+HN-1; i8=imfft+HN-1; i9=cosine+(HN/2)-2; {pntr to 1st cosine coeff } i10=imfft+HN-1; i11=imfft+HN-1; f0=dm(i1,m1), f7=pm(i8,m8); f12=f0*f7, f6=dm(i0,m0), f1=pm(i9,m9); f8=f1*f6, modify(i11,m10); f11=f1*f7, f7=pm(i8,m12); f14=f0*f6, f12=f8+f12, f8=dm(i0,m0); f12=f0*f7, f13=f8+f12, f10=f8-f12, f6=dm(i0,m0); {Do the HN/4 butterflies in the two groups of this stage} LCNTR=HN/4, do end_group2 until LCE; f8=f1*f6, f14=f11-f14, dm(i2,m0)=f10, f9=pm(i11,m8); f11=f1*f7, f3=f9+f14, f9=f9-f14, dm(i2,m0)=f13, f1=pm(i9,m9); f14=f0*f6, f12=f8+f12, f8=dm(i0,m2), pm(i10,m10)=f9; f13=f8+f12, f10=f8-f12, f0=dm(i1,m1), f7=pm(i8,m8); f12=f0*f7, f14=f11-f14, f6=dm(i0,m0), f9=pm(i11,m12); f8=f1*f6, f3=f9+f14, f9=f9-f14, dm(i2,m0)=f10, pm(i10,m10)=f3; f11=f1*f7, dm(i2,m2)=f13, pm(i10,m10)=f9; f14=f0*f6, f12=f8+f12, f8=dm(i0,m0), f7=pm(i8,m12); end_group2: f12=f0*f7, f13=f8+f12, f10=f8-f12, f6=dm(i0,m0), pm(i10,m13)=f3; { The last stage } m0=-HN/2; m2=-HN/2-1; m10=m0; m13=m2; i0=refft+HN-1; i1=sine+(HN/2)-1; {pntr to 1st sine coeff } i2=refft+HN-1; i8=imfft+HN-1; i9=cosine+(HN/2)-1; {pntr to 1st cosine coeff } i10=imfft+HN-1; i11=imfft+HN-1; m1=-1; {modifiers to coeff tables } m9=-1; {start first bttrfly} f0=dm(i1,m1), f7=pm(i8,m11); f12=f0*f7, f6=dm(i0,m0), f1=pm(i9,m9); f8=f1*f6, modify(i11,m10); f11=f1*f7; f14=f0*f6, f12=f8+f12, f8=dm(i0,m2), f9=pm(i11,m11); {do HN/2 bttrflys in the last stage} LCNTR=HN/2, do last_stage until LCE; f13=f8+f12, f10=f8-f12, f0=dm(i1,m1), f7=pm(i8,m11); f12=f0*f7, f14=f11-f14, f6=dm(i0,m0), f1=pm(i9,m9); f8=f1*f6, f3=f9+f14, f15=f9-f14, dm(i2,m0)=f10, f9=pm(i11,m11); f11=f1*f7, dm(i2,m2)=f13, pm(i10,m10)=f15; last_stage: f14=f0*f6, f12=f8+f12, f8=dm(i0,m2), pm(i10,m13)=f3; {compute the N point FFT of real from the packed N/2 transform} i0=refft; { real src pntr forward} b1=h_sine+1; { real sine pntr} i2=refft+HN-1; { real src pntr backward} i3=refft; { real dest pntr forward} i4=refft+HN; { real dest pntr backward} l4=0; i8=imfft; { imag src pntr forward} b9=h_cosine+1; { imag cosine pntr} i10=imfft+HN-1; { imag src pntr backward} i11=i8; { imag dest pntr forward} i12=imfft+HN; { imag dest pntr backward} l12=0; m0 = 1; m8 = 1; f2 = 0.5; { this register is used to div by 2} f15= 0.0; {do first pair, U(0)+jV(0) & U(HN)+jV(HN)} f12=dm(i0,m0), f9 =pm(i8,m8); f11=f12+f9, f3=f12-f9, f12=dm(i0,m0), f9 =pm(i8,m8); f10=PASS f15, f8 =dm(i2,m1), f13=pm(i10,m9); dm(i3,m0)=f11; {do the rest in pairs (n,HN-n) except the mid-point} f4=f8+f12, f5=f8-f12, f0 =dm(i1,m0), f1 =pm(i9,m8); LCNTR = (HN/2)-1, DO trans UNTIL LCE; f8=f0*f5, f6=f9+f13, f7=f9-f13; f12=f1*f6, f10=f10+f15,f15=f10-f15, dm(i4,m1)=f3, f13=pm(i10,m9); f11=f2*f4, f14=f8+f12, f12=dm(i0,m0), pm(i11,m8)=f10; f10=f1*f5, f11=f11+f14,f3=f11-f14, f8 =dm(i2,m1), pm(i12,m9)=f15; f14=f0*f6, f4=f8+f12, f5=f8-f12, f0 =dm(i1,m0), f1 =pm(i9,m8); trans: f15=f2*f7, f10=f10-f14, dm(i3,m0)=f11, f9 =pm(i8,m8); {finish up and do the mid-point} f10=f10+f15,f15=f10-f15, dm(i4,m1)=f3; f14=-f9, pm(i11,m8)=f10; rts (db); {finished} pm(i12,m9)=f15; dm(i3,m0)=f8, pm(i11,m8)=f14; {_______________________________________________________________________} .ENDSEG;