Arndt - Algorithms for Programmers (523138), страница 6
Текст из файла (страница 6)
THE FOURIER TRANSFORM24f[r]= a0 + a1;f[r+1] = a0 - a1;}}ldm += LX;for ( ; ldm<=ldn ; ldm+=LX){ulong m = (1<<ldm);ulong m4 = (m>>LX);double ph0 = s2pi/m;for (ulong j=0; j<m4; j++){double phi = j*ph0;double c, s, c2, s2, c3, s3;sincos(phi, &s, &c);sincos(2.0*phi, &s2, &c2);sincos(3.0*phi, &s3, &c3);Complex e = Complex(c,s);Complex e2 = Complex(c2,s2);Complex e3 = Complex(c3,s3);}}}for (ulong r=0, i0=j+r; r<n; r+=m, i0+=m){ulong i1 = i0 + m4;ulong i2 = i1 + m4;ulong i3 = i2 + m4;Complex a0 = f[i0];Complex a1 = f[i2]; // (!)Complex a2 = f[i1]; // (!)Complex a3 = f[i3];a1 *= e;a2 *= e2;a3 *= e3;Complex t0 = (a0+a2) + (a1+a3);Complex t2 = (a0+a2) - (a1+a3);Complex t1 = (a0-a2) + Complex(0,is) * (a1-a3);Complex t3 = (a0-a2) - Complex(0,is) * (a1-a3);f[i0] = t0;f[i1] = t1;f[i2] = t2;f[i3] = t3;}[FXT: fft dit4l in fft/fftdit4l.cc][FXT: fft dit4 and fft dit4 core in fft/fftdit4.cc]Code 1.9 (radix 4 DIF FFT) Pseudo code for a radix 4 DIF FFT on the array a[], the data lengthn must be a power of 2, is must be +1 or -1:procedure fftdif4(a[],ldn,is)// complex a[0..2**ldn-1] input, result{n := 2**ldnfor ldm := ldn to 2 step -2{m := 2**ldmmr := m/4for j := 0 to mr-1{e := exp(is*2*PI*I*j/m)e2 := e * ee3 := e2 * eCHAPTER 1.
THE FOURIER TRANSFORMfor r := 0 to n-1 step m{u0 := a[r+j]u1 := a[r+j+mr]u2 := a[r+j+mr*2]u3 := a[r+j+mr*3]x := u0 + u2y := u1 + u3t0 := x + y // == (u0+u2)t1 := x - y // == (u0+u2)x := u0 - u2y := (u1 - u3)*I*ist2 := x + y // == (u0-u2)t3 := x - y // == (u0-u2)t1 := t1 * et2 := t2 * e2t3 := t3 * e3a[r+j]:= t0a[r+j+mr]:= t2 // (!)a[r+j+mr*2] := t1 // (!)a[r+j+mr*3] := t3}}25+ (u1+u3)- (u1+u3)+ (u1-u3)*I*is- (u1-u3)*I*is}}if is_odd(ldn) then // n not a power of 4{for r:=0 to n-1 step 2{{a[r], a[r+1]} := {a[r]+a[r+1], a[r]-a[r+1]}}}revbin_permute(a[],n)[FXT: fft dif4l in fft/fftdif4l.cc][FXT: fft dif4 and fft dif4 core in fft/fftdif4.cc]Note the ‘swapped’ order in which t1, t2 are copied back in the innermost loop, this is whatradix_permute(u[], r, p) was supposed to do.The multiplication by the imaginary unit (in the statement y := (u1 - u3)*I*is) should of course beimplemented without any multiplication statement: one could unroll it as(dr,di) := u1 - u2if is>0 thenelse// dr,di = real,imag part of differencey := (-di,dr) // use (a,b)*(0,+1) == (-b,a)y := (di,-dr) // use (a,b)*(0,-1) == (b,-a)In section 1.8 it is shown how the if-statement can be eliminated.If n is not a power of 4, then ldm is odd during the procedure and at the last pass of the main loop onehas ldm=1.To improve the performance one will instead of the (extracted) radix 2 loop supply extracted radix 8 andradix 4 loops.
Then, depending on whether n is a power of 4 or not one will use the radix 4 or the radix8 loop, respectively. The start of the main loop then has to befor ldm := ldn to 3 step -Xand at the last pass of the main loop one has ldm=3 or ldm=2.The radix_permute() procedure is given in section 7.2 on page 126.1.7Split radix Fourier transforms (SRFT)The idea underlying the split radix FFT is to use both radix-2 and radix-4 decompositions at the sametime.CHAPTER 1. THE FOURIER TRANSFORM26From the radix-2 (DIF) decomposition (relations 1.73 and 1.74) we use the first, the one for the evenindices. For the odd indices we use the radix-4 splitting (relations 1.76 and 1.78, slightly reordered).Idea 1.7 (split radix 4/2 DIF step) Radix 4 decimation in frequency step for the split radix FFT:h³´in/2(0%2)F [a]= F a(0/2) + a(1/2)(1.81)h³³´³´´in/4(1%4)F [a]= F S 1/4 a(0/4) − a(2/4) + i σ a(1/4) − a(3/4)(1.82)h³³´³´´in/4(3%4)F [a]= F S 3/4 a(0/4) − a(2/4) − i σ a(1/4) − a(3/4)(1.83)Now we have expressed the length-N = 2n FFT as one length-N/2 and two length-N/4 FFTs.
Notethat S 3/4 = S −1/4 which means a saving in the trigonometric computations. The nice feature is that theoperation count of the split radix FFT is actually lower than that of the radix-4 FFT.Using our nice notation it is almost trivial to write down the DIT version of the algorithm:Idea 1.8 (split radix 4/2 DIT step) Radix 4 decimation in frequency step for the split radix³ hihi´n/2(0/2)F [a]=F a(0%2) + S 1/2 F a(1%2)³ hihi´³ hihi´n/4(1/4)F [a]=F a(0%4) − S 2/4 F a(2%4) + iσS 1/4 F a(1%4) − S 2/4 F a(3%4)³ hihi´³ hihi´n/4(3/4)F [a]=F a(0%4) − S 2/4 F a(2%4) − iσS 1/4 F a(1%4) − S 2/4 F a(3%4)FFT:(1.84)(1.85)(1.86)Code 1.10 (split radix DIF FFT) Pseudo code for the split radix DIF algorithm, is must be -1 or +1:procedure fft_splitradix_dif(x[],y[],ldn,is){n := 2**ldnif n<=1 returnn2 := 2*nfor k:=1 to ldn{n2 := n2 / 2n4 := n2 / 4e := 2 * PI / n2for j:=0 to n4-1{a := j * ecc1 := cos(a)ss1 := sin(a)cc3 := cos(3*a) // == 4*cc1*(cc1*cc1-0.75)ss3 := sin(3*a) // == 4*ss1*(0.75-ss1*ss1)ix := jid := 2*n2while ix<n-1{i0 := ixwhile i0 < n{i1 := i0 + n4i2 := i1 + n4i3 := i2 + n4{x[i0], r1} := {x[i0] + x[i2], x[i0]{x[i1], r2} := {x[i1] + x[i3], x[i1]{y[i0], s1} := {y[i0] + y[i2], y[i0]{y[i1], s2} := {y[i1] + y[i3], y[i1]{r1, s3} := {r1+s2, r1-s2}{r2, s2} := {r2+s1, r2-s1}-x[i2]}x[i3]}y[i2]}y[i3]}CHAPTER 1.
THE FOURIER TRANSFORM// complex mult:x[i2] := r1*cc1y[i2] := -s2*cc1// complex mult:x[i3] := s3*cc3y[i3] := r2*cc3i0 := i0 + id}}}27(x[i2],y[i2]) := -(s2,r1) * (ss1,cc1)- s2*ss1- r1*ss1(y[i3],x[i3]) := (r2,s3) * (cc3,ss3)+ r2*ss3- s3*ss3}ix := 2 * id - n2 + jid := 4 * id}ix := 1id := 4while ix<n{for i0:=ix-1 to n-id step id{i1 := i0 + 1{x[i0], x[i1]} := {x[i0]+x[i1], x[i0]-x[i1]}{y[i0], y[i1]} := {y[i0]+y[i1], y[i0]-y[i1]}}ix := 2 * id - 1id := 4 * id}revbin_permute(x[],n)revbin_permute(y[],n)if is>0{for j:=1 to n/2-1{swap(x[j],x[n-j])swap(y[j],y[n-j])}}[FXT: split radix fft in fft/fftsplitradix.cc] uses a DIF core as above (and given in [28]).For the (type-) complex version of the SRFT [FXT: split radix fft in fft/cfftsplitradix.cc] bothDIF and DIT core routines have been implemented.1.8Inverse FFT for freeSuppose you programmed some FFT algorithm just for one value of is, the sign in the exponent.
Thereis a nice trick that gives the inverse transform for free, if your implementation uses separate arrays forreal and imaginary part of the complex sequences to be transformed. If your procedure is something likeprocedure my_fft(ar[], ai[], ldn) // only for is==+1 !// real ar[0..2**ldn-1] input, result, real part// real ai[0..2**ldn-1] input, result, imaginary part{// incredibly complicated code// that you cannot see how to modify// for is==-1}Then you don’t need to modify this procedure at all in order to get the inverse transform. If you wantthe inverse transform somewhere then just, instead ofmy_fft(ar[], ai[], ldn)// forward ffttypemy_fft(ai[], ar[], ldn)// backward fftCHAPTER 1.
THE FOURIER TRANSFORM28Note the swapped real- and imaginary parts ! The same trick works if your procedure coded for fixedis= −1.To see, why this works, we first note thatF [a + i b] = F [aS ] + i σ F [aA ] + i F [bS ] + σ F [bA ]= F [aS ] + i F [bS ] + i σ (F [aA ] − i F [bA ])(1.87)(1.88)and the computation with swapped real- and imaginary parts givesF [b + i a] =F [bS ] + i F [aS ] + i σ (F [bA ] − i F [aA ])(1.89). . . but these are implicitly swapped at the end of the computation, givingF [aS ] + i F [bS ] − i σ (F [aA ] − i F [bA ])=F −1 [a + i b](1.90)When the type Complex is used then the best way to achieve the inverse transform may be to reversethe sequence according to the symmetry of the FT ([FXT: reverse nh in perm/reverse.h], reorderingby k 7→ k −1 mod n).
While not really ‘free’ the additional work shouldn’t matter in most cases.With real-to-complex FTs (R2CFT) the trick is to reverse the imaginary part after the transform. Obviously for the complex-to-real FTs (R2CFT) one has to reverse the imaginary part before the transform.Note that in the latter two cases the modification does not yield the inverse transform but the one withthe ‘other’ sign in the exponent.
Sometimes it may be advantageous to reverse the input of the R2CFTbefore transform, especially if the operation can be fused with other computations (e.g. with copying inor with the revbin-permutation).1.9Real valued Fourier transformsThe Fourier transform of a purely real sequence c = F [a] where a ∈ R has6 a symmetric real part(<c̄ = <c) and an antisymmetric imaginary part (=c̄ = −=c). Simply using a complex FFT for realinput is basically a waste of a factor 2 of memory and CPU cycles.
There are several ways out:• sincos wrappers for complex FFTs• usage of the fast Hartley transform• a variant of the matrix Fourier algorithm• special real (split radix algorithm) FFTsAll techniques have in common that they store only half of the complex result to avoid the redundancydue to the symmetries of a complex FT of purely real input.
The result of a real to (half-) complex FT(abbreviated R2CFT) must contain the purely real components c0 (the DC-part of the input signal)and, in case n is even, cn/2 (the Nyquist frequency part). The inverse procedure, the (half-) complex toreal transform (abbreviated C2RFT) must be compatible to the ordering of the R2CFT. All procedurespresented here use the following scheme for the real part of the transformed sequence c in the outputarray a[]:6 cf.relation 1.20a[0]a[1]==<c0<c1a[2]=...<c2a[n/2]=<cn/2(1.91)CHAPTER 1. THE FOURIER TRANSFORM29For the imaginary part of the result there are two schemes:Scheme 1 (‘parallel ordering’) isa[n/2 + 1]a[n/2 + 2]a[n/2 + 3]a[n − 1]===...==c1=c2=c3===...==cn/2−1=cn/2−2=cn/2−3(1.92)=cn/2−1Scheme 2 (‘antiparallel ordering’) isa[n/2 + 1]a[n/2 + 2]a[n/2 + 3]a[n − 1](1.93)=c1Note the absence of the elements =c0 and =cn/2 which are zero.1.9.1Real valued FT via wrapper routinesA simple way to use a complex length-n/2 FFT for a real length-n FFT (n even) is to use some postand preprocessing routines.
For a real sequence a one feeds the (half length) complex sequence f =a(even) + i a(odd) into a complex FFT. Some post-processing is necessary. This is not the most elegantreal FFT available, but it is directly usable to turn complex FFTs of any (even) length into a real-valuedFFT.TBD: formulas for realFFTwrapHere is the C++ code for a real to complex FFT (R2CFT):voidwrap_real_complex_fft(double *f, ulong ldn, int is/*=+1*/)//// ordering of output:// f[0]= re[0](DC part, purely real)// f[1]= re[n/2] (nyquist freq, purely real)// f[2]= re[1]// f[3]= im[1]// f[4]= re[2]// f[5]= im[2]//...// f[2*i]= re[i]// f[2*i+1] = im[i]//...// f[n-2]= re[n/2-1]// f[n-1]= im[n/2-1]//// equivalent:// { fht_real_complex_fft(f, ldn, is); zip(f, n); }//{if ( ldn==0 ) return;fht_fft((Complex *)f, ldn-1, +1);const ulong n = 1UL<<ldn;const ulong nh = n/2, n4 = n/4;const double phi0 = M_PI / nh;for(ulong i=1; i<n4; i++){ulong i1 = 2 * i;// re low [2, 4, ..., n/2-2]ulong i2 = i1 + 1; // im low [3, 5, ..., n/2-1]CHAPTER 1.
THE FOURIER TRANSFORMulong i3 = n - i1; // re hi [n-2, n-4, ..., n/2+2]ulong i4 = i3 + 1; // im hi [n-1, n-3, ..., n/2+3]double f1r, f2i;sumdiff05(f[i3], f[i1], f1r, f2i);double f2r, f1i;sumdiff05(f[i2], f[i4], f2r, f1i);double c, s;double phi = i*phi0;SinCos(phi, &s, &c);double tr, ti;cmult(c, s, f2r, f2i, tr, ti);// f[i1] = f1r + tr; // re low// f[i3] = f1r - tr; // re hi// =^=sumdiff(f1r, tr, f[i1], f[i3]);// f[i4] = is * (ti + f1i); //// f[i2] = is * (ti - f1i); //// =^=if ( is>0 ) sumdiff( ti, f1i,elsesumdiff(-ti, f1i,}}sumdiff(f[0], f[1]);if ( nh>=2 ) f[nh+1] *= is;im hiim lowf[i4], f[i2]);f[i2], f[i4]);TBD: eliminate if-statement in loopC++ code for a complex to real FFT (C2RFT):voidwrap_complex_real_fft(double *f, ulong ldn, int is/*=+1*/)//// inverse of wrap_real_complex_fft()//// ordering of input:// like the output of wrap_real_complex_fft(){if ( ldn==0 ) return;const ulong n = 1UL<<ldn;const ulong nh = n/2, n4 = n/4;const double phi0 = -M_PI / nh;for(ulong i=1; i<n4; i++){ulong i1 = 2 * i;// re low [2, 4, ..., n/2-2]ulong i2 = i1 + 1; // im low [3, 5, ..., n/2-1]ulong i3 = n - i1; // re hi [n-2, n-4, ..., n/2+2]ulong i4 = i3 + 1; // im hi [n-1, n-3, ..., n/2+3]double f1r, f2i;// double f1r = f[i1] + f[i3]; // re symm// double f2i = f[i1] - f[i3]; // re asymm// =^=sumdiff(f[i1], f[i3], f1r, f2i);double f2r, f1i;// double f2r = -f[i2] - f[i4]; // im symm// double f1i = f[i2] - f[i4]; // im asymm// =^=sumdiff(-f[i4], f[i2], f1i, f2r);double c, s;double phi = i*phi0;SinCos(phi, &s, &c);double tr, ti;cmult(c, s, f2r, f2i, tr, ti);// f[i1] = f1r + tr; // re low// f[i3] = f1r - tr; // re hi// =^=sumdiff(f1r, tr, f[i1], f[i3]);30CHAPTER 1.