Arndt - Algorithms for Programmers (523138), страница 5

Файл №523138 Arndt - Algorithms for Programmers (Arndt - Algorithms for Programmers) 5 страницаArndt - Algorithms for Programmers (523138) страница 52013-09-152013-09-15СтудИзба

Arndt - Algorithms for Programmers

Просмтор этого файла доступен только зарегистрированным пользователям. Но у нас супер быстрая регистрация: достаточно только электронной почты!

Регистрация/авторизация

Текст из файла (страница 5)

As in the procedure 1.4 the inner loops where swapped to save trigonometriccomputations.Extracting the ldm==1 stage of the outermost loop is again a good idea:Replace the lineforldm:=ldn to 1 step -1forldm:=ldn to 2 step -1byand insertCHAPTER 1. THE FOURIER TRANSFORM19for r:=0 to n-1 step 2{{a[r], a[r+1]} := {a[r]+a[r+1], a[r]-a[r+1]}}before the call of revbin_permute(a[], n).TBD: extraction of the j=0 case1.5Saving trigonometric computationsThe trigonometric (sin()- and cos()-) computations are an expensive part of any FFT. There are twoapparent ways for saving the involved CPU cycles, the use of lookup-tables and recursive methods.1.5.1Using lookup tablesThe idea is to save all necessary sin/cos-values in an array and later looking up the values needed.

This isa good idea if one wants to compute many FFTs of the same (small) length. For FFTs of large sequencesone gets large lookup tables that can introduce a high cache-miss rate. Thereby one is likely experiencinglittle or no speed gain, even a notable slowdown is possible. However, for a length-n FFT one does notneed to store all the (n complex or 2 n real) sin/cos-values exp(2 π i k/n), k = 0, 1, 2, 3, .

. . , n −1. Alreadya table cos(2 π i k/n), k = 0, 1, 2, 3, . . . , n/4 − 1 (of n/4 reals) contains all different trig-values that occurin the computation. The size of the trig-table is thereby cut by a factor of 8. For the lookups one canuse the symmetry relationscos(π + x)sin(π + x)= − cos(x)= − sin(x)(1.52)(1.53)(reducing the interval from 0 .

. . 2π to 0 . . . π),cos(π/2 + x)sin(π/2 + x)= − sin(x)= + cos(x)(1.54)(1.55)(reducing the interval to 0 . . . π/2) andsin(x) =cos(π/2 − x)(1.56)(only cos()-table needed).1.5.2Recursive generation of the sin/cos-valuesIn the computation of FFTs one typically needs the values{exp(i ω 0) = 1,exp(i ω δ),exp(i ω 2 δ),exp(i ω 3 δ),...}in sequence. The naive idea for a recursive computation of these values is to precompute d = exp(i ω δ)and then compute the next following value using the identity exp(i ω k δ)) = d · exp(i ω (k − 1) δ). Thismethod, however, is of no practical value because the numerical error grows (exponentially) in the process.Here is a stable version of a trigonometric recursion for the computation of the sequence: Precomputec = cos ω,(1.57)s=(1.58)α=βsin ω,1 − cos δδ= 2 (sin )22= sin δcancellation!ok.(1.59)(1.60)(1.61)CHAPTER 1.

THE FOURIER TRANSFORM20Then compute the next power from the previous as:cnextsnext= c − (α c + β s);= s − (α s − β c);(1.62)(1.63)(The underlying idea is to use (with e(x) := exp(2 π i x)) the ansatz e(ω + δ) = e(ω) − e(ω) · z which leadsto z = 1 − cos δ − i sin δ = 2 (sin 2δ )2 − i sin δ.)Do not expect to get all the precision you would get with the repeated call of the sin and cos functions,but even for very long FFTs less than 3 bits of precision are lost. When (in C) working with doublesit might be a good idea to use the type long double with the trig recursion: the sin and cos will thenalways be accurate within the double-precision.A real-world example from [FXT: fht dif core in fht/fhtdif.cc], the recursion is used if TRIG_REC is#defined:[...]double tt = M_PI_4/kh;#if defined TRIG_RECdouble s1 = 0.0, c1 = 1.0;double al = sin(0.5*tt);al *= (2.0*al);double be = sin(tt);#endif // TRIG_RECfor (ulong i=1; i<kh; i++){#if defined TRIG_RECc1 -= (al*(tt=c1)+be*s1);s1 -= (al*s1-be*tt);#elsedouble s1, c1;SinCos(tt*i, &s1, &c1);#endif // TRIG_REC[...]1.5.3Using higher radix algorithmsIt may be less apparent, that the use of higher radix FFT algorithms also saves trig-computations.

Theradix-4 FFT algorithms presented in the next sections replace all multiplications with complex factors(0, ±i) by the obvioussimpler operations. Radix-8 algorithms also simplify the special cases where sin(φ)por cos(φ) are ± 1/2. Apart from the trig-savings higher radix also brings a performance gain by theirmore unrolled structure. (Less bookkeeping overhead, less loads/stores.)1.61.6.1Higher radix DIT and DIF algorithmsMore notationAgain some useful notation, again let a be a length-n sequence.Let a(r%m) denote the subsequence of those elements of a that have subscripts x ≡ r (mod m); e.g. a(0%2)is a(even) , a(3%4) = {a3 , a7 , a11 , a15 , .

. . }. The length of a(r%m) is4 n/m.Let a(r/m) denote the subsequence of those elements of a that have indicesis a(right) , a(2/3) is the last third of a. The length of a(r/m) is also n/m.4 Throughoutthis book will m divide n, so the statement is correct.rnmn. . . (r+1)− 1; e.g. a(1/2)mCHAPTER 1.

THE FOURIER TRANSFORM1.6.221Decimation in timeFirst reformulate the radix 2 DIT step (formulas 1.43 and 1.44) in the new notation:hihin/2(0/2)F [a]= S 0/2 F a(0%2) + S 1/2 F a(1%2)hihin/2(1/2)F [a]= S 0/2 F a(0%2) − S 1/2 F a(1%2)(1.64)(1.65)(Note that S 0 is the identity operator).The radix 4 step, whose derivation is analogous to the radix 2 step, it just involves more writing anddoes not give additional insights, isIdea 1.3 (radix 4 DIT step) Radix 4 decimation in time step for the FFT:hihihihin/4(0/4)F [a]= +S 0/4 F a(0%4) + S 1/4 F a(1%4) + S 2/4 F a(2%4) + S 3/4 F a(3%4)hihihihin/4(1/4)F [a]= +S 0/4 F a(0%4) + iσS 1/4 F a(1%4) − S 2/4 F a(2%4) − iσS 3/4 F a(3%4)hihihihin/4(2/4)F [a]= +S 0/4 F a(0%4) − S 1/4 F a(1%4) + S 2/4 F a(2%4) − S 3/4 F a(3%4)hihihihin/4(3/4)F [a]= +S 0/4 F a(0%4) − iσS 1/4 F a(1%4) − S 2/4 F a(2%4) + iσS 3/4 F a(3%4)(1.66)(1.67)(1.68)(1.69)where σ = ±1 is the sign in the exponent.

In contrast to the radix 2 step, that happens to be identicalfor forward and backward transform (with both decimation frequency/time) the sign of the transformappears here.Or, more compactly:F [a]n/4(j/4)=hihi+eσ 2 i π 0 j/4 · S 0/4 F a(0%4) + eσ 2 i π 1 j/4 · S 1/4 F a(1%4)hihi+eσ 2 i π 2 j/4 · S 2/4 F a(2%4) + eσ 2 i π 3 j/4 · S 3/4 F a(3%4)(1.70)where j = 0, 1, 2, 3 and n is a multiple of 4.Still more compactly:(j/4)F [a]n/4=3Xhieσ2 i π k j/4 · S σk/4 F a(k%4)j = 0, 1, 2, 3(1.71)k=0where the summation symbol denotes element-wise summation of the sequences. (The dot indicatesmultiplication of every element of the rhs.

sequence by the lhs. exponential.)The general radix r DIT step, applicable when n is a multiple of r, is:Idea 1.4 (FFT general DIT step) General decimation in time step for the FFT:F [a](j/r)n/r=r−1Xiheσ 2 i π k j/r · S σ k/r F a(k%r)j = 0, 1, 2, .

. . , r − 1(1.72)k=01.6.3Decimation in frequencyThe radix 2 DIF step (formulas 1.50 and 1.51) was´i³hn/2(0%2)F [a]= F S 0/2 a(0/2) + a(1/2)´i³hn/2(1%2)F [a]= F S 1/2 a(0/2) − a(1/2)(1.73)(1.74)CHAPTER 1. THE FOURIER TRANSFORM22The radix 4 DIF step, applicable for n divisible by 4, isIdea 1.5 (radix 4 DIF step) Radix 4 decimation in frequency step for the FFT:(0%4)n/4(1%4)n/4(2%4)n/4(3%4)n/4(j%4)n/4F [a]F [a]F [a]F [a]====h³´iF S 0/4 a(0/4) +a(1/4) + a(2/4) +a(3/4)h³´iF S 1/4 a(0/4) + i σ a(1/4) − a(2/4) − i σ a(3/4)h³´iF S 2/4 a(0/4) −a(1/4) + a(2/4) −a(3/4)h³´iF S 3/4 a(0/4) − i σ a(1/4) − a(2/4) + i σ a(3/4)(1.75)(1.76)(1.77)(1.78)Or, more compactly:"F [a]=F Sσ j/43X#eσ 2 i π k j/4(k/4)·aj = 0, 1, 2, 3(1.79)k=0the sign of the exponent and in the shift operator is the same as in the transform.The general radix r DIF step isIdea 1.6 (FFT general DIF step) General decimation in frequency step for the FFT:#"r−1Xn/r(j%r)σ j/rσ 2 i π k j/r(k/r)F [a]= F Se·aj = 0, 1, 2, .

. . , r − 1(1.80)k=01.6.4Implementation of radix r = px DIF/DIT FFTsIf r = p 6= 2 (p prime) then the revbin_permute() function has to be replaced by its radix-p version:radix_permute(). The reordering now swaps elements x with x̃ where x̃ is obtained from x by reversingits radix-p expansion (see section 7.2).Code 1.7 (radix px DIT FFT) Pseudo code for a radix r:=px decimation in time FFT:procedure fftdit_r(a[], n, is)// complex a[0..n-1] input, result// p (hardcoded)// r == power of p (hardcoded)// n == power of p (not necessarily a power of r){radix_permute(a[], n, p)lx := log(r) / log(p) // r == p ** lxln := log(n) / log(p)ldm := (log(n)/log(p)) % lxif ( ldm != 0 ) // n is not a power of p{xx := p**lxfor z:=0 to n-1 step xx{fft_dit_xx(a[z..z+xx-1], is) // inlined length-xx dit fft}}for ldm:=ldm+lx to ln step lx{m := p**ldmmr := m/rfor j := 0 to mr-1{CHAPTER 1.

THE FOURIER TRANSFORM}}}23e := exp(is*2*PI*I*j/m)for k:=0 to n-1 step m{// all code in this block should be// inlined, unrolled and fused:// temporary u[0..r-1]for z:=0 to r-1{u[z] := a[k+j+mr*z]}radix_permute(u[], r, p)for z:=1 to r-1 // e**0 = 1{u[z] := u[z] * e**z}r_point_fft(u[], is)for z:=0 to r-1{a[k+j+mr*z] := u[z]}}Of course the loops that use the variable z have to be unrolled, the (length-px ) scratch space u[] has tobe replaced by explicit variables (e.g. u0, u1, ...

) and the r_point_fft(u[],is) shall be an inlinedpx -point FFT.With r = px there is a pitfall: if one uses the radix_permute() procedure instead of a radix-pxrevbin permute procedure (e.g. radix-2 revbin permute for a radix-4 FFT), some additional reordering isnecessary in the innermost loop: in the above pseudo code this is indicated by the radix_permute(u[],p)just before the p_point_fft(u[],is) line. One would not really use a call to a procedure, but changeindices in the loops where the a[z] are read/written for the DIT/DIF respectively.

In the code belowthe respective lines have the comment // (!).It is wise to extract the stage of the main loop where the exp()-function always has the value 1, which isthe case when ldm==1 in the outermost loop5 . In order not to restrict the possible array sizes to powersof px but only to powers of p one will supply adapted versions of the ldm==1 -loop: e.g. for a radix-4 DIFFFT append a radix 2 step after the main loop if the array size is not a power of 4.Code 1.8 (radix 4 DIT FFT) C++ code for a radix 4 DIF FFT on the array f[], the data length nmust be a power of 2, is must be +1 or -1:static const ulong RX = 4; // == rstatic const ulong LX = 2; // == log(r)/log(p) == log_2(r)voiddit4l_fft(Complex *f, ulong ldn, int is)// decimation in time radix 4 fft// ldn == log_2(n){double s2pi = ( is>0 ? 2.0*M_PI : -2.0*M_PI );const ulong n = (1<<ldn);revbin_permute(f, n);ulong ldm = (ldn&1); // == (log(n)/log(p)) % LXif ( ldm!=0 ) // n is not a power of 4, need a radix 2 step{for (ulong r=0; r<n; r+=2){Complex a0 = f[r];Complex a1 = f[r+1];5 cf.section 4.3.CHAPTER 1.

Характеристики

Тип файла

PDF-файл

Размер

1,52 Mb

Материал

Arndt - Algorithms for Programmers

Тип материала

Книга

Предмет

Численные методы

Учебное заведение

Неизвестно

Список файлов книги

arndt-algorithms-for-programmers-1304897859-1379237746.rar

Arndt - Algorithms for Programmers.pdf

Поделитесь ссылкой:

Ставлю 10/10
Все нравится, очень удобный сайт, помогает в учебе. Кроме этого, можно заработать самому, выставляя готовые учебные материалы на продажу здесь. Рейтинги и отзывы на преподавателей очень помогают сориентироваться в начале нового семестра. Спасибо за такую функцию. Ставлю максимальную оценку.

Лучшая платформа для успешной сдачи сессии
Познакомился со СтудИзбой благодаря своему другу, очень нравится интерфейс, количество доступных файлов, цена, в общем, все прекрасно. Даже сам продаю какие-то свои работы.

Студизба ван лав ❤
Очень офигенный сайт для студентов. Много полезных учебных материалов. Пользуюсь студизбой с октября 2021 года. Серьёзных нареканий нет. Хотелось бы, что бы ввели подписочную модель и сделали материалы дешевле 300 рублей в рамках подписки бесплатными.

Отличный сайт
Лично меня всё устраивает - и покупка, и продажа; и цены, и возможность предпросмотра куска файла, и обилие бесплатных файлов (в подборках по авторам, читай, ВУЗам и факультетам). Есть определённые баги, но всё решаемо, да и администраторы реагируют в течение суток.

Маленький отзыв о большом помощнике!
Студизба спасает в те моменты, когда сроки горят, а работ накопилось достаточно. Довольно удобный сайт с простой навигацией и огромным количеством материалов.

Студ. Изба как крупнейший сборник работ для студентов
Тут дофига бывает всего полезного. Печально, что бывают предметы по которым даже одного бесплатного решения нет, но это скорее вопрос к студентам. В остальном всё здорово.

Спасательный островок
Если уже не успеваешь разобраться или застрял на каком-то задание поможет тебе быстро и недорого решить твою проблему.

Всё и так отлично
Всё очень удобно. Особенно круто, что есть система бонусов и можно выводить остатки денег. Очень много качественных бесплатных файлов.

Отзыв о системе "Студизба"
Отличная платформа для распространения работ, востребованных студентами. Хорошо налаженная и качественная работа сайта, огромная база заданий и аудитория.

Отличный помощник
Отличный сайт с кучей полезных файлов, позволяющий найти много методичек / учебников / отзывов о вузах и преподователях.

Отлично помогает студентам в любой момент для решения трудных и незамедлительных задач
Хотелось бы больше конкретной информации о преподавателях. А так в принципе хороший сайт, всегда им пользуюсь и ни разу не было желания прекратить. Хороший сайт для помощи студентам, удобный и приятный интерфейс. Из недостатков можно выделить только отсутствия небольшого количества файлов.

Спасибо за шикарный сайт
Великолепный сайт на котором студент за не большие деньги может найти помощь с дз, проектами курсовыми, лабораторными, а также узнать отзывы на преподавателей и бесплатно скачать пособия.