Arndt - Algorithms for Programmers (523138), страница 34
Текст из файла (страница 34)
.The yellow-code might be a good candidate for ‘randomization’ of binary words.The blue-code maps any range [0 . . . 2k − 1] onto itself. Both blue-code and yellow-code are involutions(self-inverse).The transformsinline ulong red_code(ulong a){ulong s = BITS_PER_LONG >> 1;ulong m = ~0UL >> s;while ( s )CHAPTER 8. SOME BIT WIZARDRY{ulong u = a & m;ulong v = a ^ u;a = v ^ (u<<s);a ^= (v>>s);s >>= 1;m ^= (m<<s);}return}194a;andulong s = BITS_PER_LONG >> 1;ulong m = ~0UL << s;while ( s ){ulong u = a & m;ulong v = a ^ u;a = v ^ (u>>s);a ^= (v<<s);s >>= 1;m ^= (m>>s);}return a;look like0:1:2:3:4:5:6:7:8:9:10:11:12:13:14:15:16:17:18:19:20:21:22:23:24:25:26:27:28:29:30:31:r=................................r=1...............................r=11..............................r=.1..............................r=1.1.............................r=..1.............................r=.11.............................r=111.............................r=1111............................r=.111............................r=..11............................r=1.11............................r=.1.1............................r=11.1............................r=1..1............................r=...1............................r=1...1...........................r=....1...........................r=.1..1...........................r=11..1...........................r=..1.1...........................r=1.1.1...........................r=111.1...........................r=.11.1...........................r=.1111...........................r=11111...........................r=1.111...........................r=..111...........................r=11.11...........................r=.1.11...........................r=...11...........................r=1..11...........................01212123432323212123234345434323c=................................c=11111111111111111111111111111111c=.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1c=1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.c=..11..11..11..11..11..11..11..11c=11..11..11..11..11..11..11..11..c=.11..11..11..11..11..11..11..11.c=1..11..11..11..11..11..11..11..1c=...1...1...1...1...1...1...1...1c=111.111.111.111.111.111.111.111.c=.1...1...1...1...1...1...1...1..c=1.111.111.111.111.111.111.111.11c=..1...1...1...1...1...1...1...1.c=11.111.111.111.111.111.111.111.1c=.111.111.111.111.111.111.111.111c=1...1...1...1...1...1...1...1...c=....1111....1111....1111....1111c=1111....1111....1111....1111....c=.1.11.1..1.11.1..1.11.1..1.11.1.c=1.1..1.11.1..1.11.1..1.11.1..1.1c=..1111....1111....1111....1111..c=11....1111....1111....1111....11c=.11.1..1.11.1..1.11.1..1.11.1..1c=1..1.11.1..1.11.1..1.11.1..1.11.c=...1111....1111....1111....1111.c=111....1111....1111....1111....1c=.1..1.11.1..1.11.1..1.11.1..1.11c=1.11.1..1.11.1..1.11.1..1.11.1..c=..1.11.1..1.11.1..1.11.1..1.11.1c=11.1..1.11.1..1.11.1..1.11.1..1.c=.1111....1111....1111....1111...c=1....1111....1111....1111....11103216161616161682482482424816161616161616161616161616161616[FXT: file demo/bittransforms-red-demo.cc]Relations between the transformsWe write B for the blue-code (transform), Y for the yellow-code and r for bit-reversal (the revbinfunction).
Then B and Y are connected by the relationsB=Y rY(8.1)YB==BrBrY r(8.2)(8.3)Yr==rBrY BY(8.4)(8.5)r=BY B(8.6)CHAPTER 8. SOME BIT WIZARDRY195As said, B and Y are self-inverse:B −1Y −1==BYB B = idY Y = id(8.7)(8.8)The red-code and the cyan-code are not involutions (‘square roots of identity’) but third roots of identity(Using R for the red-code, C for the cyan-code):RRR=CCCRC==R−1 = R R = CidC −1 = C C = RC R = id(8.9)(8.10)(8.11)RC(8.12)(8.13)idBy construction==rBrYSimilar inter-relations as for B and Y hold for R and C:RCRCRC======CrCRrRrCrrRrRC RC RC(8.14)(8.15)(8.16)(8.17)(8.18)(8.19)Y R = RB = BC = C Y(8.20)==RY = Y C = RBR = C BCC B = BR = RY R = C Y C(8.21)(8.22)RC==(8.23)(8.24)One hasr=FurtherBYBY =BCB =Y CYY B = BRB = Y RYid =BY C = RY B(8.25)id =id =C BY = BRYY CB =Y BR(8.26)(8.27)The multiplication table lists Z = Y X.
The R in the third column of the second row says that r B = R.The letter i is used for identity (id). An asterisk says that X Y = Y X.irBYRCiri* r*r* i*B* CY* RR* YC* BBB*Ri*CrYYY*CRi*BrRR*BYrC*i*CC*YrBi*R*CHAPTER 8. SOME BIT WIZARDRY196Relations to Gray code and green codeWrite g for the Gray code, then:gBgBgBg−1g B g −1gB= id= B= B(8.28)(8.29)(8.30)= B g −1(8.31)Let Sk be the operator that rotates a word by k bits (bit zero is moved to position k, use [FXT:bit rotate sgn in auxbit/bitrotate.h]) thenY S+1 YY S−1 YY Sk Y===gg −1gk(8.32)(8.33)(8.34)‘Shift in the frequency domain is derivative in time domain’.Let e be the green code operator, thenB S+1 BB S−1 BB Sk B===e−1ee−k(8.35)(8.36)(8.37)More transforms by symbolic poweringThe idea of powering a transform (as done for the Gray code in section 8.22) can be applied to the‘color’-transforms as exemplified for the blue-code:inline ulong blue_xcode(ulong a, ulong x){x &= (BITS_PER_LONG-1); // modulo BITS_PER_LONGulong s = BITS_PER_LONG >> 1;ulong m = ~0UL << s;while ( s ){if ( x & 1 ) a ^= ( (a&m) >> s );x >>= 1;s >>= 1;m ^= (m>>s);}return a;}The result is not the power of the blue-code which would be pretty boring as B B = id.
Instead thetransform (and the equivalents for Y , R and C, see [FXT: file auxbit/bitxtransforms.h]) are moreinteresting: All relations between the transforms are still valid, if the symbolic exponent is identicalwith all terms. For example, we had B B = id, now B x B x = id is true for all x (there are essentially BITS_PER_LONG different x). Similarly, C C = R now has to be C x C x = Rx . That is, we haveBITS_PER_LONG different versions of our four transforms that share their properties with the ‘simple’versions. Among them BITS_PER_LONG transforms B x and Y x that are involutions and C x and Rx thatare third roots of the identity: C x C x C x = Rx Rx Rx = id.While not powers of the simple versions, we still have B 0 = Y 0 = R0 = C 0 = id.
Further, let e be the‘exponent’ of all ones and Z be any of the transforms, then Z e = Z, Writing ‘+’ for the XOR operation,then Z x Z y = Z x+y and so Z x Z y = Z whenever x + y = e.CHAPTER 8. SOME BIT WIZARDRY197The building blocks of the transformsConsider the following transforms on two-bit words whereXOR)·¸·¸1 0aid2 v =0 1b·¸·¸0 1ar2 v =1 0b·¸·¸1 1aB2 v =0 1b·¸·¸1 0aY2 v =1 1b·¸·¸0 1aR2 v =1 1b·¸·¸1 1aC2 v =1 0baddition is over GF (2) (that is, addition is·=·=·=·=·=·=abba¸(8.38)¸a+bbaa+bba+ba+ba(8.39)¸(8.40)¸(8.41)¸(8.42)¸(8.43)It can easily be verified that for these the same relations hold as for id, r, B, Y , R, C.
In fact the‘color-transforms’ and (trivially) id are the transforms obtained by the repeated Kronecker-products ofthe matrices.The corresponding version of the bit-reversal is [FXT: xrevbin in auxbit/revbin.h]:inline ulong xrevbin(ulong a, ulong x){x &= (BITS_PER_LONG-1); // modulo BITS_PER_LONGulong s = BITS_PER_LONG >> 1;ulong m = ~0UL >> s;while ( s ){if ( x & 1 ) a = ( (a & m) << s ) ^ ( (a & (~m)) >> s );x >>= 1;s >>= 1;m ^= (m<<s);}return a;}Then, for example Rx = rx B x (cf. 8.12).Talking about the blue-code, the symbolic powering is equivalent to selecting individual levels of theReed-Muller transform (section 5.8).
Y , R and C are variants that transform.The symbolic-powering idea can be carried further. The values of the mask m in the blue-code for thesuccessive stages are (for 32-bit words)1111111111111111................11111111........11111111........1111....1111....1111....1111....11..11..11..11..11..11..11..11..1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.Considering the lowest bit of each block...............1.......................1...............1...........1.......1.......1.......1.....1...1...1...1...1...1...1...1..1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.it is obvious that one could use all bits of a word to select the blocks where the ‘atomic’ transform shouldbe applied.CHAPTER 8.
SOME BIT WIZARDRY8.24198CPU instructions often missedEssential• A bit-reverse instruction. Every general purpose CPU should have one. Considering that thenecessary circuitry is the most simple imaginable it is hard to understand CPU manufacturers tendto ‘forget’ it.• A parity bit for the complete machine word. The parity of a word is the number of bits modulotwo, not the complement of it.
Even better would be a instruction for the inverse Gray-code whichcan (among other things) serve as building block for fast transforms over GF (2).• A bit-count instruction. Clearly, this would also give the parity at bit zero.• Primitives for permutations of bits. The alpha processor has such. A instruction using the equivalentof the idea presented at the end of section 8.23 might be the most powerful approach.Nice to have• A fast conversion from integer to float and double (both directions).• Multiplication corresponding to XOR as addition. That is, a multiplication without carries, theone used for polynomials over GF (2).
See [FXT: bitpol mult in auxbit/bitpol.h].• A bit-zip and a bit-unzip instruction, see [FXT: file auxbit/bitzip.h].• A bit-gather and a bit-scatter instruction, see [FXT: file auxbit/bitgather.h]. This would includebit-zip and its inverse.Luxury items• The mentioned multiplication for polynomials over GF (2) with a modulus, see [FXT:bitpolmod mult in auxbit/bitpolmodmult.h].• Instructions for the multiplication of complex floating point numbers.• Primitives for the fast orthogonal transforms like the radix -two, -four and possibly -eight butterfliesfor the Walsh and Hartley transform.• Multiplication modulo the prime 264 − 232 + 1.Chapter 9Sorting and searchingTBD: chapter outlineTBD: counting sort, radix sort, merge sort9.1SortingThere are a few straight forward algorithms for sorting that scale with ∼ n2 (where n is the size of thearray to be sorted).Here we use selection sort whose idea is to find the minimum of the array, swap it with the first elementand repeat for all elements but the first:template <typename Type>void selection_sort(Type *f, ulong n){for (ulong i=0; i<n; ++i){Type v = f[i];ulong m = i; // position of minimumulong j = n;while ( --j > i ) // search (index of) minimum{if ( f[j]<v ){m = j;v = f[m];}}swap(f[i], f[m]);}}A verification routine is always handy:template <typename Type>int is_sorted(const Type *f, ulong n){if ( 0==n ) return 1;while ( --n ) // n-1 ...