Arndt - Algorithms for Programmers (523138), страница 28
Текст из файла (страница 28)
= highest_bit................1111111111111111 = highest_bit_01edge11111111111111111............... = highest_bit_10edge15 = highest_bit_idx................................ = low_zeros.............................111 = low_bits...............................1 = lowest_bit...............................1 = lowest_bit_01edge11111111111111111111111111111111 = lowest_bit_10edge0 = lowest_bit_idx.............................111 = lowest_block................1111....1111.11. = delete_lowest_bit............................1... = lowest_zero................1111....11111111 = set_lowest_zero................................ = high_bits1111111111111111................
= high_zeros1............................... = highest_zero1...............1111....1111.111 = set_highest_zero---------------------------------------------------------1111111111111111....1111....1... = 0xffff0f08 == word1............................... = highest_bit11111111111111111111111111111111 = highest_bit_01edge1............................... = highest_bit_10edge31 = highest_bit_idx.............................111 = low_zeros................................ = low_bits............................1... = lowest_bit............................1111 = lowest_bit_01edge11111111111111111111111111111...
= lowest_bit_10edge3 = lowest_bit_idx............................1... = lowest_block1111111111111111....1111........ = delete_lowest_bit...............................1 = lowest_zero162CHAPTER 8. SOME BIT WIZARDRY1631111111111111111....1111....1..1 = set_lowest_zero1111111111111111................ = high_bits................................ = high_zeros................1............... = highest_zero11111111111111111...1111....1... = set_highest_zero----------------------------------------------------------8.4Functions related to the base-2 logarithmThe following functions are taken from [FXT: file auxbit/bit2pow.h].The function ld that shall return blog2 (x)c can be implemented using the obvious algorithm:static inline ulong ld(ulong x)// returns k so that 2^k <= x < 2^(k+1)// if x==0 then 0 is returned (!){ulong k = 0;while ( x>>=1 ) { ++k; }return k;}And then ld is the same as highest_bit_idx, so one can usestatic inline ulong ld(ulong x){return highest_bit_idx(x);}Closely related are the functionsstatic inline int is_pow_of_2(ulong x)// return 1 if x == 0(!) or x == 2**k{return ((x & -x) == x);}andstatic inline int one_bit_q(ulong x)// return 1 iff x \in {1,2,4,8,16,...}{ulong m = x-1;return (((x^m)>>1) == m);}Occasionally useful in FFT based computations (where the length of the available FFTs is often restrictedto powers of two) arestatic inline ulong next_pow_of_2(ulong x)// return x if x=2**k// else return 2**ceil(log_2(x)){ulong n = 1UL<<ld(x); // n<=xif ( n==x ) return x;elsereturn n<<1;}andstatic inline ulong next_exp_of_2(ulong x)// return k if x=2**k// else return k+1{ulong ldx = ld(x);ulong n = 1UL<<ldx; // n<=xif ( n==x ) return ldx;elsereturn ldx+1;}CHAPTER 8.
SOME BIT WIZARDRY8.5164Counting the bits in a wordThe following functions are from [FXT: file auxbit/bitcount.h].If your CPU does not have a bit count instruction (sometimes called ‘population count’) then you mightuse an algorithm of the following typestatic inline ulong bit_count(ulong x)// return number of bits set{#if BITS_PER_LONG == 32x = (0x55555555 & x) + (0x55555555x = (0x33333333 & x) + (0x33333333x = (0x0f0f0f0f & x) + (0x0f0f0f0fx = (0x00ff00ff & x) + (0x00ff00ffx = (0x0000ffff & x) + (0x0000ffffreturn x;}&&&&&(x>> 1));(x>> 2));(x>> 4));(x>> 8));(x>>16));//////////0-2 in 2 bits0-4 in 4 bits0-8 in 8 bits0-16 in 16 bits0-31 in 32 bitsx = ((x>>1) & 0x55555555) + (x & 0x55555555);x = ((x>>2) & 0x33333333) + (x & 0x33333333);x = ((x>>4) + x) & 0x0f0f0f0f;x += x>> 8;x += x>>16;return x & 0xff;//////////0-2 in 2 bits0-4 in 4 bits0-8 in 4 bits0-16 in 8 bits0-32 in 8 bitswhich can be improved to eitherorx -= (x>>1) & 0x55555555;x = ((x>>2) & 0x33333333) + (x & 0x33333333);x = ((x>>4) + x) & 0x0f0f0f0f;x *= 0x01010101;return x>>24;(From [54].) Which one is better mainly depends on the speed of integer multiplication.For 64 bit CPUs the masks have to be adapted and one more step must be added (example correspondingto the second variant above):x = ((x>>1) & 0x5555555555555555) + (x & 0x5555555555555555);x = ((x>>2) & 0x3333333333333333) + (x & 0x3333333333333333);x = ((x>>4) + x) & 0x0f0f0f0f0f0f0f0f;x += x>> 8;x += x>>16;x += x>>32;return x & 0xff;////////////0-2 in 2 bits0-4 in 4 bits0-8 in 4 bits0-16 in 8 bits0-32 in 8 bits0-64 in 8 bitsWhen the word is known to have only a few bits set the following sparse count variant may be advantageousstatic inline ulong bit_count_sparse(ulong x)// return number of bits set// the loop will execute once for each bit of x set{if ( 0==x ) return 0;ulong n = 0;do { ++n; } while ( x &= (x-1) );return n;}More esoteric counting algorithms arestatic inline ulong bit_block_count(ulong x)// return number of bit blocks// e.g.:// ..1..11111...111.
-> 3// ...1..11111...111 -> 3// ......1.....1.1.. -> 3// .........111.1111 -> 2{return bit_count( (x^(x>>1)) ) / 2 + (x & 1);}CHAPTER 8. SOME BIT WIZARDRYstatic inline ulong bit_block_ge2_count(ulong x)// return number of bit blocks with at least 2 bits// e.g.:// ..1..11111...111. -> 2// ...1..11111...111 -> 2// ......1.....1.1.. -> 0// .........111.1111 -> 2{return bit_block_count( x & ( (x<<1) & (x>>1) ) );}The slightly weird algorithmstatic inline ulong bit_count_01(ulong x)// return number of bits in a word// for words of the special form 00...0001...11{ulong ct = 0;ulong a;#if BITS_PER_LONG == 64a = (x & (1UL<<32)) >> (32-5); // test bit 32x >>= a; ct += a;#endifa = (x & (1<<16)) >> (16-4); // test bit 16x >>= a; ct += a;a = (x & (1<<8)) >> (8-3); // test bit 8x >>= a; ct += a;a = (x & (1<<4)) >> (4-2); // test bit 4x >>= a; ct += a;a = (x & (1<<2)) >> (2-1); // test bit 2x >>= a; ct += a;a = (x & (1<<1)) >> (1-0); // test bit 1x >>= a; ct += a;ct += x & 1; // test bit 0return ct;}avoids all branches and may prove to be useful on a planet with pink air.8.6Swapping bits/blocks of a wordFunctions in this section are from [FXT: file auxbit/bitswap.h]Pairs of adjacent bits may be swapped viastatic inline ulong bit_swap_1(ulong x)// return x with neighbour bits swapped{#if BITS_PER_LONG == 32ulong m = 0x55555555;#else#if BITS_PER_LONG == 64ulong m = 0x5555555555555555;#endif#endifreturn ((x & m) << 1) | ((x & (~m)) >> 1);}(the 64 bit branch is omitted in the following examples).Groups of 2 bits are swapped bystatic inline ulong bit_swap_2(ulong x)// return x with groups of 2 bits swapped{ulong m = 0x33333333;return ((x & m) << 2) | ((x & (~m)) >> 2);}165CHAPTER 8.
SOME BIT WIZARDRY166Equivalently,static inline ulong bit_swap_4(ulong x)// return x with groups of 4 bits swapped{ulong m = 0x0f0f0f0f;return ((x & m) << 4) | ((x & (~m)) >> 4);}andstatic inline ulong bit_swap_8(ulong x)// return x with groups of 8 bits swapped{ulong m = 0x00ff00ff;return ((x & m) << 8) | ((x & (~m)) >> 8);}When swapping half-words (here for32bit architectures)static inline ulong bit_swap_16(ulong x)// return x with groups of 16 bits swapped{ulong m = 0x0000ffff;return ((x & m) << 16) | ((x & (m<<16)) >> 16);}gcc is clever enough to recognize that the whole thing is equivalent to a (left or right) word rotation andindeed emits just a single rotate instruction.The masks used in the above examples (and in many similar algorithms) can be replaced by arithmeticexpressions that render the preprocessor statements unnecessary. However, the code does not necessarilygain readability by doing so.Swapping two selected bits of a word goes likestatic inline void bit_swap(ulong &x, ulong k1, ulong k2)// swap bits k1 and k2// ok even if k1 == k2{ulong b1 = x & (1UL<<k1);ulong b2 = x & (1UL<<k2);x ^= (b1 ^ b2);x ^= (b1>>k1)<<k2;x ^= (b2>>k2)<<k1;}8.7Reversing the bits of a word.
. . when there is no corresponding CPU instruction can be achieved via the functions just described, cf.[FXT: file auxbit/revbin.h]Shown is a 32 bit version of revbin:static inline ulong revbin(ulong x)// return x with bitsequence reversed{x = bit_swap_1(x);x = bit_swap_2(x);x = bit_swap_4(x);#if defined BITS_USE_ASMx = asm_bswap(x);#elsex = bit_swap_8(x);x = bit_swap_16(x);#endifreturn x;}CHAPTER 8. SOME BIT WIZARDRY167Here, the last two steps that correspond to a byte-reverse are replaced by the CPU instruction if available.For 64 bit machines a x = bit_swap_32(x); would have to be inserted at the end (and possibly a bswapbranch entered that can replace the last three bit_swaps).One can generate the masks used in the process:static inline ulong revbin(ulong x){ulong s = BITS_PER_LONG >> 1;ulong m = ~0UL >> s;while ( s ){x = ( (x & m) << s ) ^ ( (x & (~m)) >> s );s >>= 1;m ^= (m<<s);}return x;}Note that the above function is pretty expensive and it is not even clear whether it beats the obviousalgorithm,static inline ulong revbin(ulong x){ulong r = 0, ldn = BITS_PER_LONG;while ( ldn-- != 0 ){r <<= 1;r += (x&1);x >>= 1;}return r;}especially on 32 bit machines.Therefore the functionstatic inline ulong revbin(ulong x, ulong ldn)// return word with the last ldn bits//(i.e.
bit_0 ... bit_{ldn-1})//of x reversed// the other bits are set to 0{return revbin(x) >> (BITS_PER_LONG-ldn);}should only be used when ldn is not too small, else replaced by the trivial algorithm.For practical computations the bit-reversed words usually have to be generated in the (reversed) countingorder and there is a significantly cheaper way to do the update:static inline ulong revbin_update(ulong r, ulong ldn)// let r = revbin(x, ld(n)) at entry// then return revbin(x+1, ld(n)){ldn >>= 1;while ( !((r^=ldn)&ldn) ) ldn >>= 1;return r;}8.8Generating bit combinationsThe following functions are taken from [FXT: file auxbit/bitcombcolex.h] and [FXT: fileauxbit/bitcomblex.h]The ideas above can be used for the generation of bit combinations in colex order:static inline ulong next_colex_comb(ulong x)CHAPTER 8.