Morgan - Numerical Methods (523161), page 10
Be careful; this problem can go undetected for a long time.

In multiprecision multiplication, the use of IMUL and IDIV is often impractical, because the operation treats the large numbers as polynomials, breaking them apart into smaller units, or coefficients. These instructions handle all numbers as signed, with n-1 significant bits plus a sign bit, where n is the size of the data type in bits. This inevitably produces an incorrect result, because the instructions can only handle word operands in the range -32,768 to 32,767 and byte operands in the range -128 to 127, with the MSB of each word or byte treated as a sign bit. Multiplying the numbers 1283H and 1234H byte by byte will result in one subproduct that is out of range and an incorrect product, because any of the submultiplies that involve 83H will incorrectly interpret it as a signed number.

A foolproof way to work with signed multiplies and divides, either single- or multiprecision, is to check the operands for a sign before the multiply or divide.
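The out-of-range subproduct can be reproduced with a short sketch. It is written in C rather than the book's 8086 assembly, and the function names (`as_signed_byte`, `mul16_unsigned_bytes`, `mul16_signed_bytes`) are hypothetical. It assembles the 16-bit product 1283H x 1234H from byte-sized subproducts twice: once reading each byte as unsigned, and once reading bit 7 as a sign bit the way a signed byte multiply would.

```c
#include <stdint.h>

/* Interpret the low byte of v as a signed byte: bit 7 is the sign bit. */
static int32_t as_signed_byte(uint32_t v)
{
    v &= 0xFFu;
    return (v & 0x80u) ? (int32_t)v - 256 : (int32_t)v;
}

/* 16 x 16 multiply built from byte subproducts, bytes read as unsigned. */
uint32_t mul16_unsigned_bytes(uint16_t a, uint16_t b)
{
    uint32_t a0 = a & 0xFFu, a1 = a >> 8;
    uint32_t b0 = b & 0xFFu, b1 = b >> 8;

    /* a*b = a0*b0 + (a0*b1 + a1*b0)*2^8 + a1*b1*2^16 */
    return a0 * b0 + ((a0 * b1 + a1 * b0) << 8) + ((a1 * b1) << 16);
}

/* Same assembly, but every byte is read as signed before multiplying. */
int32_t mul16_signed_bytes(uint16_t a, uint16_t b)
{
    int32_t a0 = as_signed_byte(a), a1 = as_signed_byte(a >> 8);
    int32_t b0 = as_signed_byte(b), b1 = as_signed_byte(b >> 8);

    return a0 * b0 + (a0 * b1 + a1 * b0) * 256 + a1 * b1 * 65536;
}
```

For 1283H and 1234H the unsigned assembly matches the true product, while the signed one comes out low by exactly 100H x 1234H, because 83H is read as 83H - 100H in every subproduct that involves it.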
After checking the signs, you handle the operation as unsigned by two's-complementing any negative operands. If necessary, the result can be two's-complemented at the end of the procedure. The algorithm is shown in pseudocode, and the code fragment is an example of how it might be implemented.

signed_operation: Algorithm

1. Declare and clear a byte variable, sign.
2. Check the sign of the first operand to see if it's negative.
   If not, go to step 3.
   If so, complement sign, then complement the operand.
3. Check the sign of the second operand to see if it's negative.
   If not, go to step 4.
   If so, complement sign, then complement the operand.
4. Perform the multiply or divide.
5. Check sign.
   If it's zero, you're done.
   If it's -1 (0ffH), two's-complement the result and go home.

signed_operation: Listing

    ; ******
    signed_operation proc operand0:dword, operand1:dword, result:word

            local   sign:byte

            mov     byte ptr sign, 0            ;step 1: clear the sign flag
            mov     ax, word ptr operand0[2]
            or      ax, ax                      ;if not sign, it is positive
            jns     check_second
            not     byte ptr sign               ;record a negative operand
            not     word ptr operand0[2]        ;two's complement of operand
            neg     word ptr operand0
            jc      check_second                ;low word nonzero: nothing carries up
            add     word ptr operand0[2], 1
    check_second:
            mov     ax, word ptr operand1[2]
            or      ax, ax
            jns     done_with_check
            not     byte ptr sign               ;two negatives restore sign to zero
            not     word ptr operand1[2]
            neg     word ptr operand1
            jc      done_with_check
            add     word ptr operand1[2], 1
    done_with_check:
            ;perform operation here
    on_the_way_out:
            mov     al, byte ptr sign
            or      al, al
            jns     all_done                    ;sign is zero: result stays positive
            mov     si, word ptr result
            not     word ptr [si][6]            ;two's complement of 64-bit result
            not     word ptr [si][4]
            not     word ptr [si][2]
            neg     word ptr [si][0]
            jc      all_done
            add     word ptr [si][2], 1
            adc     word ptr [si][4], 0
            adc     word ptr [si][6], 0
    all_done:
            ret
    signed_operation endp

Adding this technique to one of those described below will make it a signed process.

Binary Multiplication

Multiplication in a binary system may generally be represented as the multiplication of polynomials, with the algorithm handling each bit, byte, or word as a coefficient of the power of the bit's position or of the least significant position within that word or byte:

$$
\begin{array}{rl}
 & a_n 2^n + \dots + a_1 2^1 + a_0 2^0 \\
\times & b_n 2^n + \dots + b_1 2^1 + b_0 2^0 \\
\hline
 & b_n (a)\, 2^n + \dots + b_1 (a)\, 2^1 + b_0 (a)\, 2^0
\end{array}
$$

where n = the bit position. It is the same for bytes and words except that n is then the power of the least significant bit within the word or byte:

$$
\text{12345678H} = \text{1234H} \times 16^4 + \text{5678H} \times 16^0 = \text{1234H} \times 2^{16} + \text{5678H} \times 2^0
$$

In the following example involving the multiplication of two 4-bit quantities, you may recognize the pencil-and-paper method you learned in school. Each row of subproducts is shifted one position to the left of the row above it.

Step 1:

$$
\begin{array}{ccccccc}
\multicolumn{7}{r}{a_3 2^3 + a_2 2^2 + a_1 2^1 + a_0 2^0} \\
\multicolumn{7}{r}{\times\; b_3 2^3 + b_2 2^2 + b_1 2^1 + b_0 2^0} \\
\hline
 & & & b_0 a_3 & b_0 a_2 & b_0 a_1 & b_0 a_0
\end{array}
$$

Step 2:

$$
\begin{array}{ccccccc}
\multicolumn{7}{r}{a_3 2^3 + a_2 2^2 + a_1 2^1 + a_0 2^0} \\
\multicolumn{7}{r}{\times\; b_3 2^3 + b_2 2^2 + b_1 2^1 + b_0 2^0} \\
\hline
 & & & b_0 a_3 & b_0 a_2 & b_0 a_1 & b_0 a_0 \\
 & & b_1 a_3 & b_1 a_2 & b_1 a_1 & b_1 a_0 &
\end{array}
$$

Step 3:

$$
\begin{array}{ccccccc}
\multicolumn{7}{r}{a_3 2^3 + a_2 2^2 + a_1 2^1 + a_0 2^0} \\
\multicolumn{7}{r}{\times\; b_3 2^3 + b_2 2^2 + b_1 2^1 + b_0 2^0} \\
\hline
 & & & b_0 a_3 & b_0 a_2 & b_0 a_1 & b_0 a_0 \\
 & & b_1 a_3 & b_1 a_2 & b_1 a_1 & b_1 a_0 & \\
 & b_2 a_3 & b_2 a_2 & b_2 a_1 & b_2 a_0 & &
\end{array}
$$

Step 4:

$$
\begin{array}{ccccccc}
\multicolumn{7}{r}{a_3 2^3 + a_2 2^2 + a_1 2^1 + a_0 2^0} \\
\multicolumn{7}{r}{\times\; b_3 2^3 + b_2 2^2 + b_1 2^1 + b_0 2^0} \\
\hline
 & & & b_0 a_3 & b_0 a_2 & b_0 a_1 & b_0 a_0 \\
 & & b_1 a_3 & b_1 a_2 & b_1 a_1 & b_1 a_0 & \\
 & b_2 a_3 & b_2 a_2 & b_2 a_1 & b_2 a_0 & & \\
b_3 a_3 & b_3 a_2 & b_3 a_1 & b_3 a_0 & & &
\end{array}
$$

Summing each column gives the product:

$$
b_3 a_3 + (b_2 a_3 + b_3 a_2) + \dots + (b_0 a_1 + b_1 a_0) + b_0 a_0
$$

An example of this in a four-bit multiply could be shown as:

          1100   = 12D
        * 1101   = 13D
        ------
          1100
         0000
        1100
       1100
      --------
      10011100   = 156D

This is also how the basic shift-and-add algorithm for microprocessors is written.
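The word-as-coefficient view (12345678H = 1234H x 2^16 + 5678H x 2^0) can be sketched in C rather than assembly. This is a minimal illustration under stated assumptions, not a routine from the book: the function name `mul32_by_words` is hypothetical, and each 16 x 16 subproduct stands in for one unsigned word multiply.

```c
#include <stdint.h>

/* Multiply two 32-bit values by treating each as a two-term polynomial
   in 2^16: a = a1*2^16 + a0, b = b1*2^16 + b0. */
uint64_t mul32_by_words(uint32_t a, uint32_t b)
{
    uint32_t a0 = a & 0xFFFFu, a1 = a >> 16;
    uint32_t b0 = b & 0xFFFFu, b1 = b >> 16;

    /* Four unsigned 16 x 16 subproducts; each fits in 32 bits. */
    uint32_t p00 = a0 * b0;   /* coefficient of 2^0  */
    uint32_t p01 = a0 * b1;   /* coefficient of 2^16 */
    uint32_t p10 = a1 * b0;   /* coefficient of 2^16 */
    uint32_t p11 = a1 * b1;   /* coefficient of 2^32 */

    /* Weight each subproduct by its position and sum. */
    return ((uint64_t)p11 << 32)
         + ((uint64_t)p01 << 16)
         + ((uint64_t)p10 << 16)
         + p00;
}
```

The 4-bit pencil-and-paper layout is the same computation with single bits as coefficients instead of 16-bit words.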
The shift-and-add procedure is taken directly from positional number theory, which simply states that the value of a bit or integer within a number depends on its position. Thus, each pass through the algorithm shifts both the multiplier and the multiplicand through their corresponding positions, adding the multiplicand to the result if the multiplier has a one in the 0th position.
(The right shift is logical; that is, a zero is shifted into the MSB.) As with the pencil-and-paper method, the multiplicand is shifted left and the multiplier is shifted right.

To demonstrate, let's multiply two numbers, 1100 (12D) and 1101 (13D). We must first designate one as the multiplicand and the other as the multiplier and set up registers to hold them.
We also need a loop counter to indicate when we have passed through all the bit positions of the multiplier. We can call this variable cntr (counter), and the variable that holds the product prdct. We'll call 1100 (the multiplicand) mltpnd and 1101 (the multiplier) mltpr. In the following example, the values in parentheses are all decimal:

    0.  mltpnd = 1100 (12)            mltpr = 1101 (13)
        cntr   = 100 (4)              prdct = 0

Then, with each pass through the algorithm, the results are:

    1.  mltpnd = 11000 (24)           mltpr = 0110 (6)
        cntr   = 011 (3)              prdct = 1100 (12)

    2.  mltpnd = 110000 (48)          mltpr = 0011 (3)
        cntr   = 010 (2)              prdct = 1100 (12)

    3.  mltpnd = 1100000 (96)         mltpr = 0001 (1)
        cntr   = 001 (1)              prdct = 111100 (60)

    4.  mltpnd = 11000000 (192)       mltpr = 0000 (0)
        cntr   = 000 (0)              prdct = 10011100 (156)

The following routine is based on this algorithm but expects 32-bit operands.

cmul: Algorithm

1. Allocate enough space to store the multiplicand and allow for 32 left shifts, set the variable numbits to 32, and see that the registers where the product is formed contain zeros. (Be certain to provide enough storage for the output, at most Product_bits = Multiplicand_bits + Multiplier_bits. In the 4-bit example above, 4 Multiplicand_bits + 4 Multiplier_bits = 8 Product_bits; for this routine, 32 + 32 = 64.)
2. Shift the multiplier right one position and check for a carry.
   If there is not a carry, go to step 3.
   If there is, add the current value in mltpcnd to the product registers.
3. Shift mltpcnd left one position and decrement the counter variable numbits. Test numbits for zero.
   If it's zero, go to step 4.
   If not, return to step 2.
4. Write the product registers to product and go home.

cmul: Listing

    ; ******
    ; classic multiply

    cmul    proc uses bx cx dx si di, multiplicand:dword, multiplier:dword, product:word

            local   numbits:byte, mltpcnd:qword

            pushf
            cld
            sub     ax, ax
            lea     si, word ptr multiplicand
            lea     di, word ptr mltpcnd
            mov     cx, 2
    rep     movsw                               ;copy multiplicand to low dword of mltpcnd
            stosw                               ;clear upper words
            stosw
            mov     bx, ax                      ;clear register to be used to form product
            mov     cx, ax
            mov     dx, ax
            mov     byte ptr numbits, 32
    test_multiplier:
            shr     word ptr multiplier[2], 1   ;shift LSB of multiplier into carry
            rcr     word ptr multiplier, 1
            jnc     decrement_counter
            add     ax, word ptr mltpcnd        ;bit was one: add shifted multiplicand
            adc     bx, word ptr mltpcnd[2]
            adc     cx, word ptr mltpcnd[4]
            adc     dx, word ptr mltpcnd[6]
    decrement_counter:
            shl     word ptr mltpcnd, 1         ;multiplicand moves to next bit position
            rcl     word ptr mltpcnd[2], 1
            rcl     word ptr mltpcnd[4], 1
            rcl     word ptr mltpcnd[6], 1
            dec     byte ptr numbits
            jnz     test_multiplier
    exit:
            mov     di, word ptr product        ;write 64-bit product dx:cx:bx:ax
            mov     word ptr [di], ax
            mov     word ptr [di][2], bx
            mov     word ptr [di][4], cx
            mov     word ptr [di][6], dx
            popf
            ret
    cmul    endp

One possible variation of this example is to employ the "early-out" method.
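The counted loop of the cmul algorithm and its early-out variation can be compared side by side in a short C sketch. This is an illustration rather than a translation of the listing or of anything in FXMATH.ASM: the function names `cmul_counted` and `cmul_early_out` are hypothetical, and a 64-bit integer stands in for the four-word mltpcnd and the product registers.

```c
#include <stdint.h>

/* Counted version: always makes 32 passes, like the numbits loop. */
uint64_t cmul_counted(uint32_t multiplicand, uint32_t multiplier)
{
    uint64_t mltpcnd = multiplicand;   /* widened to allow 32 left shifts */
    uint64_t product = 0;

    for (int numbits = 32; numbits != 0; numbits--) {
        if (multiplier & 1u)           /* bit that a right shift would carry out */
            product += mltpcnd;
        multiplier >>= 1;              /* logical shift: zero enters the MSB */
        mltpcnd <<= 1;
    }
    return product;
}

/* Early-out version: no counter; stop once the multiplier is zero. */
uint64_t cmul_early_out(uint32_t multiplicand, uint32_t multiplier)
{
    uint64_t mltpcnd = multiplicand;
    uint64_t product = 0;

    while (multiplier != 0) {
        if (multiplier & 1u)
            product += mltpcnd;
        multiplier >>= 1;
        mltpcnd <<= 1;
    }
    return product;
}
```

With the operands from the worked example, multiplicand 12 and multiplier 13, the early-out loop finishes after four passes, matching the four steps in the trace, while the counted loop runs all 32.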
The early-out technique doesn't use a counter to track the multiply but checks the multiplier for zero each time through the loop. If it's zero, you're done. For examples of early-out termination, see the routines in the section "Skipping Ones and Zeros" and others in FXMATH.ASM included on the accompanying disk.

A Faster Shift and Add

The same operation can be performed faster and in a smaller space. For one thing, the shifts being done on the multiplicand and multiplier result in unnecessary double-precision additions. Eliminating any unnecessary additions saves time and space. Arranging any shifts so that they are all in the same direction means fewer registers or memory variables.

As you may recall, positional notation lends itself quite nicely to polynomial interpretation.
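Using a binary byte as an example, let's say we have two numbers, a and b:

$$
a_3 2^3 + a_2 2^2 + a_1 2^1 + a_0 2^0 = a
$$

and

$$
b_3 2^3 + b_2 2^2 + b_1 2^1 + b_0 2^0 = b
$$

When we multiply them, we get:

$$
\begin{aligned}
a \times b = {} & b_3 \,(a_3 2^3 + a_2 2^2 + a_1 2^1 + a_0 2^0)\, 2^3 \\
 & + b_2 \,(a_3 2^3 + a_2 2^2 + a_1 2^1 + a_0 2^0)\, 2^2 \\
 & + b_1 \,(a_3 2^3 + a_2 2^2 + a_1 2^1 + a_0 2^0)\, 2^1 \\
 & + b_0 \,(a_3 2^3 + a_2 2^2 + a_1 2^1 + a_0 2^0)\, 2^0
\end{aligned}
$$

If we assume an initial division by $2^4$, each term becomes a fraction of a, and the factor of $2^4$ (10000B) is restored at the end:

$$
a \times b = \left[\, b_3 (a \cdot 2^{-1}) + b_2 (a \cdot 2^{-2}) + b_1 (a \cdot 2^{-3}) + b_0 (a \cdot 2^{-4}) \,\right] \times 2^4
$$

Now we can arrive at the same result as in the previous shift-and-add operation using only right shifts.

In cmul2, we'll be using the multiplicand as the product as well.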
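One way to picture a multiply that uses only right shifts is the C sketch below. It is not the book's cmul2: the function name `mul16_right_shift` and the parameter names are hypothetical, and it assumes 16-bit operands with a 32-bit product. The operand whose bits are examined shares its storage with the low half of the product, so each pass needs only a single-precision add into the upper half followed by one right shift of the combined pair.

```c
#include <stdint.h>

/* Multiply using right shifts only.
   'added' is added into the upper half of the product;
   'shifted' is examined one bit at a time and shifted out. */
uint32_t mul16_right_shift(uint16_t added, uint16_t shifted)
{
    uint32_t high = 0;        /* upper half; bit 16 holds the carry of the add */
    uint16_t low = shifted;   /* bits still to examine, then low product bits */

    for (int i = 0; i < 16; i++) {
        if (low & 1u)
            high += added;    /* single-precision add into the upper half */

        /* Shift the combined high:low pair right by one position. */
        low = (uint16_t)((low >> 1) | ((high & 1u) << 15));
        high >>= 1;
    }
    return (high << 16) | low;
}
```

Each right shift of the pair applies one of the factors of 2^-1 in the bracketed expression, and reading the pair as a double-width integer at the end supplies the final scaling. For the earlier example, `mul16_right_shift(12, 13)` returns 156.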