Morgan - Numerical Methods (523161), страница 24
Текст из файла (страница 24)
This accounts for any overflows in the result.8.Replace the exponent, set the sign, and exit.9.Infinity exit.10. Zero exit.flmul: Listing; *****flmul proclocalreppushfstdsubmovleamovstoswc uses bx cx dx si di,fpO:qword, fpl:qword, rptr:wordresult[8]:word, sign:byte, exponent:byteax,axbyte ptr sign,aldi,word ptr result[14]cx,8;clear sign variable;and result variable;lealeamovshlandjnejmpis_a_inf:cmpjnejmpis_b_zero:movshland148si,word ptr fp0bx,word ptr fp1ax,word ptr [si][4]ax,1ax,0ff00his_a_infmake_zero;name a pointer to each fp;check for zero;zero exponentax, 0ffOOhis b zeroreturn_infinite;multiplicand is infinitedx,word ptr [bx][4]dx,1dx,0ff00h;check for zeroFLOATING-POINTjnzjmpis_b_inf:cmpjnejmp;get_exp:subaddmov;movorjnsnota_plus:movorjnsnotARITHMETICis b infmake_zero;zero exponentdx,0ff00hget_expreturn infinite;multiplicand is infiniteah, 77hah, dhbyte ptr exponent,ah;add exponents;save exponentdx,word ptr [si][4]dx, dxa_plusbyte ptr sign;set sign variable according;to msb of floatdx,word ptr [bx][4];set sign according to msb;of floatdx, dxrestore missing bitbyte ptr signrestore missing bit:andword ptr fp0[4], 7fhorword ptr fp0[4], 80handword ptr fp1[4], 7fhorword ptr fp1[4], 80h;remove the sign and exponent;and restore the hidden bitinvokemu164a, fp0, fpl, addr result;multiply with fixed point;routinemovmovmovdx, word ptr result [10]bx, word, ptr result[8]ax, word, ptr results[6];check for zeros on returnsubcmpjnecmpjnecmpjnejmpcx, cxax,cxnot_zerobx,cxnot_zerodx,cxnot_zerofix_sign;exit with a zero149NUMERICAL METHODSnot_zero:movcx,64dx,0hcmprotate_result_leftjedh,00hcmpjnerotate_result_righttestd1,80hrotate_result_leftjeshort_done_rotatejmprotate result right:shrdx,lrcr bx,lrcrax,1testdx,0ff00hdone_rotatejeincbyte ptr exponentlooprotate_result_rightrotate_result_left:shlword ptr result[2],1rclword ptr result[4],1rclax,1rcl bx,lrcldx,ltestdx,80hjnedone rotatedecbyte ptr exponentloopdone_rotate:andormovorjeorfix_sign:movmovmovmov150;should never go to zero;realign float;decrement exponent with;each shift;decrement exponent with;each shiftrotate result leftdx,7fhshl dx, 1dh, byte ptr exponentshr dx, 1cl,byte ptr sign;clear sign bit;insert exponent;set sign of float based on;sign flagcl,clfix-signdx,8000hdi,word ptr rptrword ptr [di], axword ptr [di][2],bxword ptr [di][4],dx;write to the outputFLOATING-POINT ARITHMETICsubmovfp_mulex:popfret;return infinite:submovnotmovandjmpmake_zero:xorfinish_error:movaddmovstosrepjmpflmul endpax,axword ptr [di][6],axax, axbx, axaxdx, axdx, 0f80hshort fix_sign;infinityax,axdi,word ptr rptrdi,6cx, 4word ptr [di]short fp_mulexThe multiplication in this routine was performed by the fixed-point routinemul64a.
This is a specially-written form of mul64 which appears in FXMATH.ASMon the included disk. It takes as operands, 5-byte integers, the size of the mantissaplus extended bits in this format, and returns a 10-byte result. Knowing the size ofthe operands, means the routine can be sized exactly for that result, making it fasterand smaller.mul64a: Algorithm1.Use DI to hold the address of the result, a quadword.2.Move the most significant word of the multiplicand into AX and multiplyby the most significant word of the multiplier.
The product of thismultiplication is written to the most significant word result.3.The most significant word of the multiplicand is returned to AX andmultiplied by the second most significant word of the multiplier. Theleast significant word of the product is MOVed to the second mostsignificant word of result, the most significant word of the product is151NUMERICAL METHODSADDed to the most significant word of result.4.The most significant word of the multiplicand is returned to AX andmultiplied by the least significant word of the multiplier. The leastsignificant word of this product is MOVed to the third most significantword of result, the most significant word of the product is ADDed to thesecond most significant word of result, any carries are propagatedthrough with an ADC instruction.5.The second most significant word of the multiplicand is MOVed to AX andmultiplied by the most significant word of the multiplier.
The lower wordof the product is ADDed to the second most significant word of resultand the upper word is added-with-carry (ADC) to the second mostsignificant word of result.6.The second most significant word of the multiplicand is again MOVed toAX and multiplied by the secondmost significant word of the multiplier.The lower word of the product is ADDed to the third most significant wordof result and the upper word is added-with-carry (ADC) to the secondmostsignificant word of result with any carries propagated to the MSW withan ADC.7.The second most significant word of the multiplicand is again MOVed toAX and multiplied by the least significant word of the multiplier. Thelower word of the product is MOVed to the fourth most significant wordof result and the upper word is added-with-carry (AX) to the thirdmostsignificant word of result with any carries propagated through to theMSW with an ADC.8. The least significant word of the multiplicand is MOVed into AX andmultiplied by the MSW of the multiplier.
The least significant word ofthis product is ADDed to the third most significant word of result, theMSW of the product is ADCed to the second most significant word of result,and any carry is propagated into the most significant word of result withan ADC.mul64a: Listing; *****;* mu164a - Multiplies two unsigned 5-byte integers. The;* procedure allows for a product of twice the length of the multipliers,;* thus preventing overflows.mu164a proc uses ax dx,multiplicand:qword, multiplier:qword, result:wordmovsub152di,word ptr resultcx, cxFLOATING-POINT ARITHMETICmovmulmovax, word ptr multiplicand[4]word ptr multiplier [4]word ptr [di][8], ax;multiply multiplicand high;word by multiplier high wordmovmulax, word ptr multiplicand[4]word ptr multiplier [2];multiply multiplicand high;word by second MSW;of multipliermovaddword ptr [di][6], axword ptr [di][8], dxmovax, word ptr multiplicand[4]mulmovaddadcwordwordwordwordmovmuladdadcax, wordword ptrword ptrword ptrptr multiplicand[2]multiplier [4][di][6], ax[di][8], dx;multiply second MSW;of multiplicand by MSW;of multipliermovmuladdadcadcax, wordword ptrword ptrword ptrword ptrptr multiplicand[2]multiplier[2][di][4], ax[di][6], dx[di][8], cx;multiply second MSW of;multiplicand by second MSW;of multipliermovmulmovax, word ptr multiplicand[2]word ptr multiplier[0]word ptr [di][2], axaddadcadcword ptr [di][4], dxword ptr [di][6], cxword ptr [di][8], cxmovmuladdadcadcax, wordword ptrword ptrword ptrword ptrptr multiplicand[0]multiplier[4][di][4], ax[di][6], dx[di] [8], cx;multiply multiplicand low;word by MSW of multipliermovax, word ptr multiplicand[0];multiply multiplicand lowptrptrptrptrmultiplier [0][di] [4], ax[di][6], dx[di][8], cx;multiply multiplicand high;word by third MSW;of multiplier;propagate carry;add any remnant carry;multiply second MSW of;multiplicand by least;significant word of;multiplier;add any remnant carry;add any remnant carry153NUMERICAL METHODSmulword ptr multiplier[2]addadcadcadcwordwordwordwordmovmulmovaddadcadcadcax, wordword ptrword ptrword ptrword ptrword ptrword ptrptrptrptrptr[di][2],[di][4],[di][6],[di][8],axdxcxcxptr multiplicand[0]multiplier[0][di][0], ax[di][2], dx[di][4], cx[di][61, cx[di][8], cx;word by second MSW of;multiplier;add any remnant carry;add any remnant carry;multiply multiplicand low;word by multiplier low word;add any remnant carry;add any remnant carry;add any remnant carryretmu164a endpFLDIVThe divide is similar to the multiply except that the exponents are subtractedinstead of added and the alignment is adjusted just before the fixed-point divide.
Thisadjustment prevents an overflow in the divide that could cause the most significantword to contain a one. If we divide by two and increment the exponent, div64 returnsa quotient that is properly aligned for the renormalization process that follows.The division could have been left as it was and the renormalization changed, butsince it made little difference in code size or speed, it was left.
This extra divisiondoes not change the result.fldiv: Algorithm1.Check the operands for zero and infinity.If one is found to be infinite, exit through step 11.If one is found to be zero, exit through step 12.2.Test the divisor and dividend to see whether they are equal. If they are,exit now with a floating-point 1.0 as the result.3.Get the exponents, find the difference and subtract 77H (119D). This isthe approximate exponent of the result.154FLOATING-POINT ARITHMETIC4.Check the signs of the operands and set the sign variable accordingly.5.Restore the hidden bit.6.Check the dividend to see if the most significant word is less than thedivisor to align the quotient. If it's greater, divide it by two andincrement the difference between the exponents by one.7.Use div64 to perform the division.8.Check for a zero result upon return and exit with a floating-point 0.0if so.9.
Renormalize the result.10. Insert the exponent and sign and exit.11. Infinite exit.12. Zero exit.fldiv: Listing; *****fldivprocCuses bx cx dx si di,fp0:gword, fpl:qword, rptr:wordlocalqtnt:qword, sign:byte, exponent:byte, rmndr:gwordpushfstdxorax,axmovbyte ptr sign, allealeasi,word ptr fp0bx,word ptr fplmovshlandjnejmpax,word ptr [si][4]ax,1ax,0ff00hchk_breturn infinitemovshlandjnedx,word ptr [bx][4]a,1dx,0ff00hb_notz;begin error and situation;checking;name a pointer to each fp;check for zero:infinitychk_b:155NUMERICAL METHODSjmpdivide-by-zero;infinity, divide by zero;is undefinedb_notz:cmpjnejmpcheck_identity:movaddaddmovcmpswrepejnemovmovmovmovmovmovmovsubmovjmpnot same:lealeadx,0ff00hcheck_identitymake_zerodi,bxdi,4si,4cx,3not sameax,word ptr dgt[8]bx,word ptr dgt[10]dx,word ptr dgt[12]di,word ptr rptrword ptr [di],axword ptr [di][2],bxword ptr [di][4],dxax,axword ptr [di][6],axfldivexsi,word ptr fp0bx,word ptr fp1subaddah,dhah,77hmovbyte ptr exponent,ahmovorjnsnotdx, word ptr [si][4]h, dxa_plusbyte ptr signmovorjnsnotdx,word ptr [bx][4]dx, dxrestore_missing_bitbyte ptr sign;divisor is infinite;will decrement selves;these guys are the same;return a one;get exponents;reset pointerssubtract exponents;subtract bias minus two;digits;save exponent;check signa_plus:restore_missing_bit:156;line up operands for divisionFLOATING-POINT ARITHMETICandormovandorcmpjaincshlrclrclstore_dvsr:movword ptr fp0[4], 7fhword ptr fp0[4], 80hdx, word ptr fp1[4]dx, 7fhdx, 80hdx,word ptr fp0[4]store dvsrbyte ptr exponentword ptr fp1[0], 1word ptr fp1[2], 1dx, 1;see if divisor is greater than;dividend then divide by 2word ptr fp1[41, dxdivide:invokedivmul, fp0, fpl, addr fp0movdx,word ptr fp0[4]movbx,word ptr fp0[2]movax,word ptr fp0[0]subcx, cxcmpax,cxjnenot zerobx,cxcmpjnenot zerocmpdx,cxjnenot zerofix-signjmpnot_zero:movcx,64cmpdx,0hrotate_result_leftjecmpd h , 0 0 hjnerotate_result_righttestd1,80hrotate_result_leftjeshort done_rotatejmprotate_result_right:shrdx,1rcrbx,1rcrax,1testdx,0ff00hdone_rotateje;perform fixed point division;check for zeros on return;exit with a zero;should never go to zero;realign float157NUMERICAL METHODSinclooprotate_result_left:shlrclrclrcltestjnedecloopdone rotate:andshlorshrmovorjeorfix_sign:movmovmovmovsubmovfldivex:popfretreturn infinite:submovnotmovandjmpdivide_by_zero:subnot158byte ptr exponent;decrement exponent with;each shiftrotate result rightword ptr qtnt,lax,1bx,ldx,ldx,80hdone_rotatebyte ptr exponent;decrement exponent with;each shiftrotate_result_leftdx, 7fhdx, 1dh, byte ptr exponentdx, 1cl,byte ptr sign;insert exponent;set sign flag according;to variablecl,clfix_signdx,8000hdi,word ptr rptrword ptr [di],axword ptr [di][2],bxword ptr [di][4],dxax,axword ptr [di][6l,axax, axbx, axaxdx, axdx, 0f80hshort fix_signax,axax;infinityFLOATING-POINT ARITHMETICjmpmake_zero:xorfinish_error:movaddmovstosrepjmpfldiv endpshort finish errorax,ax;positive zerodi,word ptr rptrdi,6cx,4word ptr [di]short fldivexIn order to produce the accuracy required for this floating-point routine with thegreatest speed, use div64 from Chapter 2.