Morgan - Numerical Methods (523161), page 22
These are especially useful in determining what action to take during addition and subtraction.
4. Extract the exponents, subtracting the bias. Perform whatever handling is required by that procedure. Calculate the approximate exponent of the result.
5. Get the mantissa.
6. Align radix points for fixed-point routines.
7. Perform fixed-point arithmetic.
8. Check for initial conditions upon return. If a zero is returned, this is a shortcut exit from the routine.
9. Renormalize, handling any underflow or overflow.
10. Reassert the sign.
11. Write the result and return.

Fitting These Routines to an Application

One of the primary purposes of the floating-point routines in this book is to illustrate the inner workings of floating-point arithmetic; they are not assumed to be the best fit for your system. In addition, all the routines are written as near calls. This is adequate for many systems, but you may require far calls (which would require far pointers for the arguments). The functions write their return values to static variables, an undesirable action in multithreaded systems because these values can be overwritten by another thread. Though the core routines use extended precision, the exponents are not extended; if you choose to extend them, 15 bits are recommended. This way, the exponent and sign bit can fit neatly within one word, allowing as many as 49 bits of precision in a quadword format.
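
To make the suggested layout concrete, here is a minimal C sketch that places a sign bit and a 15-bit exponent in the top word of a 64-bit value, leaving 48 stored mantissa bits (49 bits of precision if a hidden bit is counted). The names qf_fields, qf_pack, qf_unpack and qf_top_word, the exact bit positions, and the use of a hidden bit are assumptions made for this illustration; none of them come from FPMATH.ASM.

```c
#include <stdint.h>

/* Hypothetical quadword layout (an assumption for illustration):
   bit 63 = sign, bits 62..48 = biased exponent, bits 47..0 = mantissa */
#define QF_EXP_BITS   15
#define QF_MANT_BITS  48
#define QF_EXP_MASK   ((UINT64_C(1) << QF_EXP_BITS) - 1)
#define QF_MANT_MASK  ((UINT64_C(1) << QF_MANT_BITS) - 1)

typedef struct {
    unsigned sign;       /* 0 = positive, 1 = negative */
    unsigned exponent;   /* biased exponent, 15 bits */
    uint64_t mantissa;   /* 48 stored bits; a hidden bit is not stored */
} qf_fields;

/* Assemble the three fields into one quadword. */
static uint64_t qf_pack(qf_fields f)
{
    return ((uint64_t)(f.sign & 1u) << 63)
         | (((uint64_t)f.exponent & QF_EXP_MASK) << QF_MANT_BITS)
         | (f.mantissa & QF_MANT_MASK);
}

/* Split a quadword back into its fields. */
static qf_fields qf_unpack(uint64_t q)
{
    qf_fields f;
    f.sign     = (unsigned)(q >> 63);
    f.exponent = (unsigned)((q >> QF_MANT_BITS) & QF_EXP_MASK);
    f.mantissa = q & QF_MANT_MASK;
    return f;
}

/* The top 16-bit word holds exactly the sign and the exponent. */
static uint16_t qf_top_word(uint64_t q)
{
    return (uint16_t)(q >> 48);
}
```

With this arrangement a single word load retrieves both the sign and the exponent, which is the convenience the 15-bit recommendation is aiming at.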
The exceptions are not fully implemented. If your system needs to detect situations in which the mathematical operation results in something that cannot be interpreted as a number, such as signaling or quiet NaNs, you will have to write that code. Many of the in-line utility functions in the core and external routines may also be rewritten as stand-alone subroutines. Doing so can make handling of the numerics a bit more complex but will reduce the size of the package.

These routines work well, but feel free to make any changes you wish to fit your target.
A program on the disk, MATH.C, may help you debug any modifications; I used this technique to prepare the math routines for this book.

Addition and Subtraction: FLADD

Fladd, the core routine for addition and subtraction, is the longest and most complex routine in FPMATH.ASM (and perhaps the most interesting). We’ll use it as an example, dissecting it into the prologue, the addition, and the epilogue.

The routine for addition can be used without penalty for subtraction because the IEEE 754 specification for floating point uses a signed-magnitude representation. The MSB of the short or long real is a 1 for negative and a 0 for positive. The higher-level subtraction routine need only XOR the MSB of the subtrahend before passing the parameters to fladd to make it a subtraction.

Addition differs from multiplication or division in at least two respects.
First, one operand may be so much smaller than the other that it will contribute no significance to the result. It can save steps to detect this condition early in the operation. Second, addition can occur anywhere in four quadrants: both operands can be positive or both negative, the summand can be negative, or the addend can be negative.

The first problem is resolved by comparing the difference in the exponents of the two operands against the number of significant bits available. Since these routines use 40 bits of precision, including extended precision, the difference between the exponents can be no greater than 40 if the smaller operand is to affect the result. Otherwise no overlap will occur and the answer will be the greater of the two operands no matter what. (Imagine adding .00000001 to 100.0 and expressing the result in eight decimal digits.)
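
Both observations can be tried on ordinary single-precision values. The C sketch below assumes that float is an IEEE 754 32-bit single with 24 significant bits (so 24 plays the role of the 40 used by these routines); the helper names flip_sign, biased_exponent, exponent_gap and sub_via_add are hypothetical and are not part of FPMATH.ASM.

```c
#include <float.h>
#include <stdint.h>
#include <string.h>

/* Flip the sign by XORing the MSB, as a higher-level subtract routine
   might do to the subtrahend before calling an add routine.
   Assumes float is an IEEE 754 32-bit single. */
static float flip_sign(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    bits ^= UINT32_C(0x80000000);
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Extract the 8-bit biased exponent field of a single. */
static int biased_exponent(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return (int)((bits >> 23) & 0xFF);
}

/* Absolute difference between the exponents of two normal, nonzero values. */
static int exponent_gap(float a, float b)
{
    int d = biased_exponent(a) - biased_exponent(b);
    return d < 0 ? -d : d;
}

/* Subtraction expressed as addition of the sign-flipped operand. */
static float sub_via_add(float a, float b)
{
    return a + flip_sign(b);
}
```

For 100.0f and 1e-8f the exponent gap exceeds FLT_MANT_DIG, and the sum stored back into a float compares equal to 100.0f, which is the decimal thought experiment above carried out in binary.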
Therefore, if the difference between the exponents is greater than 40, the larger of the two numbers is the result and the routine is exited at that point. If the difference is less than 40, the smaller operand is shifted until the exponents of both operands are equal.

If the larger of the two numbers is known, the problem of signs becomes trivial. Whatever the sign of the larger, the smaller operand can never change it through subtraction or addition, and the sign of the larger number will be the sign of the result. If the signs of both operands are the same, addition takes place normally; if they differ, the smaller of the two is two’s complemented before the addition, making it a subtraction.

The fladd routine is broken into four logical sections, so each part of the operation can be explained more clearly. Each section comprises a pseudocode description followed by the actual assembly code listing.

FLADD: The Prologue.

1. Two quadword variables, opa and opb, are allocated and cleared for use later in the routine.
Byte variables for the sign of each operand and a general sign byte are also cleared.
2. Each operand is checked for zero. If either is zero, the routine exits with the other argument as its answer.
3. The upper word of each float is loaded into a register and shifted left once, moving the sign bit into the sign byte assigned to that operand. The exponent is then moved to the exponent byte of that operand, exp0 or exp1.
Finally, the exponent of the second operand is subtracted from the exponent of the first and the difference placed in a variable, diff.
4. The upper words of the floats are ANDed with 7fH to clear the sign and exponent bits. They’re then ORed with 80H to restore the hidden bit.

We now have a fixed-point number in the form 1.xxx.

FLADD: The Prologue

; *****
fladd   proc    uses bx cx dx si di,
                fp0:qword, fp1:qword, rptr:word
        local   opa:qword, opb:qword, signa:byte,
                signb:byte, exponent:byte, sign:byte,
                diff:byte, sign0:byte, sign1:byte,
                exp0:byte, exp1:byte
        pushf
        std                             ;decrement
        xor     ax,ax                   ;clear appropriate
        lea     di,word ptr opa[6]      ;larger operand
        mov     cx,4
rep     stosw   word ptr [di]
        lea     di,word ptr opb[6]      ;smaller operand
        mov     cx,4
rep     stosw   word ptr [di]
        mov     byte ptr sign0, al
        mov     byte ptr sign1, al
        mov     byte ptr sign, al       ;clear sign variables
chk_fp0:
        mov     cx, 3                   ;check for zero
        lea     di,word ptr fp0[4]      ;di will point to the first nonzero
repe    scasw
        jnz     chk_fp1
        lea     si,word ptr fp1[4]      ;return other addend
        jmp     short leave_with_other
chk_fp1:
        mov     cx, 3
        lea     di,word ptr fp1[4]      ;di will point to the first nonzero
repe    scasw
        jnz     do_add
        lea     si,word ptr fp0[4]      ;return other addend
; *****
leave_with_other:
        mov     di,word ptr rptr
        add     di,4
        mov     cx,3
rep     movsw
        jmp     fp_addex
; *****
do_add:
        lea     si,word ptr fp0
        lea     bx,word ptr fp1
        mov     ax,word ptr [si][4]     ;fp0
        shl     ax,1                    ;dump the sign
        rcl     byte ptr sign0, 1       ;collect the sign
        mov     byte ptr exp0, ah       ;get the exponent
        mov     dx,word ptr [bx][4]     ;fp1
        shl     dx,1                    ;get sign
        rcl     byte ptr sign1, 1
        mov     byte ptr exp1, dh       ;and the exponent
        sub     ah, dh
        mov     byte ptr diff, ah       ;and now the difference
restore_missing_bit:                    ;set up operands
        and     word ptr fp0[4], 7fh
        or      word ptr fp0[4], 80h
        mov     ax, word ptr fp1        ;load these into registers;
                                        ;we'll use them
        mov     bx, word ptr fp1[2]
        mov     dx, word ptr fp1[4]
        and     dx,7fh
        or      dx,80h
        mov     word ptr fp1[4], dx

The FLADD Routine:

5. Compare the difference between the exponents.
    If the exponents are equal, continue with step 6.
    If the difference is negative, take the second operand as the larger and continue with step 7.
    If the difference is positive, assume that the first operand is larger and continue with step 8.
6. Continue comparing the two operands, most significant words first.
    If, on any compare except the last, the second operand proves the larger, continue with step 7.
    If, on any compare except the last, the first operand proves the larger, continue with step 8.
    If neither is larger up to the last compare, continue with step 7 if the second operand is larger and step 8 if the first is equal or larger.
7. Two's-complement the variable diff and compare it with 40D to determine whether to go on.
    If it's out of range, write the value of the second operand to the result and leave.
    If it's in range, move the exponent of the second operand to exponent, move the sign of this operand to the variable holding the sign of the larger operand, and move the sign of the other operand to the variable holding the sign of the smaller operand. Load this fixed-point operand into opa and continue with step 9.
8. Compare diff with 40D to determine whether it's in range.
    If not, write the value of the first operand to the result and leave.
    If so, move the exponent of the first operand to exponent, move the sign of this operand to the variable holding the sign of the larger operand, and move the sign of the other operand to the variable holding the sign of the smaller operand.
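
The unpacking done in the prologue and the ordering decisions of steps 5 through 8 can be sketched in C on IEEE 754 single-precision values. This is an illustration under stated assumptions, not a translation of fladd: it uses 24 significant bits where the routine uses 40, it handles only normal, nonzero operands (the zero check of step 2 is assumed to have run already), and the names unpacked, unpack, add_plan and plan_add are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define SIG_BITS 24  /* significant bits of a single; stands in for the 40 in the text */

typedef struct {
    unsigned sign;      /* 1 = negative */
    int      exponent;  /* biased exponent */
    uint32_t mantissa;  /* fraction with the hidden bit restored (1.xxx) */
} unpacked;

/* Prologue: separate the sign and exponent, then restore the hidden bit. */
static unpacked unpack(float x)
{
    uint32_t bits;
    unpacked u;
    memcpy(&bits, &x, sizeof bits);
    u.sign     = bits >> 31;
    u.exponent = (int)((bits >> 23) & 0xFF);
    u.mantissa = (bits & UINT32_C(0x007FFFFF)) | UINT32_C(0x00800000);
    return u;
}

typedef struct {
    int      first_is_larger; /* 1 if the first operand is equal or larger */
    int      done;            /* 1 if the larger operand is already the answer */
    unsigned sign_large;      /* sign of the larger operand (sign of the result) */
    unsigned sign_small;      /* sign of the smaller operand */
    int      exponent;        /* exponent of the larger operand */
    int      shift;           /* alignment shift for the smaller operand */
} add_plan;

/* Steps 5 to 8: order the operands by magnitude and range-check the
   exponent difference. Ties go to the first operand. */
static add_plan plan_add(float a, float b)
{
    unpacked ua = unpack(a), ub = unpack(b);
    add_plan p;
    int diff = ua.exponent - ub.exponent;

    p.first_is_larger = diff > 0 || (diff == 0 && ua.mantissa >= ub.mantissa);
    if (diff < 0)
        diff = -diff;         /* same effect as two's-complementing diff */

    p.sign_large = p.first_is_larger ? ua.sign : ub.sign;
    p.sign_small = p.first_is_larger ? ub.sign : ua.sign;
    p.exponent   = p.first_is_larger ? ua.exponent : ub.exponent;
    p.shift      = diff;
    p.done       = diff > SIG_BITS;  /* no overlap: the larger operand is the result */
    return p;
}
```

A caller would return the larger operand unchanged when done is set; otherwise it would shift the smaller mantissa right by shift bits and add or subtract according to whether sign_large and sign_small agree.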