Volume 1 Basic Architecture (794100), page 54
The FABS instruction produces the absolute value of the source operand. The FCHS instruction changes the sign of the source operand. The FXTRACT instruction separates the source operand into its exponent and fraction and stores each value in a register in floating-point format.

8.3.6 Comparison and Classification Instructions

The following instructions compare or classify floating-point values:

FCOM/FCOMP/FCOMPP      Compare floating point and set x87 FPU condition code flags.
FUCOM/FUCOMP/FUCOMPP   Unordered compare floating point and set x87 FPU condition code flags.
FICOM/FICOMP           Compare integer and set x87 FPU condition code flags.
FCOMI/FCOMIP           Compare floating point and set EFLAGS status flags.
FUCOMI/FUCOMIP         Unordered compare floating point and set EFLAGS status flags.
FTST                   Test (compare floating point with 0.0).
FXAM                   Examine.

Comparison of floating-point values differs from comparison of integers because floating-point values have four (rather than three) mutually exclusive relationships: less than, equal, greater than, and unordered.

The unordered relationship is true when at least one of the two values being compared is a NaN or in an unsupported format.
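The four-way outcome can be observed in any language that follows IEEE 754 comparison semantics. The following is a minimal Python sketch, not x87 code: the helper name `relation` is illustrative, and it covers only the NaN case because Python floats cannot hold an unsupported x87 format.

```python
import math

def relation(a: float, b: float) -> str:
    """Classify two floats into one of the four mutually exclusive relationships."""
    if math.isnan(a) or math.isnan(b):
        return "unordered"          # at least one operand is a NaN
    if a < b:
        return "less than"
    if a > b:
        return "greater than"
    return "equal"

nan = float("nan")
print(relation(1.0, 2.0))   # less than
print(relation(nan, 2.0))   # unordered
# With a NaN operand, all three ordered tests are false:
print(nan < 2.0, nan == 2.0, nan > 2.0)   # False False False
```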
The unordered relationship is required because, by definition, NaNs are not numbers, so they cannot have less than, equal, or greater than relationships with other floating-point values.

8-26 Vol. 1
PROGRAMMING WITH THE X87 FPU

The FCOM, FCOMP, and FCOMPP instructions compare the value in register ST(0) with a floating-point source operand and set the condition code flags (C0, C2, and C3) in the x87 FPU status word according to the results (see Table 8-6).

If an unordered condition is detected (one or both of the values are NaNs or in an undefined format), a floating-point invalid-operation exception is generated.

The pop versions of the instruction pop the x87 FPU register stack once (FCOMP) or twice (FCOMPP) after the comparison operation is complete.

The FUCOM, FUCOMP, and FUCOMPP instructions operate the same as the FCOM, FCOMP, and FCOMPP instructions.
The only difference is that with the FUCOM, FUCOMP, and FUCOMPP instructions, if an unordered condition is detected because one or both of the operands are QNaNs, the floating-point invalid-operation exception is not generated.

Table 8-6. Setting of x87 FPU Condition Code Flags for Floating-Point Number Comparisons

Condition                 C3   C2   C0
ST(0) > Source Operand    0    0    0
ST(0) < Source Operand    0    0    1
ST(0) = Source Operand    1    0    0
Unordered                 1    1    1

The FICOM and FICOMP instructions also operate the same as the FCOM and FCOMP instructions, except that the source operand is an integer value in memory.
The integer value is automatically converted into a double extended-precision floating-point value prior to making the comparison. The FICOMP instruction pops the x87 FPU register stack following the comparison operation.

The FTST instruction performs the same operation as the FCOM instruction, except that the value in register ST(0) is always compared with the value 0.0.

The FCOMI and FCOMIP instructions were introduced into the IA-32 architecture in the P6 family processors. They perform the same comparison as the FCOM and FCOMP instructions, except that they set the status flags (ZF, PF, and CF) in the EFLAGS register to indicate the results of the comparison (see Table 8-7) instead of the x87 FPU condition code flags.
The FCOMI and FCOMIP instructions allow conditional branch instructions (Jcc) to be executed directly from the results of their comparison.

Table 8-7. Setting of EFLAGS Status Flags for Floating-Point Number Comparisons

Comparison Results   ZF   PF   CF
ST0 > ST(i)          0    0    0
ST0 < ST(i)          0    0    1
ST0 = ST(i)          1    0    0
Unordered            1    1    1

Software can check if the FCOMI and FCOMIP instructions are supported by checking the processor’s feature information with the CPUID instruction.

The FUCOMI and FUCOMIP instructions operate the same as the FCOMI and FCOMIP instructions, except that they do not generate a floating-point invalid-operation exception if the unordered condition is the result of one or both of the operands being a QNaN.
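The flag settings in Table 8-7 and their effect on a few conditional branches can be modeled in a short Python sketch. The function names `fcomi_flags` and `branch_taken` are hypothetical. The model reproduces only the flag results for ordinary values and NaNs, using Python doubles rather than double extended precision, and it does not model exceptions or stack pops. The Jcc conditions use the standard EFLAGS definitions (JA: CF=0 and ZF=0, JB: CF=1, JE: ZF=1, JP: PF=1).

```python
import math

def fcomi_flags(st0: float, sti: float) -> dict:
    """Return the ZF/PF/CF values that Table 8-7 lists for comparing st0 with sti."""
    if math.isnan(st0) or math.isnan(sti):
        return {"ZF": 1, "PF": 1, "CF": 1}   # unordered
    if st0 > sti:
        return {"ZF": 0, "PF": 0, "CF": 0}
    if st0 < sti:
        return {"ZF": 0, "PF": 0, "CF": 1}
    return {"ZF": 1, "PF": 0, "CF": 0}       # equal

def branch_taken(jcc: str, f: dict) -> bool:
    """Evaluate a few Jcc conditions on the modeled flags."""
    conditions = {
        "JA": f["CF"] == 0 and f["ZF"] == 0,   # above: ST0 > ST(i)
        "JB": f["CF"] == 1,                    # below: ST0 < ST(i), or unordered
        "JE": f["ZF"] == 1,                    # equal, or unordered
        "JP": f["PF"] == 1,                    # parity: unordered
    }
    return conditions[jcc]

print(branch_taken("JA", fcomi_flags(2.0, 1.0)))            # True
print(branch_taken("JP", fcomi_flags(float("nan"), 1.0)))   # True
```

Because the unordered result sets all three flags, JB and JE are also taken when an operand is a NaN. Code that may see NaNs would therefore typically test JP before the ordered conditions.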
The FCOMIP and FUCOMIP instructions pop the x87 FPU register stack following the comparison operation.

The FXAM instruction determines the classification of the floating-point value in the ST(0) register (that is, whether the value is zero, a denormal number, a normal finite number, ∞, a NaN, or an unsupported format) or that the register is empty.
It sets the x87 FPU condition code flags to indicate the classification (see “FXAM—Examine” in Chapter 3, “Instruction Set Reference, A-M,” of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2A). It also sets the C1 flag to indicate the sign of the value.

8.3.6.1 Branching on the x87 FPU Condition Codes

The processor does not offer any control-flow instructions that branch on the setting of the condition code flags (C0, C2, and C3) in the x87 FPU status word. To branch on the state of these flags, the x87 FPU status word must first be moved to the AX register in the integer unit.
The FSTSW AX (store status word) instruction performs this move. When these flags are in the AX register, the TEST instruction can be used to control conditional branching as follows:

1. Check for an unordered result. Use the TEST instruction to compare the contents of the AX register with the constant 0400H (see Table 8-8). This operation will clear the ZF flag in the EFLAGS register if the condition code flags indicate an unordered result; otherwise, the ZF flag will be set. The JNZ instruction can then be used to transfer control (if necessary) to a procedure for handling unordered operands.

2. Check ordered comparison result. Use the constants given in Table 8-8 in the TEST instruction to test for a less than, equal to, or greater than result, then use the corresponding conditional branch instruction to transfer program control to the appropriate procedure or section of code.

Table 8-8. TEST Instruction Constants for Conditional Branching

Order                     Constant   Branch
ST(0) > Source Operand    4500H      JZ
ST(0) < Source Operand    0100H      JNZ
ST(0) = Source Operand    4000H      JNZ
Unordered                 0400H      JNZ

If a program or procedure has been thoroughly tested and it incorporates periodic checks for QNaN results, then it is not necessary to check for the unordered result every time a comparison is made.

See Section 8.1.4, “Branching and Conditional Moves on Condition Codes,” for another technique for branching on x87 FPU condition codes.

Some non-comparison x87 FPU instructions update the condition code flags in the x87 FPU status word. To ensure that the status word is not altered inadvertently, store it immediately following a comparison operation.

8.3.7 Trigonometric Instructions

The following instructions perform four common trigonometric functions:

FSIN      Sine
FCOS      Cosine
FSINCOS   Sine and cosine
FPTAN     Tangent
FPATAN    Arctangent

These instructions operate on the top one or two registers of the x87 FPU register stack and they return their results to the stack.
The source operands for the FSIN, FCOS, FSINCOS, and FPTAN instructions must be given in radians; the source operand for the FPATAN instruction is given in rectangular coordinate units.

The FSINCOS instruction returns both the sine and the cosine of a source operand value. It operates faster than executing the FSIN and FCOS instructions in succession.

The FPATAN instruction computes the arctangent of ST(1) divided by ST(0), returning a result in radians.
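For intuition, FPATAN's operand convention resembles the two-argument arctangent found in most math libraries, with ST(1) in the role of y and ST(0) in the role of x. The Python sketch below uses `math.atan2` as a stand-in. This is an assumption made for illustration and does not reproduce x87 rounding or double extended precision. The helper name `to_polar` is hypothetical.

```python
import math

def to_polar(x: float, y: float) -> tuple:
    """Convert rectangular (x, y) to polar (r, theta), with theta in radians."""
    r = math.hypot(x, y)
    theta = math.atan2(y, x)   # stand-in for FPATAN with ST(1) = y, ST(0) = x
    return r, theta

r, theta = to_polar(1.0, 1.0)
print(r, theta)   # approximately 1.41421 and 0.785398 (pi/4)
```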
FPATAN is useful for converting rectangular coordinates to polar coordinates.

8.3.8 Pi

When the argument (source operand) of a trigonometric function is within the range of the function, the argument is automatically reduced by the appropriate multiple of 2π through the same reduction mechanism used by the FPREM and FPREM1 instructions. The internal value of π that the x87 FPU uses for argument reduction and other computations is as follows:

π = 0.f ∗ 2^2

where:

f = C90FDAA2 2168C234 C

(The spaces in the fraction above indicate 32-bit boundaries.)

This internal π value has a 66-bit mantissa, which is 2 bits more than is allowed in the significand of a double extended-precision floating-point value.
(Since 66 bits does not fill a whole number of hexadecimal digits, two additional zeros have been added to the value so that it can be represented in hexadecimal format. The least-significant hexadecimal digit (C) is thus 1100B, where the two least-significant bits represent bits 67 and 68 of the mantissa.)

This value of π has been chosen to guarantee no loss of significance in a source operand, provided the operand is within the specified range for the instruction.

If the results of computations that explicitly use π are to be used in the FSIN, FCOS, FSINCOS, or FPTAN instructions, the full 66-bit fraction of π should be used. This ensures that the results are consistent with the argument-reduction algorithms that these instructions use. Using a rounded version of π can cause inaccuracies in result values, which, if propagated through several calculations, might result in meaningless results.

A common method of representing the full 66-bit fraction of π is to separate the value into two numbers (highπ and lowπ) that when added together give the value for π shown earlier in this section with the full 66-bit fraction:

π = highπ + lowπ

For example, the following two values (given in scientific notation with the fraction in hexadecimal and the exponent in decimal) represent the 33 most-significant and the 33 least-significant bits of the fraction:

highπ (unnormalized) = 0.C90FDAA20 ∗ 2^+2
lowπ (unnormalized) = 0.42D184698 ∗ 2^−31

These values encoded in the IEEE double-precision floating-point format are as follows:

highπ = 400921FB 54400000
lowπ = 3DE0B461 1A600000

(Note that in the IEEE double-precision floating-point format, the exponents are biased (by 1023) and the fractions are normalized.)

Similar versions of π can also be written in double extended-precision floating-point format.
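The two encodings above can be checked numerically. The Python sketch below decodes both bit patterns as IEEE doubles and uses exact rational arithmetic to confirm that they sum to the 66-bit value 0.f ∗ 2^2. The helper name `from_bits` is illustrative.

```python
import math
import struct
from fractions import Fraction

def from_bits(hex_bits: str) -> float:
    """Interpret 16 hexadecimal digits as a big-endian IEEE double."""
    return struct.unpack(">d", bytes.fromhex(hex_bits))[0]

# 66-bit internal pi: 0.f * 2^2 with f = C90FDAA2 2168C234 C
# (17 hex digits = 68 bits, of which the low 2 bits are padding zeros).
PI_66 = Fraction(0xC90FDAA22168C234C, 2**68) * 4

high_pi = from_bits("400921FB54400000")
low_pi = from_bits("3DE0B4611A600000")

# In exact arithmetic, the two doubles sum to the 66-bit value with no error.
print(Fraction(high_pi) + Fraction(low_pi) == PI_66)   # True
# Added as doubles, the sum rounds to the nearest 53-bit value, which is math.pi.
print(high_pi + low_pi == math.pi)                     # True
```

The second check shows why the parts are kept separate: adding them in double precision discards the low-order bits that the split was meant to preserve.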
When using this two-part π value in an algorithm, parallel computations should be performed on each part, with the results kept separate. When all the computations are complete, the two results can be added together to form the final result.

The complications of maintaining a consistent value of π for argument reduction can be avoided, either by applying the trigonometric functions only to arguments within the range of the automatic reduction mechanism, or by performing all argument reductions (down to a magnitude less than π/4) explicitly in software.

8.3.9 Logarithmic, Exponential, and Scale

The following instructions provide two different logarithmic functions, an exponential function and a scale function:

FYL2X     Logarithm
FYL2XP1   Logarithm epsilon
F2XM1     Exponential
FSCALE    Scale

The FYL2X and FYL2XP1 instructions perform two different base 2 logarithmic operations. The FYL2X instruction computes (y ∗ log_2 x).
This operation permits the calculation of the log of any base using the following equation:

log_b x = (1 / log_2 b) ∗ log_2 x

The FYL2XP1 instruction computes (y ∗ log_2(x + 1)). This operation provides optimum accuracy for values of x that are close to 0.

The F2XM1 instruction computes (2^x − 1). This instruction only operates on source values in the range −1.0 to +1.0.

The FSCALE instruction multiplies the source operand by a power of 2.

8.3.10 Transcendental Instruction Accuracy

New transcendental instruction algorithms were incorporated into the IA-32 architecture beginning with the Pentium processors.