Volume 1 Basic Architecture (794100), page 50
Assemblers support these register addressing modes, using the expression ST(0), or simply ST, to represent the current stack top and ST(i) to specify the ith register from TOP in the stack (0 ≤ i ≤ 7). For example, if TOP contains 011B (register 3 is the top of the stack), the following instruction would add the contents of two registers in the stack (registers 3 and 5):

FADD ST, ST(2);

Figure 8-3 shows an example of how the stack structure of the x87 FPU registers and instructions are typically used to perform a series of computations. Here, a two-dimensional dot product is computed, as follows:

1. The first instruction (FLD value1) decrements the stack register pointer (TOP) and loads the value 5.6 from memory into ST(0). The result of this operation is shown in snap-shot (a).
2. The second instruction multiplies the value in ST(0) by the value 2.4 from memory and stores the result in ST(0), shown in snap-shot (b).
3. The third instruction decrements TOP and loads the value 3.8 in ST(0).
4. The fourth instruction multiplies the value in ST(0) by the value 10.3 from memory and stores the result in ST(0), shown in snap-shot (c).
5. The fifth instruction adds the value in ST(0) and the value in ST(1) and stores the result in ST(0), shown in snap-shot (d).

8-4 Vol. 1 PROGRAMMING WITH THE X87 FPU

Computation: Dot Product = (5.6 x 2.4) + (3.8 x 10.3)

Code:
FLD value1      ;(a) value1 = 5.6
FMUL value2     ;(b) value2 = 2.4
FLD value3      ;    value3 = 3.8
FMUL value4     ;(c) value4 = 10.3
FADD ST(1)      ;(d)

Register contents in each snap-shot (all other registers are blank in the figure):
(a) R4 = 5.6 ST(0)
(b) R4 = 13.44 ST(0)
(c) R4 = 13.44 ST(1), R3 = 39.14 ST(0)
(d) R4 = 13.44 ST(1), R3 = 52.58 ST(0)

Figure 8-3. Example x87 FPU Dot Product Computation

The style of programming demonstrated in this example is supported by the floating-point instruction set.
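The register-stack behavior described for Figure 8-3 can be illustrated with a small software model. The following is only a sketch, not a description of the hardware: the class name X87Stack and its method names are hypothetical, Python double-precision floats stand in for the x87 extended-precision format, and stack overflow and underflow checks are omitted. The starting value TOP = 5 is an assumption chosen so that the first load lands in R4, as in snap-shot (a).

```python
class X87Stack:
    """Toy model (hypothetical) of the eight x87 data registers and TOP."""

    def __init__(self, top=0):
        self.regs = [None] * 8   # physical registers R0..R7; None means empty
        self.top = top & 7       # TOP is a 3-bit field (0..7)

    def phys(self, i=0):
        # ST(i) names the physical register (TOP + i) modulo 8.
        return (self.top + i) % 8

    def st(self, i=0):
        return self.regs[self.phys(i)]

    def fld(self, value):
        # A load decrements TOP (wrapping from 0 to 7), then fills the new ST(0).
        self.top = (self.top - 1) % 8
        self.regs[self.top] = value

    def fmul_mem(self, value):
        # FMUL with a memory operand: ST(0) = ST(0) * value.
        self.regs[self.top] = self.st(0) * value

    def fadd_st(self, i):
        # FADD ST(i): ST(0) = ST(0) + ST(i).
        self.regs[self.top] = self.st(0) + self.st(i)


def dot_product_demo():
    """Replay the five instructions of Figure 8-3 on the toy model."""
    fpu = X87Stack(top=5)   # assumed initial TOP so that FLD lands in R4
    fpu.fld(5.6)            # FLD value1   (a)
    fpu.fmul_mem(2.4)       # FMUL value2  (b)
    fpu.fld(3.8)            # FLD value3
    fpu.fmul_mem(10.3)      # FMUL value4  (c)
    fpu.fadd_st(1)          # FADD ST(1)   (d)
    return fpu


if __name__ == "__main__":
    fpu = dot_product_demo()
    print(f"TOP={fpu.top} ST(0)={fpu.st(0):.2f} ST(1)={fpu.st(1):.2f}")
```

In this model the final state should correspond to snap-shot (d): TOP points at R3, which holds the sum, and R4 is addressed as ST(1). The phys() helper also reproduces the earlier FADD ST, ST(2) example, where TOP = 3 makes ST(2) refer to register 5.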
In cases where the stack structure causes computation bottlenecks, the FXCH (exchange x87 FPU register contents) instruction can be used to streamline a computation.

8.1.2.1 Parameter Passing With the x87 FPU Register Stack

Like the general-purpose registers, the contents of the x87 FPU data registers are unaffected by procedure calls, or in other words, the values are maintained across procedure boundaries. A calling procedure can thus use the x87 FPU data registers (as well as the procedure stack) for passing parameters between procedures. The called procedure can reference parameters passed through the register stack using the current stack register pointer (TOP) and the ST(0) and ST(i) nomenclature. It is also common practice for a called procedure to leave a return value or result in register ST(0) when returning execution to the calling procedure or program.

When mixing MMX and x87 FPU instructions in the procedures or code sequences, the programmer is responsible for maintaining the integrity of parameters being passed in the x87 FPU data registers.
If an MMX instruction is executed before the parameters in the x87 FPU data registers have been passed to another procedure, the parameters may be lost (see Section 9.5, “Compatibility with x87 FPU Architecture”).

8.1.3 x87 FPU Status Register

The 16-bit x87 FPU status register (see Figure 8-4) indicates the current state of the x87 FPU. The flags in the x87 FPU status register include the FPU busy flag, top-of-stack (TOP) pointer, condition code flags, error summary status flag, stack fault flag, and exception flags.
The x87 FPU sets the flags in this register to show the results of operations.

Bit(s)   Field   Meaning
15       B       FPU Busy
14       C3      Condition Code
13-11    TOP     Top of Stack Pointer
10       C2      Condition Code
9        C1      Condition Code
8        C0      Condition Code
7        ES      Error Summary Status
6        SF      Stack Fault
5        PE      Exception Flag: Precision
4        UE      Exception Flag: Underflow
3        OE      Exception Flag: Overflow
2        ZE      Exception Flag: Zero Divide
1        DE      Exception Flag: Denormalized Operand
0        IE      Exception Flag: Invalid Operation

Figure 8-4. x87 FPU Status Word

The contents of the x87 FPU status register (referred to as the x87 FPU status word) can be stored in memory using the FSTSW/FNSTSW, FSTENV/FNSTENV, FSAVE/FNSAVE, and FXSAVE instructions.
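Once a status word has been stored, software can separate it into its fields with shifts and masks. The sketch below uses the bit positions of Figure 8-4; the function name decode_status_word and the dictionary layout are illustrative assumptions, not part of any instruction set or library.

```python
def decode_status_word(sw):
    """Split a 16-bit x87 FPU status word into its named fields."""
    sw &= 0xFFFF  # only the low 16 bits are meaningful

    def bit(n):
        return (sw >> n) & 1

    return {
        "B": bit(15),                 # FPU busy (mirrors ES)
        "C3": bit(14),                # condition code C3
        "TOP": (sw >> 11) & 0b111,    # top-of-stack pointer, bits 11-13
        "C2": bit(10),
        "C1": bit(9),
        "C0": bit(8),
        "ES": bit(7),                 # error summary status
        "SF": bit(6),                 # stack fault
        "PE": bit(5),                 # precision
        "UE": bit(4),                 # underflow
        "OE": bit(3),                 # overflow
        "ZE": bit(2),                 # zero divide
        "DE": bit(1),                 # denormalized operand
        "IE": bit(0),                 # invalid operation
    }


if __name__ == "__main__":
    # Example value chosen for illustration: TOP = 3, C3 = 1, C0 = 1.
    fields = decode_status_word(0x5900)
    print(fields["TOP"], fields["C3"], fields["C0"])
```

A dictionary keeps the sketch short; code that inspects the status word frequently might prefer named constants for the masks instead.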
The status word can also be stored in the AX register of the integer unit, using the FSTSW/FNSTSW instructions.

8.1.3.1 Top of Stack (TOP) Pointer

A pointer to the x87 FPU data register that is currently at the top of the x87 FPU register stack is contained in bits 11 through 13 of the x87 FPU status word. This pointer, which is commonly referred to as TOP (for top-of-stack), is a binary value from 0 to 7. See Section 8.1.2, “x87 FPU Data Registers,” for more information about the TOP pointer.

8.1.3.2 Condition Code Flags

The four condition code flags (C0 through C3) indicate the results of floating-point comparison and arithmetic operations. Table 8-1 summarizes the manner in which the floating-point instructions set the condition code flags. These condition code bits are used principally for conditional branching and for storage of information used in exception handling (see Section 8.1.4, “Branching and Conditional Moves on Condition Codes”).

As shown in Table 8-1, the C1 condition code flag is used for a variety of functions. When both the IE and SF flags in the x87 FPU status word are set, indicating a stack overflow or underflow exception (#IS), the C1 flag distinguishes between overflow (C1 = 1) and underflow (C1 = 0). When the PE flag in the status word is set, indicating an inexact (rounded) result, the C1 flag is set to 1 if the last rounding by the instruction was upward. The FXAM instruction sets C1 to the sign of the value being examined.

The C2 condition code flag is used by the FPREM and FPREM1 instructions to indicate an incomplete reduction (or partial remainder).
When a successful reduction has been completed, the C0, C3, and C1 condition code flags are set to the three least significant bits of the quotient (Q2, Q1, and Q0, respectively). See “FPREM1—Partial Remainder” in Chapter 3, “Instruction Set Reference, A-M,” of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2A, for more information on how these instructions use the condition code flags.

The FPTAN, FSIN, FCOS, and FSINCOS instructions set the C2 flag to 1 to indicate that the source operand is beyond the allowable range of ±2^63 and clear the C2 flag if the source operand is within the allowable range.

Where the state of the condition code flags is listed as undefined in Table 8-1, do not rely on any specific value in these flags.

8.1.3.3 x87 FPU Floating-Point Exception Flags

The six x87 FPU floating-point exception flags (bits 0 through 5) of the x87 FPU status word indicate that one or more floating-point exceptions have been detected since the bits were last cleared.
The individual exception flags (IE, DE, ZE, OE, UE, and PE) are described in detail in Section 8.4, “x87 FPU Floating-Point Exception Handling.” Each of the exception flags can be masked by an exception mask bit in the x87 FPU control word (see Section 8.1.5, “x87 FPU Control Word”). The exception summary status flag (ES, bit 7) is set when any of the unmasked exception flags are set. When the ES flag is set, the x87 FPU exception handler is invoked, using one of the techniques described in Section 8.7, “Handling x87 FPU Exceptions in Software.” (Note that if an exception flag is masked, the x87 FPU will still set the appropriate flag if the associated exception occurs, but it will not set the ES flag.)

The exception flags are “sticky” bits (once set, they remain set until explicitly cleared).
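The interaction between the sticky exception flags, the mask bits, and the ES flag can be sketched in a few lines. This is a simplified model under stated assumptions: the function name signal_exception is hypothetical, the mask bits are assumed to occupy the same positions (bits 0 through 5) in the control word as the flags do in the status word, and only the setting of flags is modeled, not the invocation of a handler.

```python
# Bit positions of the exception flags in the status word (bits 0 through 5).
EXC_BITS = {"IE": 0, "DE": 1, "ZE": 2, "OE": 3, "UE": 4, "PE": 5}
ES_BIT = 7            # exception summary status flag
EXC_FIELD = 0x3F      # all six exception flag bits


def signal_exception(status, control, name):
    """Return the status word after exception `name` has been detected.

    `control` is assumed to carry one mask bit per exception in bits 0-5
    (1 = masked), mirroring the flag layout of the status word.
    """
    status |= 1 << EXC_BITS[name]            # the flag is set even if masked
    unmasked = status & ~control & EXC_FIELD
    if unmasked:                             # any unmasked flag sets ES
        status |= 1 << ES_BIT
    return status


if __name__ == "__main__":
    all_masked = 0x3F
    sw = signal_exception(0, all_masked, "ZE")
    sw = signal_exception(sw, all_masked, "PE")
    print(hex(sw))   # both flags remain set; ES stays clear
```

With every exception masked, the flags accumulate across calls while ES stays clear; clearing the ZE mask bit in the control argument makes the same zero-divide event set ES as well.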
The exception flags can be cleared by executing the FCLEX/FNCLEX (clear exceptions) instructions, by reinitializing the x87 FPU with the FINIT/FNINIT or FSAVE/FNSAVE instructions, or by overwriting the flags with an FRSTOR or FLDENV instruction.

The B-bit (bit 15) is included for 8087 compatibility only. It reflects the contents of the ES flag.

Table 8-1. Condition Code Interpretation

Instruction: FCOM, FCOMP, FCOMPP, FICOM, FICOMP, FTST, FUCOM, FUCOMP, FUCOMPP
  C0, C3: Result of Comparison
  C2: Operands are not Comparable
  C1: 0 or #IS

Instruction: FCOMI, FCOMIP, FUCOMI, FUCOMIP
  C0, C3, C2: Undefined. (These instructions set the status flags in the EFLAGS register.)
  C1: #IS

Instruction: FXAM
  C0, C3, C2: Operand class
  C1: Sign

Instruction: FPREM, FPREM1
  C0: Q2
  C3: Q1
  C2: 0 = reduction complete; 1 = reduction incomplete
  C1: Q0 or #IS

Instruction: F2XM1, FADD, FADDP, FBSTP, FCMOVcc, FIADD, FDIV, FDIVP, FDIVR, FDIVRP, FIDIV, FIDIVR, FIMUL, FIST, FISTP, FISUB, FISUBR, FMUL, FMULP, FPATAN, FRNDINT, FSCALE, FST, FSTP, FSUB, FSUBP, FSUBR, FSUBRP, FSQRT, FYL2X, FYL2XP1
  C0, C3, C2: Undefined
  C1: Roundup or #IS

Instruction: FCOS, FSIN, FSINCOS, FPTAN
  C0, C3: Undefined
  C2: 0 = source operand within range; 1 = source operand out of range
  C1: Roundup or #IS (Undefined if C2 = 1)

Instruction: FABS, FBLD, FCHS, FDECSTP, FILD, FINCSTP, FLD, Load Constants, FSTP (ext. prec.), FXCH, FXTRACT
  C0, C3, C2: Undefined
  C1: 0 or #IS

Instruction: FLDENV, FRSTOR
  C0, C3, C2, C1: Each bit loaded from memory

Instruction: FFREE, FLDCW, FCLEX/FNCLEX, FNOP, FSTCW/FNSTCW, FSTENV/FNSTENV, FSTSW/FNSTSW
  C0, C3, C2, C1: Undefined

Instruction: FINIT/FNINIT, FSAVE/FNSAVE
  C0: 0
  C3: 0
  C2: 0
  C1: 0

8.1.3.4 Stack Fault Flag

The stack fault flag (bit 6 of the x87 FPU status word) indicates that stack overflow or stack underflow has occurred with data in the x87 FPU data register stack.
The x87 FPU explicitly sets the SF flag when it detects a stack overflow or underflow condition, but it does not explicitly clear the flag when it detects an invalid-arithmetic operand condition.

When this flag is set, the condition code flag C1 indicates the nature of the fault: overflow (C1 = 1) or underflow (C1 = 0). The SF flag is a “sticky” flag, meaning that after it is set, the processor does not clear it until it is explicitly instructed to do so (for example, by an FINIT/FNINIT, FCLEX/FNCLEX, or FSAVE/FNSAVE instruction).

See Section 8.1.7, “x87 FPU Tag Word,” for more information on x87 FPU stack faults.

8.1.4 Branching and Conditional Moves on Condition Codes

The x87 FPU (beginning with the P6 family processors) supports two mechanisms for branching and performing conditional moves according to comparisons of two floating-point values.
These mechanisms are referred to here as the “old mechanism” and the “new mechanism.”

The old mechanism is available in x87 FPUs prior to the P6 family processors and in P6 family processors. This mechanism uses the floating-point compare instructions (FCOM, FCOMP, FCOMPP, FTST, FUCOMPP, FICOM, and FICOMP) to compare two floating-point values and set the condition code flags (C0 through C3) according to the results. The contents of the condition code flags are then copied into the status flags of the EFLAGS register using a two-step process (see Figure 8-5):
1.