The filter function will record the result (plus any new flag settings). The user-level filter function will then call the emulation function for the next set of sub-operands (if any). When done with all the operand sets, the partial results will be packed (if the excepting instruction has a packed floating-point result, which is true for most SSE/SSE2/SSE3 numeric instructions) and the filter will return to the low-level exception handler, which in turn will return from the interruption, allowing execution to continue. Note that the instruction pointer (EIP) has to be altered to point to the instruction following the excepting instruction, in order to continue execution correctly.

If a user-mode floating-point exception filter is not provided, then all the work for decoding the excepting instruction, reading its operands, emulating the instruction for the components of the result that do not correspond to unmasked floating-point exceptions, and providing the compounded result will have to be performed by the user-provided floating-point exception handler.

Actual emulation might have to take place for one operand or pair of operands for scalar operations, and for all sub-operands or pairs of sub-operands for packed operations.
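As a rough illustration of the per-sub-operand flow described above, the following C sketch emulates a four-lane packed single-precision operation one lane at a time, accumulates the status flags, and packs the partial results. The names `lane_result`, `emulate_lane_fn`, `emulate_packed4`, and `add_lane` are hypothetical and are not taken from the manual; the lane routine shown is a placeholder that only covers the case where no exception occurs.

```c
#include <stdint.h>

/* Hypothetical per-lane outcome: a value plus MXCSR-style status flags. */
typedef struct {
    float    value;
    uint32_t flags;   /* bits 0..5: IE, DE, ZE, OE, UE, PE */
} lane_result;

/* Hypothetical signature of a scalar emulation routine. */
typedef lane_result (*emulate_lane_fn)(float a, float b);

/* Emulate a 4-lane packed operation lane by lane.
   dst receives the packed result; the return value is the OR of
   all flags reported by the individual lanes. */
uint32_t emulate_packed4(emulate_lane_fn op,
                         const float a[4], const float b[4],
                         float dst[4])
{
    uint32_t flags = 0;
    for (int i = 0; i < 4; ++i) {
        lane_result r = op(a[i], b[i]);  /* emulate one pair of sub-operands */
        dst[i] = r.value;                /* save the partial result */
        flags |= r.flags;                /* record any new flag settings */
    }
    return flags;
}

/* Placeholder lane routine (assumption): plain addition that never
   reports a flag. A real routine would detect exception conditions. */
lane_result add_lane(float a, float b)
{
    lane_result r = { a + b, 0u };
    return r;
}
```

A scalar instruction would call the lane routine once, for element 0 only, and leave the remaining destination elements as the instruction defines them.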
The steps to perform are the following:
• The excepting instruction has to be decoded and the operands have to be read from the saved context.
• The instruction has to be emulated for each (pair of) sub-operand(s); if no floating-point exception occurs, the partial result has to be saved; if a masked floating-point exception occurs, the masked result has to be produced through emulation and saved, and the appropriate status flags have to be set; if an unmasked floating-point exception occurs, the result has to be generated by the user-provided floating-point exception handler, and the appropriate status flags have to be set.
• The partial results have to be combined and written to the context that will be restored upon application program resumption.

GUIDELINES FOR WRITING SIMD FLOATING-POINT EXCEPTION HANDLERS

A diagram of the control flow in handling an unmasked floating-point exception is presented below.

[Figure E-1. Control Flow for Handling Unmasked Floating-Point Exceptions. Blocks shown: User Application; Low-Level Floating-Point Exception Handler; User Level Floating-Point Exception Filter; User Floating-Point Exception Handler.]

From the user-level floating-point filter, Example E-2 in Section E.4.3, “Example SIMD Floating-Point Emulation Implementation,” will present only the floating-point emulation part. In order to understand the actions involved, the expected response to exceptions has to be known for all SSE/SSE2/SSE3 numeric instructions in two situations: with exceptions enabled (unmasked result), and with exceptions disabled (masked result).
The latter can be found in Section 6.4, “Interrupts and Exceptions.” The response to NaN operands that do not raise an exception is specified in Section 4.8.3.4, “NaNs.” Operations on NaNs are explained in the same source. This response is also discussed in more detail in the next subsection, along with the unmasked and masked responses to floating-point exceptions.

E.4.2 SSE/SSE2/SSE3 Response To Floating-Point Exceptions

This subsection specifies the unmasked response expected from the SSE/SSE2/SSE3 instructions that raise floating-point exceptions. The masked response is given in parallel, as it is necessary in the emulation process of the instructions that raise unmasked floating-point exceptions.
The response to NaN operands is also included in more detail than in Section 4.8.3.4, “NaNs.” For floating-point exception priority, refer to “Priority Among Simultaneous Exceptions and Interrupts” in Chapter 5, “Interrupt and Exception Handling,” of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A.

E.4.2.1 Numeric Exceptions

There are six classes of numeric (floating-point) exception conditions that can occur: Invalid operation (#I), Divide-by-Zero (#Z), Denormal Operand (#D), Numeric Overflow (#O), Numeric Underflow (#U), and Inexact Result (precision) (#P).
#I, #Z, #D are pre-computation exceptions (floating-point faults), detected before the arithmetic operation. #O, #U, #P are post-computation exceptions (floating-point traps).

Users can control how the SSE/SSE2/SSE3 floating-point exceptions are handled by setting the mask/unmask bits in MXCSR. Masked exceptions are handled by the processor, or by software if they are combined with unmasked exceptions occurring in the same instruction.
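To make the mask/unmask mechanism concrete, here is a minimal C sketch that, given a set of raised exception conditions and an MXCSR value, works out which conditions are unmasked and whether a condition is pre- or post-computation. It relies on the MXCSR layout (status flags in bits 0 through 5, the corresponding mask bits in bits 7 through 12); the identifiers `FP_I` ... `FP_P`, `unmasked_exceptions`, and `is_pre_computation` are hypothetical names chosen for this illustration.

```c
#include <stdint.h>

/* Exception conditions, using the MXCSR status-flag bit positions. */
enum {
    FP_I = 1 << 0,                  /* #I invalid operation    */
    FP_D = 1 << 1,                  /* #D denormal operand     */
    FP_Z = 1 << 2,                  /* #Z divide-by-zero       */
    FP_O = 1 << 3,                  /* #O numeric overflow     */
    FP_U = 1 << 4,                  /* #U numeric underflow    */
    FP_P = 1 << 5,                  /* #P inexact (precision)  */
    FP_ALL  = 0x3F,
    FP_PRE  = FP_I | FP_D | FP_Z,   /* faults: detected before the operation */
    FP_POST = FP_O | FP_U | FP_P    /* traps: detected after the operation   */
};

/* Return the subset of 'raised' whose mask bit in 'mxcsr' is clear.
   The mask bits sit 7 positions above the matching status flags. */
uint32_t unmasked_exceptions(uint32_t raised, uint32_t mxcsr)
{
    uint32_t masks = (mxcsr >> 7) & (uint32_t)FP_ALL;  /* 1 = masked */
    return raised & ~masks & (uint32_t)FP_ALL;
}

/* Nonzero if 'exc' contains at least one pre-computation condition. */
int is_pre_computation(uint32_t exc)
{
    return (exc & (uint32_t)FP_PRE) != 0;
}
```

With the power-on MXCSR value 1F80H every condition is masked, so `unmasked_exceptions` returns 0 for any input; clearing bit 9 (the divide-by-zero mask) makes #Z the only condition that can reach a handler.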
Unmasked exceptions are usually handled by the low-level exception handler, in conjunction with user-level software.

E.4.2.2 Results of Operations with NaN Operands or a NaN Result for SSE/SSE2/SSE3 Numeric Instructions

The tables below (E-1 through E-10) specify the response of SSE/SSE2/SSE3 instructions to NaN inputs, or to other inputs that lead to NaN results. These results will be referenced by subsequent tables (e.g., E-10).
Most operations do not raise an invalid exception for quiet NaN operands, but even so, the quiet NaN response takes precedence over raising floating-point exceptions other than invalid operation.

Note that the single precision QNaN Indefinite value is 0xffc00000, the double precision QNaN Indefinite value is 0xfff8000000000000, and the Integer Indefinite value is 0x80000000 (not a floating-point number, but it can be the result of a conversion instruction from floating-point to integer).

For an unmasked exception, no result will be provided by the hardware to the user handler. If a user-registered floating-point exception handler is invoked, it may provide a result for the excepting instruction, which will be used if execution of the application code is continued after returning from the interruption.

In Tables E-1 through E-12, the specified operands cause an invalid exception, unless the unmasked result is marked with “not an exception”.
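The constants above, together with the 00400000H quieting mask that appears in Table E-1, can be exercised with a short C sketch working on raw single-precision bit patterns. This is only an illustration under stated assumptions: the macro and function names are hypothetical, and `masked_nan_result_f32` assumes the caller has already established that the result is a NaN (either an operand is a NaN, or #I applies for some other reason).

```c
#include <stdint.h>

#define QNAN_INDEFINITE_F32 0xFFC00000u
#define QNAN_INDEFINITE_F64 0xFFF8000000000000ull
#define INTEGER_INDEFINITE  0x80000000u
#define QUIET_BIT_F32       0x00400000u   /* most significant fraction bit */

/* Any NaN: exponent all ones and a nonzero fraction. */
int is_nan_f32(uint32_t bits)
{
    return (bits & 0x7F800000u) == 0x7F800000u && (bits & 0x007FFFFFu) != 0;
}

/* Signaling NaN: a NaN whose quiet bit is clear. */
int is_snan_f32(uint32_t bits)
{
    return is_nan_f32(bits) && (bits & QUIET_BIT_F32) == 0;
}

/* Convert an SNaN to the corresponding QNaN (no effect on a QNaN). */
uint32_t quiet_f32(uint32_t bits)
{
    return bits | QUIET_BIT_F32;
}

/* Masked result for a two-operand arithmetic instruction, following the
   rows of Table E-1: the first operand wins if it is a NaN, otherwise the
   second; an SNaN is quieted. With no NaN operand, the QNaN Indefinite
   is returned (the caller is assumed to have detected #I). */
uint32_t masked_nan_result_f32(uint32_t a, uint32_t b)
{
    if (is_nan_f32(a)) return quiet_f32(a);
    if (is_nan_f32(b)) return quiet_f32(b);
    return QNAN_INDEFINITE_F32;
}
```

The double-precision case is analogous, with 0008000000000000H as the quiet bit and 64-bit patterns throughout.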
For table entries marked “not an exception”, the unmasked and masked results are the same.

Table E-1. ADDPS, ADDSS, SUBPS, SUBSS, MULPS, MULSS, DIVPS, DIVSS, ADDPD, ADDSD, SUBPD, SUBSD, MULPD, MULSD, DIVPD, DIVSD, ADDSUBPS, ADDSUBPD, HADDPS, HADDPD, HSUBPS, HSUBPD

Source Operands               Masked Result                       Unmasked Result
----------------------------  ----------------------------------  ------------------------
SNaN1 op SNaN2                SNaN1 | 00400000H or                None
                              SNaN1 | 0008000000000000H [2]
SNaN1 op QNaN2                SNaN1 | 00400000H or                None
                              SNaN1 | 0008000000000000H [2]
QNaN1 op SNaN2                QNaN1                               None
QNaN1 op QNaN2                QNaN1                               QNaN1 (not an exception)
SNaN op real value            SNaN | 00400000H or                 None
                              SNaN | 0008000000000000H [2]
Real value op SNaN            SNaN | 00400000H or                 None
                              SNaN | 0008000000000000H [2]
QNaN op real value            QNaN                                QNaN (not an exception)
Real value op QNaN            QNaN                                QNaN (not an exception)
Neither source operand is     Single precision or double          None
SNaN, but #I is signaled      precision QNaN Indefinite
(e.g. for Inf - Inf,
Inf ∗ 0, Inf / Inf, 0/0)

NOTES:
1. For Tables E-1 to E-12: op denotes the operation to be performed.
2. SNaN | 00400000H is a quiet NaN in single precision format (if SNaN is in single precision) and SNaN | 0008000000000000H is a quiet NaN in double precision format (if SNaN is in double precision), obtained from the signaling NaN given as input.
3. Operations involving only quiet NaNs do not raise floating-point exceptions.

Table E-2. CMPPS.EQ, CMPSS.EQ, CMPPS.ORD, CMPSS.ORD, CMPPD.EQ, CMPSD.EQ, CMPPD.ORD, CMPSD.ORD

Source Operands               Masked Result                       Unmasked Result
----------------------------  ----------------------------------  ------------------------
NaN op Opd2 (any Opd2)        00000000H or                        00000000H or
                              0000000000000000H [1]               0000000000000000H [1]
                                                                  (not an exception)
Opd1 op NaN (any Opd1)        00000000H or                        00000000H or
                              0000000000000000H [1]               0000000000000000H [1]
                                                                  (not an exception)

NOTE:
1. 32-bit results are for single, and 64-bit results for double precision operations.

Table E-3. CMPPS.NEQ, CMPSS.NEQ, CMPPS.UNORD, CMPSS.UNORD, CMPPD.NEQ, CMPSD.NEQ, CMPPD.UNORD, CMPSD.UNORD

Source Operands               Masked Result                       Unmasked Result
----------------------------  ----------------------------------  ------------------------
NaN op Opd2 (any Opd2)        FFFFFFFFH or                        FFFFFFFFH or
                              FFFFFFFFFFFFFFFFH [1]               FFFFFFFFFFFFFFFFH [1]
                                                                  (not an exception)
Opd1 op NaN (any Opd1)        FFFFFFFFH or                        FFFFFFFFH or
                              FFFFFFFFFFFFFFFFH [1]               FFFFFFFFFFFFFFFFH [1]
                                                                  (not an exception)

NOTE:
1. 32-bit results are for single, and 64-bit results for double precision operations.

Table E-4. CMPPS.LT, CMPSS.LT, CMPPS.LE, CMPSS.LE, CMPPD.LT, CMPSD.LT, CMPPD.LE, CMPSD.LE

Source Operands               Masked Result                       Unmasked Result
----------------------------  ----------------------------------  ------------------------
NaN op Opd2 (any Opd2)        00000000H or                        None
                              0000000000000000H [1]
Opd1 op NaN (any Opd1)        00000000H or                        None
                              0000000000000000H [1]

NOTE:
1. 32-bit results are for single, and 64-bit results for double precision operations.

Table E-5. CMPPS.NLT, CMPSS.NLT, CMPPS.NLE, CMPSS.NLE, CMPPD.NLT, CMPSD.NLT, CMPPD.NLE, CMPSD.NLE

Source Operands               Masked Result                       Unmasked Result
----------------------------  ----------------------------------  ------------------------
NaN op Opd2 (any Opd2)        FFFFFFFFH or                        None
                              FFFFFFFFFFFFFFFFH [1]
Opd1 op NaN (any Opd1)        FFFFFFFFH or                        None
                              FFFFFFFFFFFFFFFFH [1]

NOTE:
1. 32-bit results are for single, and 64-bit results for double precision operations.

Table E-6. COMISS, COMISD

Source Operands               Masked Result                       Unmasked Result
----------------------------  ----------------------------------  ------------------------
SNaN op Opd2 (any Opd2)       OF, SF, AF = 000                    None
                              ZF, PF, CF = 111
Opd1 op SNaN (any Opd1)       OF, SF, AF = 000                    None
                              ZF, PF, CF = 111
QNaN op Opd2 (any Opd2)       OF, SF, AF = 000                    None
                              ZF, PF, CF = 111
Opd1 op QNaN (any Opd1)       OF, SF, AF = 000                    None
                              ZF, PF, CF = 111

Table E-7. UCOMISS, UCOMISD

Source Operands               Masked Result                       Unmasked Result
----------------------------  ----------------------------------  ------------------------
SNaN op Opd2 (any Opd2)       OF, SF, AF = 000                    None
                              ZF, PF, CF = 111
Opd1 op SNaN (any Opd1)       OF, SF, AF = 000                    None
                              ZF, PF, CF = 111
QNaN op Opd2                  OF, SF, AF = 000                    OF, SF, AF = 000
(any Opd2 ≠ SNaN)             ZF, PF, CF = 111                    ZF, PF, CF = 111
                                                                  (not an exception)
Opd1 op QNaN                  OF, SF, AF = 000                    OF, SF, AF = 000
(any Opd1 ≠ SNaN)             ZF, PF, CF = 111                    ZF, PF, CF = 111
                                                                  (not an exception)

Table E-8. CVTPS2PI, CVTSS2SI, CVTTPS2PI, CVTTSS2SI, CVTPD2PI, CVTSD2SI, CVTTPD2PI, CVTTSD2SI, CVTPS2DQ, CVTTPS2DQ, CVTPD2DQ, CVTTPD2DQ

Source Operand                Masked Result                       Unmasked Result
----------------------------  ----------------------------------  ------------------------
SNaN                          80000000H or                        None
                              8000000000000000H [1]
                              (Integer Indefinite)
QNaN                          80000000H or                        None
                              8000000000000000H [1]
                              (Integer Indefinite)

NOTE:
1. 32-bit results are for single, and 64-bit results for double precision operations.

Table E-9.