Volume 1: Basic Architecture
FLOATING-POINT EXCEPTIONS SUMMARY

Table C-4. Exceptions Generated with SSE2 Instructions (Contd.)

Instruction #I #D #Z #O #U #P  Description
CVTTPS2DQ   Y  -  -  -  -  Y   Convert four SP FP from XMM/Mem to four 32-bit signed integers in XMM using truncate.
CVTDQ2PD    -  -  -  -  -  -   Convert two 32-bit signed integers in XMM2/Mem to 2 DP FP in XMM1 using rounding specified by MXCSR.
CVTPD2DQ    Y  -  -  -  -  Y   Convert two DP FP from XMM2/Mem to two 32-bit signed integers in XMM1 using rounding specified by MXCSR.
CVTPD2PI    Y  -  -  -  -  Y   Convert lower two DP FP from XMM/Mem to two 32-bit signed integers in MM using rounding specified by MXCSR.
CVTPD2PS    Y  Y  -  Y  Y  Y   Convert two DP FP to two SP FP.
CVTPI2PD    -  -  -  -  -  -   Convert two 32-bit signed integers from MM2/Mem to two DP FP.
CVTPS2PD    Y  Y  -  -  -  -   Convert two SP FP to two DP FP.
CVTSD2SI    Y  -  -  -  -  Y   Convert one DP FP from XMM/Mem to one 32-bit signed integer using rounding mode specified by MXCSR, and move the result to an integer register.
CVTSD2SS    Y  Y  -  Y  Y  Y   Convert scalar DP FP to scalar SP FP.
CVTSI2SD    -  -  -  -  -  -   Convert one 32-bit signed integer from Integer Reg/Mem to one DP FP.
CVTSS2SD    Y  Y  -  -  -  -   Convert scalar SP FP to scalar DP FP.
CVTTPD2DQ   Y  -  -  -  -  Y   Convert two DP FP from XMM2/Mem to two 32-bit signed integers in XMM1 using truncate.
CVTTPD2PI   Y  -  -  -  -  Y   Convert two DP FP from XMM2/Mem to two 32-bit signed integers in MM1 using truncate.
CVTTSD2SI   Y  -  -  -  -  Y   Convert lowest DP FP from XMM/Mem to one 32-bit signed integer using truncate, and move the result to an integer register.
DIVPD       Y  Y  Y  Y  Y  Y   Divide packed DP FP numbers in XMM1 by XMM2/Mem.
DIVSD       Y  Y  Y  Y  Y  Y   Divide lower DP FP numbers in XMM1 by XMM2/Mem.
MAXPD       Y  Y  -  -  -  -   Return the maximum DP FP numbers between XMM2/Mem and XMM1.
MAXSD       Y  Y  -  -  -  -   Return the maximum DP FP number between the lower DP FP numbers from XMM2/Mem and XMM1.
MINPD       Y  Y  -  -  -  -   Return the minimum DP numbers between XMM2/Mem and XMM1.
MINSD       Y  Y  -  -  -  -   Return the minimum DP FP number between the lowest DP FP numbers from XMM2/Mem and XMM1.
MOVAPD      -  -  -  -  -  -   Move 128 bits representing 2 packed DP data from XMM2/Mem to XMM1 register. Or move 128 bits representing 2 packed DP from XMM1 register to XMM2/Mem.
MOVHPD      -  -  -  -  -  -   Move 64 bits representing one DP operand from Mem to upper field of XMM register. Or move 64 bits representing one DP operand from upper field of XMM register to Mem.
MOVLPD      -  -  -  -  -  -   Move 64 bits representing one DP operand from Mem to lower field of XMM register. Or move 64 bits representing one DP operand from lower field of XMM register to Mem.
MOVMSKPD    -  -  -  -  -  -   Move the sign mask to r32.
MOVSD       -  -  -  -  -  -   Move 64 bits representing one scalar DP operand from XMM2/Mem to XMM1 register. Or move 64 bits representing one scalar DP operand from XMM1 register to XMM2/Mem.
MOVUPD      -  -  -  -  -  -   Move 128 bits representing 2 DP data from XMM2/Mem to XMM1 register. Or move 128 bits representing 2 DP data from XMM1 register to XMM2/Mem.
MULPD       Y  Y  -  Y  Y  Y   Multiply packed DP FP numbers in XMM2/Mem to XMM1.
MULSD       Y  Y  -  Y  Y  Y   Multiply the lowest DP FP number in XMM2/Mem to XMM1.
ORPD        -  -  -  -  -  -   OR 128 bits from XMM2/Mem to XMM1 register.
SHUFPD      -  -  -  -  -  -   Shuffle Double.
SQRTPD      Y  Y  -  -  -  Y   Square Root Packed Double Precision.
SQRTSD      Y  Y  -  -  -  Y   Square Root Scalar Double Precision.
SUBPD       Y  Y  -  Y  Y  Y   Subtract Packed Double Precision.
SUBSD       Y  Y  -  Y  Y  Y   Subtract Scalar Double Precision.
UCOMISD     Y  Y  -  -  -  -   Compare lower DP FP number in XMM1 register with lower DP FP number in XMM2/Mem and set the status flags accordingly.
UNPCKHPD    -  -  -  -  -  -   Interleaves DP FP numbers from the high halves of XMM1 and XMM2/Mem into XMM1 register.
UNPCKLPD    -  -  -  -  -  -   Interleaves DP FP numbers from the low halves of XMM1 and XMM2/Mem into XMM1 register.
XORPD       -  -  -  -  -  -   XOR 128 bits from XMM2/Mem to XMM1 register.

C.5 SSE3 INSTRUCTIONS

Table C-5 lists the SSE3 instructions that have at least one of the following characteristics:
• have floating-point operands
• generate floating-point results
For each instruction, the table summarizes the floating-point exceptions that the instruction can generate.

Table C-5. Exceptions Generated with SSE3 Instructions

Instruction #I #D #Z #O #U #P  Description
ADDSUBPD    Y  Y  -  Y  Y  Y   Add/Sub packed DP FP numbers from XMM2/Mem to XMM1.
ADDSUBPS    Y  Y  -  Y  Y  Y   Add/Sub packed SP FP numbers from XMM2/Mem to XMM1.
FISTTP      Y  -  -  -  -  Y   See Table C-2.
HADDPD      Y  Y  -  Y  Y  Y   Add horizontally packed DP FP numbers XMM2/Mem to XMM1.
HADDPS      Y  Y  -  Y  Y  Y   Add horizontally packed SP FP numbers XMM2/Mem to XMM1.
HSUBPD      Y  Y  -  Y  Y  Y   Sub horizontally packed DP FP numbers XMM2/Mem to XMM1.
HSUBPS      Y  Y  -  Y  Y  Y   Sub horizontally packed SP FP numbers XMM2/Mem to XMM1.
LDDQU       -  -  -  -  -  -   Load unaligned integer 128-bit.
MOVDDUP     -  -  -  -  -  -   Move 64 bits representing one DP data from XMM2/Mem to XMM1 and duplicate.
MOVSHDUP    -  -  -  -  -  -   Move 128 bits representing 4 SP data from XMM2/Mem to XMM1 and duplicate high.
MOVSLDUP    -  -  -  -  -  -   Move 128 bits representing 4 SP data from XMM2/Mem to XMM1 and duplicate low.

C.6 SSSE3 INSTRUCTIONS

SSSE3 instructions operate on integer data elements.
They do not generate floating-point exceptions.

APPENDIX D
GUIDELINES FOR WRITING X87 FPU EXCEPTION HANDLERS

As described in Chapter 8, “Programming with the x87 FPU,” the IA-32 Architecture supports two mechanisms for accessing exception handlers to handle unmasked x87 FPU exceptions: native mode and MS-DOS compatibility mode. The primary purpose of this appendix is to provide detailed information to help software engineers design and write x87 FPU exception-handling facilities to run on PC systems that use the MS-DOS compatibility mode¹ for handling x87 FPU exceptions. Some of the information in this appendix will also be of interest to engineers who are writing native-mode x87 FPU exception handlers.
The information provided is as follows:
• Discussion of the origin of the MS-DOS x87 FPU exception handling mechanism and its relationship to the x87 FPU’s native exception handling mechanism.
• Description of the IA-32 flags and processor pins that control the MS-DOS x87 FPU exception handling mechanism.
• Description of the external hardware typically required to support the MS-DOS exception handling mechanism.
• Description of the x87 FPU’s exception handling mechanism and the typical protocol for x87 FPU exception handlers.
• Code examples that demonstrate various levels of x87 FPU exception handlers.
• Discussion of x87 FPU considerations in multitasking environments.
• Discussion of native mode x87 FPU exception handling.

The information given is oriented toward the most recent generations of IA-32 processors, starting with the Intel486. It is intended to augment the reference information given in Chapter 8, “Programming with the x87 FPU.”

A more extensive version of this appendix is available in the application note AP-578, Software and Hardware Considerations for x87 FPU Exception Handlers for Intel Architecture Processors (Order Number 243291), which is available from Intel.

¹ Microsoft Windows* 95 and Windows 3.1 (and earlier versions) operating systems use almost the same x87 FPU exception handling interface as MS-DOS. The recommendations in this appendix for an MS-DOS compatible exception handler thus apply to all three operating systems.

D.1 MS-DOS COMPATIBILITY SUB-MODE FOR HANDLING X87 FPU EXCEPTIONS

The first generations of IA-32 processors (starting with the Intel 8086 and 8088 processors and going through the Intel 286 and Intel386 processors) did not have an on-chip floating-point unit. Instead, floating-point capability was provided on a separate numeric coprocessor chip.
The first of these numeric coprocessors was the Intel 8087, which was followed by the Intel 287 and Intel 387 numeric coprocessors.

To allow the 8087 to signal floating-point exceptions to its companion 8086 or 8088, the 8087 has an output pin, INT, which it asserts when an unmasked floating-point exception occurs. The designers of the 8087 recommended that the output from this pin be routed through a programmable interrupt controller (PIC) such as the Intel 8259A to the INTR pin of the 8086 or 8088.
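The condition "an unmasked floating-point exception occurs" can be illustrated with a small software model. The sketch below is not taken from this appendix: the function and constant names are hypothetical, and it assumes only the standard x87 layout in which the six exception flags occupy bits 0 through 5 of the status word and the matching mask bits occupy bits 0 through 5 of the control word. It deliberately ignores any further gating of the INT signal inside the coprocessor or in the external PIC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Exception flag bits in the x87 status word. The same bit positions
   hold the corresponding mask bits in the x87 control word. */
enum {
    X87_IE  = 1u << 0, /* #I invalid operation   */
    X87_DE  = 1u << 1, /* #D denormal operand    */
    X87_ZE  = 1u << 2, /* #Z divide-by-zero      */
    X87_OE  = 1u << 3, /* #O numeric overflow    */
    X87_UE  = 1u << 4, /* #U numeric underflow   */
    X87_PE  = 1u << 5, /* #P inexact (precision) */
    X87_ALL = 0x3F     /* all six exception bits */
};

/* Hypothetical helper: return the exceptions that are flagged in the
   status word and NOT masked in the control word. */
static uint16_t x87_unmasked_pending(uint16_t status_word,
                                     uint16_t control_word)
{
    return (uint16_t)(status_word & ~control_word & X87_ALL);
}

/* Hypothetical, simplified model of the INT output: asserted whenever
   at least one unmasked exception is pending. */
static bool x87_int_asserted(uint16_t status_word, uint16_t control_word)
{
    return x87_unmasked_pending(status_word, control_word) != 0;
}

/* Usage example: a divide-by-zero flag (ZE) is set in the status word. */
static void x87_int_example(void)
{
    /* 0x037F is the control word after FINIT/FNINIT on IA-32
       processors: all six exceptions are masked, so no signal. */
    assert(!x87_int_asserted(X87_ZE, 0x037F));

    /* 0x037B clears the divide-by-zero mask bit, so the same flag
       now counts as an unmasked exception. */
    assert(x87_int_asserted(X87_ZE, 0x037B));
}
```

Keeping the mask test separate from the signal test makes it easy to report which exception triggered the signal, which is typically the first thing an exception handler needs to determine.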