Volume 2A Instruction Set Reference A-M (794101), page 41
The comparison predicate operand (third operand) specifies the type of comparison performed. The comparison result is a quadword mask of all 1s (comparison true) or all 0s (comparison false).

The source operand can be an XMM register or a 64-bit memory location. The destination operand is an XMM register. The result is stored in the low quadword of the destination operand; the high quadword remains unchanged. The comparison predicate operand is an 8-bit immediate; bits 0 through 2 define the type of comparison to be made (see Table 3-7). Bits 3 through 7 of the immediate are reserved.

The unordered relationship is true when at least one of the two source operands being compared is a NaN; the ordered relationship is true when neither source operand is a NaN.

A subsequent computational instruction that uses the mask result in the destination operand as an input operand will not generate a fault, because a mask of all 0s corresponds to a floating-point value of +0.0 and a mask of all 1s corresponds to a QNaN.

Some of the comparisons listed in Table 3-7 can be achieved only through software emulation.
For these comparisons the program must swap the operands (copying registers when necessary to protect the data that will now be in the destination operand), and then perform the compare using a different predicate. The predicate to be used for these emulations is listed in Table 3-7 under the heading Emulation.

Compilers and assemblers may implement the following two-operand pseudo-ops in addition to the three-operand CMPSD instruction.
See Table 3-10.

CMPSD—Compare Scalar Double-Precision Floating-Point Values (Vol. 2A, INSTRUCTION SET REFERENCE, A-M)

Table 3-10. Pseudo-Ops and CMPSD

Pseudo-Op                    Implementation
CMPEQSD xmm1, xmm2           CMPSD xmm1, xmm2, 0
CMPLTSD xmm1, xmm2           CMPSD xmm1, xmm2, 1
CMPLESD xmm1, xmm2           CMPSD xmm1, xmm2, 2
CMPUNORDSD xmm1, xmm2        CMPSD xmm1, xmm2, 3
CMPNEQSD xmm1, xmm2          CMPSD xmm1, xmm2, 4
CMPNLTSD xmm1, xmm2          CMPSD xmm1, xmm2, 5
CMPNLESD xmm1, xmm2          CMPSD xmm1, xmm2, 6
CMPORDSD xmm1, xmm2          CMPSD xmm1, xmm2, 7

The greater-than relations not implemented in the processor require more than one instruction to emulate in software and therefore should not be implemented as pseudo-ops.
(For these, the programmer should reverse the operands of the corresponding less-than relations and use move instructions to ensure that the mask is moved to the correct destination register and that the source operand is left intact.)

In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).

Operation

CASE (COMPARISON PREDICATE) OF
    0: OP ← EQ;
    1: OP ← LT;
    2: OP ← LE;
    3: OP ← UNORD;
    4: OP ← NEQ;
    5: OP ← NLT;
    6: OP ← NLE;
    7: OP ← ORD;
    DEFAULT: Reserved;
CMP0 ← DEST[63:0] OP SRC[63:0];
IF CMP0 = TRUE
    THEN DEST[63:0] ← FFFFFFFFFFFFFFFFH;
    ELSE DEST[63:0] ← 0000000000000000H; FI;
(* DEST[127:64] unchanged *)
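As an illustration only, the Operation pseudocode above can be modeled in portable C. This is a sketch, not Intel code: `cmpsd_model` and `cmpgt_model` are hypothetical names, only the low quadword is modeled, exceptions are not modeled, and the C implementation is assumed to follow IEEE 754 comparison semantics (no fast-math options). `cmpgt_model` shows the operand reversal suggested above for the greater-than relation.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Hypothetical software model of the CMPSD low-quadword result.
   dest and src stand for DEST[63:0] and SRC[63:0]; imm8 is the predicate.
   Returns the mask that would be written to DEST[63:0]. */
static uint64_t cmpsd_model(double dest, double src, unsigned imm8)
{
    int unordered = isnan(dest) || isnan(src);
    int result;

    assert(imm8 <= 7); /* bits 3 through 7 of the immediate are reserved */

    switch (imm8) {
    case 0:  result = (dest == src);   break; /* EQ    */
    case 1:  result = (dest <  src);   break; /* LT    */
    case 2:  result = (dest <= src);   break; /* LE    */
    case 3:  result = unordered;       break; /* UNORD */
    case 4:  result = !(dest == src);  break; /* NEQ   */
    case 5:  result = !(dest <  src);  break; /* NLT   */
    case 6:  result = !(dest <= src);  break; /* NLE   */
    default: result = !unordered;      break; /* ORD   */
    }
    return result ? UINT64_MAX : 0; /* all 1s if true, all 0s if false */
}

/* Greater-than emulated by reversing the operands of less-than:
   a > b is evaluated as b < a. */
static uint64_t cmpgt_model(double a, double b)
{
    return cmpsd_model(b, a, 1);
}
```

In this model, `cmpgt_model(2.0, 1.0)` yields the all-1s mask, and any call to `cmpgt_model` with a NaN argument yields all 0s, because the swapped less-than comparison is false for unordered operands.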
Intel C/C++ Compiler Intrinsic Equivalents

CMPSD for equality                     __m128d _mm_cmpeq_sd(__m128d a, __m128d b)
CMPSD for less-than                    __m128d _mm_cmplt_sd(__m128d a, __m128d b)
CMPSD for less-than-or-equal           __m128d _mm_cmple_sd(__m128d a, __m128d b)
CMPSD for greater-than                 __m128d _mm_cmpgt_sd(__m128d a, __m128d b)
CMPSD for greater-than-or-equal        __m128d _mm_cmpge_sd(__m128d a, __m128d b)
CMPSD for inequality                   __m128d _mm_cmpneq_sd(__m128d a, __m128d b)
CMPSD for not-less-than                __m128d _mm_cmpnlt_sd(__m128d a, __m128d b)
CMPSD for not-greater-than             __m128d _mm_cmpngt_sd(__m128d a, __m128d b)
CMPSD for not-greater-than-or-equal    __m128d _mm_cmpnge_sd(__m128d a, __m128d b)
CMPSD for ordered                      __m128d _mm_cmpord_sd(__m128d a, __m128d b)
CMPSD for unordered                    __m128d _mm_cmpunord_sd(__m128d a, __m128d b)
CMPSD for not-less-than-or-equal       __m128d _mm_cmpnle_sd(__m128d a, __m128d b)

SIMD Floating-Point Exceptions

Invalid if SNaN operand, Invalid if QNaN and predicate as listed in above table, Denormal.

Protected Mode Exceptions

#GP(0)             For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments.
#SS(0)             For an illegal address in the SS segment.
#PF(fault-code)    For a page fault.
#NM                If CR0.TS[bit 3] = 1.
#XM                If an unmasked SIMD floating-point exception and CR4.OSXMMEXCPT[bit 10] = 1.
#UD                If an unmasked SIMD floating-point exception and CR4.OSXMMEXCPT[bit 10] = 0.
                   If CR0.EM[bit 2] = 1.
                   If CR4.OSFXSR[bit 9] = 0.
                   If CPUID.01H:EDX.SSE2[bit 26] = 0.
                   If the LOCK prefix is used.
#AC(0)             If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.

Real-Address Mode Exceptions

#GP                If any part of the operand lies outside the effective address space from 0 to FFFFH.
#NM                If CR0.TS[bit 3] = 1.
#XM                If an unmasked SIMD floating-point exception and CR4.OSXMMEXCPT[bit 10] = 1.
#UD                If an unmasked SIMD floating-point exception and CR4.OSXMMEXCPT[bit 10] = 0.
                   If CR0.EM[bit 2] = 1.
                   If CR4.OSFXSR[bit 9] = 0.
                   If CPUID.01H:EDX.SSE2[bit 26] = 0.
                   If the LOCK prefix is used.

Virtual-8086 Mode Exceptions

Same exceptions as in real-address mode.
#PF(fault-code)    For a page fault.
#AC(0)             If alignment checking is enabled and an unaligned memory reference is made.

Compatibility Mode Exceptions

Same exceptions as in protected mode.

64-Bit Mode Exceptions

#SS(0)             If a memory address referencing the SS segment is in a non-canonical form.
#GP(0)             If the memory address is in a non-canonical form.
#PF(fault-code)    For a page fault.
#NM                If CR0.TS[bit 3] = 1.
#XM                If an unmasked SIMD floating-point exception and CR4.OSXMMEXCPT[bit 10] = 1.
#UD                If an unmasked SIMD floating-point exception and CR4.OSXMMEXCPT[bit 10] = 0.
                   If CR0.EM[bit 2] = 1.
                   If CR4.OSFXSR[bit 9] = 0.
                   If CPUID.01H:EDX.SSE2[bit 26] = 0.
                   If the LOCK prefix is used.
#AC(0)             If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.

CMPSS—Compare Scalar Single-Precision Floating-Point Values

Opcode            Instruction                    Op/En   64-Bit Mode   Compat/Leg Mode   Description
F3 0F C2 /r ib    CMPSS xmm1, xmm2/m32, imm8     A       Valid         Valid             Compare low single-precision floating-point value in xmm2/m32 and xmm1 using imm8 as comparison predicate.

Instruction Operand Encoding

Op/En   Operand 1           Operand 2        Operand 3   Operand 4
A       ModRM:reg (r, w)    ModRM:r/m (r)    imm8        NA

Description

Compares the low single-precision floating-point values in the source operand (second operand) and the destination operand (first operand) and returns the results of the comparison to the destination operand.
The comparison predicate operand (third operand) specifies the type of comparison performed. The comparison result is a doubleword mask of all 1s (comparison true) or all 0s (comparison false).

The source operand can be an XMM register or a 32-bit memory location. The destination operand is an XMM register. The result is stored in the low doubleword of the destination operand; the 3 high-order doublewords remain unchanged. The comparison predicate operand is an 8-bit immediate; bits 0 through 2 define the type of comparison to be made (see Table 3-7). Bits 3 through 7 of the immediate are reserved.

The unordered relationship is true when at least one of the two source operands being compared is a NaN; the ordered relationship is true when neither source operand is a NaN.

A subsequent computational instruction that uses the mask result in the destination operand as an input operand will not generate a fault, since a mask of all 0s corresponds to a floating-point value of +0.0 and a mask of all 1s corresponds to a QNaN.

Some of the comparisons listed in Table 3-7 can be achieved only through software emulation.
For these comparisons the program must swap the operands (copying registers when necessary to protect the data that will now be in the destination operand), and then perform the compare using a different predicate. The predicate to be used for these emulations is listed in Table 3-7 under the heading Emulation.

Compilers and assemblers may implement the following two-operand pseudo-ops in addition to the three-operand CMPSS instruction. See Table 3-11.
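The doubleword mask values described above can be checked with a short portable C sketch. It assumes that `float` is an IEEE 754 binary32 value; the helper names (`bits_to_float`, `float_to_bits`, `select_by_mask`) are hypothetical and are not part of any Intel API. `select_by_mask` illustrates one way such a mask might be consumed afterwards, as a bitwise selector between two values.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float (assumes IEEE 754 binary32). */
static float bits_to_float(uint32_t bits)
{
    float f;
    assert(sizeof f == sizeof bits); /* assumption: float is 32 bits wide */
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Reinterpret a float as its 32-bit pattern. */
static uint32_t float_to_bits(float f)
{
    uint32_t bits;
    assert(sizeof f == sizeof bits);
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Hypothetical helper: returns a when mask is all 1s, b when mask is all 0s.
   The mask is expected to be a comparison result (FFFFFFFFH or 00000000H). */
static float select_by_mask(uint32_t mask, float a, float b)
{
    uint32_t bits = (float_to_bits(a) & mask) | (float_to_bits(b) & ~mask);
    return bits_to_float(bits);
}
```

Under the stated assumption, `bits_to_float(0x00000000u)` compares equal to 0.0f, and `bits_to_float(0xFFFFFFFFu)` is a NaN with its quiet bit set, so it compares unequal to itself.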
Table 3-11. Pseudo-Ops and CMPSS

Pseudo-Op                    CMPSS Implementation
CMPEQSS xmm1, xmm2           CMPSS xmm1, xmm2, 0
CMPLTSS xmm1, xmm2           CMPSS xmm1, xmm2, 1
CMPLESS xmm1, xmm2           CMPSS xmm1, xmm2, 2
CMPUNORDSS xmm1, xmm2        CMPSS xmm1, xmm2, 3
CMPNEQSS xmm1, xmm2          CMPSS xmm1, xmm2, 4
CMPNLTSS xmm1, xmm2          CMPSS xmm1, xmm2, 5
CMPNLESS xmm1, xmm2          CMPSS xmm1, xmm2, 6
CMPORDSS xmm1, xmm2          CMPSS xmm1, xmm2, 7

The greater-than relations not implemented in the processor require more than one instruction to emulate in software and therefore should not be implemented as pseudo-ops. (For these, the programmer should reverse the operands of the corresponding less-than relations and use move instructions to ensure that the mask is moved to the correct destination register and that the source operand is left intact.)

In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).

Operation

CASE (COMPARISON PREDICATE) OF
    0: OP ← EQ;
    1: OP ← LT;
    2: OP ← LE;
    3: OP ← UNORD;
    4: OP ← NEQ;
    5: OP ← NLT;
    6: OP ← NLE;
    7: OP ← ORD;
    DEFAULT: Reserved;
CMP0 ← DEST[31:0] OP SRC[31:0];
IF CMP0 = TRUE
    THEN DEST[31:0] ← FFFFFFFFH;
    ELSE DEST[31:0] ← 00000000H; FI;
(* DEST[127:32] unchanged *)

Intel C/C++ Compiler Intrinsic Equivalents

CMPSS for equality                     __m128 _mm_cmpeq_ss(__m128 a, __m128 b)
CMPSS for less-than                    __m128 _mm_cmplt_ss(__m128 a, __m128 b)
CMPSS for less-than-or-equal           __m128 _mm_cmple_ss(__m128 a, __m128 b)
CMPSS for greater-than                 __m128 _mm_cmpgt_ss(__m128 a, __m128 b)
CMPSS for greater-than-or-equal        __m128 _mm_cmpge_ss(__m128 a, __m128 b)
CMPSS for inequality                   __m128 _mm_cmpneq_ss(__m128 a, __m128 b)
CMPSS for not-less-than                __m128 _mm_cmpnlt_ss(__m128 a, __m128 b)
CMPSS for not-greater-than             __m128 _mm_cmpngt_ss(__m128 a, __m128 b)
CMPSS for not-greater-than-or-equal    __m128 _mm_cmpnge_ss(__m128 a, __m128 b)
CMPSS for ordered                      __m128 _mm_cmpord_ss(__m128 a, __m128 b)
CMPSS for unordered                    __m128 _mm_cmpunord_ss(__m128 a, __m128 b)
CMPSS for not-less-than-or-equal       __m128 _mm_cmpnle_ss(__m128 a, __m128 b)

SIMD Floating-Point Exceptions

Invalid if SNaN operand, Invalid if QNaN and predicate as listed in above table, Denormal.

Protected Mode Exceptions

#GP(0)             For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments.
#SS(0)             For an illegal address in the SS segment.
#PF(fault-code)    For a page fault.
#NM                If CR0.TS[bit 3] = 1.
#XM                If an unmasked SIMD floating-point exception and CR4.OSXMMEXCPT[bit 10] = 1.
#UD                If an unmasked SIMD floating-point exception and CR4.OSXMMEXCPT[bit 10] = 0.
                   If CR0.EM[bit 2] = 1.
                   If CR4.OSFXSR[bit 9] = 0.
                   If CPUID.01H:EDX.SSE[bit 25] = 0.
                   If the LOCK prefix is used.
#AC(0)             If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.

Real-Address Mode Exceptions

#GP                If any part of the operand lies outside the effective address space from 0 to FFFFH.
#NM                If CR0.TS[bit 3] = 1.
#XM                If an unmasked SIMD floating-point exception and CR4.OSXMMEXCPT[bit 10] = 1.