Volume 2A Instruction Set Reference A-M (794101), page 39
CMPPD—Compare Packed Double-Precision Floating-Point Values

These comparisons can be made either by using the inverse relationship (that is, use the "not-less-than-or-equal" to make a "greater-than" comparison) or by using software emulation. When using software emulation, the program must swap the operands (copying registers when necessary to protect the data that will now be in the destination), and then perform the compare using a different predicate. The predicate to be used for these emulations is listed in Table 3-7 under the heading Emulation.

Compilers and assemblers may implement the following two-operand pseudo-ops in addition to the three-operand CMPPD instruction. See Table 3-8.

Table 3-8. Pseudo-Op and CMPPD Implementation

Pseudo-Op                    CMPPD Implementation
CMPEQPD xmm1, xmm2           CMPPD xmm1, xmm2, 0
CMPLTPD xmm1, xmm2           CMPPD xmm1, xmm2, 1
CMPLEPD xmm1, xmm2           CMPPD xmm1, xmm2, 2
CMPUNORDPD xmm1, xmm2        CMPPD xmm1, xmm2, 3
CMPNEQPD xmm1, xmm2          CMPPD xmm1, xmm2, 4
CMPNLTPD xmm1, xmm2          CMPPD xmm1, xmm2, 5
CMPNLEPD xmm1, xmm2          CMPPD xmm1, xmm2, 6
CMPORDPD xmm1, xmm2          CMPPD xmm1, xmm2, 7

The greater-than relations that the processor does not implement require more than one instruction to emulate in software and therefore should not be implemented as pseudo-ops.
(For these, the programmer should reverse the operands of the corresponding less-than relations and use move instructions to ensure that the mask is moved to the correct destination register and that the source operand is left intact.)

In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).

Operation

CASE (COMPARISON PREDICATE) OF
    0: OP ← EQ;
    1: OP ← LT;
    2: OP ← LE;
    3: OP ← UNORD;
    4: OP ← NEQ;
    5: OP ← NLT;
    6: OP ← NLE;
    7: OP ← ORD;
    DEFAULT: Reserved;
CMP0 ← DEST[63:0] OP SRC[63:0];
CMP1 ← DEST[127:64] OP SRC[127:64];
IF CMP0 = TRUE
    THEN DEST[63:0] ← FFFFFFFFFFFFFFFFH;
    ELSE DEST[63:0] ← 0000000000000000H; FI;
IF CMP1 = TRUE
    THEN DEST[127:64] ← FFFFFFFFFFFFFFFFH;
    ELSE DEST[127:64] ← 0000000000000000H; FI;

Intel C/C++ Compiler Intrinsic Equivalents

CMPPD for equality                     __m128d _mm_cmpeq_pd(__m128d a, __m128d b)
CMPPD for less-than                    __m128d _mm_cmplt_pd(__m128d a, __m128d b)
CMPPD for less-than-or-equal           __m128d _mm_cmple_pd(__m128d a, __m128d b)
CMPPD for greater-than                 __m128d _mm_cmpgt_pd(__m128d a, __m128d b)
CMPPD for greater-than-or-equal        __m128d _mm_cmpge_pd(__m128d a, __m128d b)
CMPPD for inequality                   __m128d _mm_cmpneq_pd(__m128d a, __m128d b)
CMPPD for not-less-than                __m128d _mm_cmpnlt_pd(__m128d a, __m128d b)
CMPPD for not-greater-than             __m128d _mm_cmpngt_pd(__m128d a, __m128d b)
CMPPD for not-greater-than-or-equal    __m128d _mm_cmpnge_pd(__m128d a, __m128d b)
CMPPD for ordered                      __m128d _mm_cmpord_pd(__m128d a, __m128d b)
CMPPD for unordered                    __m128d _mm_cmpunord_pd(__m128d a, __m128d b)
CMPPD for not-less-than-or-equal       __m128d _mm_cmpnle_pd(__m128d a, __m128d b)

SIMD Floating-Point Exceptions

Invalid if SNaN operand and invalid if QNaN and predicate as listed in above table, Denormal.

Protected Mode Exceptions

#GP(0)             For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments.
                   If a memory operand is not aligned on a 16-byte boundary, regardless of segment.
#SS(0)             For an illegal address in the SS segment.
#PF(fault-code)    For a page fault.
#NM                If CR0.TS[bit 3] = 1.
#XM                If an unmasked SIMD floating-point exception and CR4.OSXMMEXCPT[bit 10] = 1.
#UD                If an unmasked SIMD floating-point exception and CR4.OSXMMEXCPT[bit 10] = 0.
                   If CR0.EM[bit 2] = 1.
                   If CR4.OSFXSR[bit 9] = 0.
                   If CPUID.01H:EDX.SSE2[bit 26] = 0.
                   If the LOCK prefix is used.

Real-Address Mode Exceptions

#GP                If a memory operand is not aligned on a 16-byte boundary, regardless of segment.
                   If any part of the operand lies outside the effective address space from 0 to FFFFH.
#NM                If CR0.TS[bit 3] = 1.
#XM                If an unmasked SIMD floating-point exception and CR4.OSXMMEXCPT[bit 10] = 1.
#UD                If an unmasked SIMD floating-point exception and CR4.OSXMMEXCPT[bit 10] = 0.
                   If CR0.EM[bit 2] = 1.
                   If CR4.OSFXSR[bit 9] = 0.
                   If CPUID.01H:EDX.SSE2[bit 26] = 0.
                   If the LOCK prefix is used.

Virtual-8086 Mode Exceptions

Same exceptions as in real address mode.
#PF(fault-code)    For a page fault.

Compatibility Mode Exceptions

Same exceptions as in protected mode.

64-Bit Mode Exceptions

#SS(0)             If a memory address referencing the SS segment is in a non-canonical form.
#GP(0)             If the memory address is in a non-canonical form.
                   If memory operand is not aligned on a 16-byte boundary, regardless of segment.
#PF(fault-code)    For a page fault.
#NM                If CR0.TS[bit 3] = 1.
#XM                If an unmasked SIMD floating-point exception and CR4.OSXMMEXCPT[bit 10] = 1.
#UD                If an unmasked SIMD floating-point exception and CR4.OSXMMEXCPT[bit 10] = 0.
                   If CR0.EM[bit 2] = 1.
                   If CR4.OSFXSR[bit 9] = 0.
                   If CPUID.01H:EDX.SSE2[bit 26] = 0.
                   If the LOCK prefix is used.
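The CMPPD predicate selection and the two ways of obtaining a greater-than comparison can be illustrated with a scalar model of a single 64-bit lane. This is only a sketch: the helper names cmppd_lane, cmpgt_lane_by_swap, and cmpgt_lane_by_inverse are hypothetical and not part of any Intel interface, the predicate semantics are assumed to follow ordinary C floating-point comparisons (which evaluate to false when either operand is a NaN), and exception signaling is not modeled.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Scalar model of one 64-bit CMPPD lane: returns the mask that the
   Operation section writes to the destination for that lane.
   Exception signaling (Invalid, Denormal) is not modeled. */
static uint64_t cmppd_lane(double dest, double src, int predicate)
{
    int unordered = isnan(dest) || isnan(src);
    int result;

    assert(predicate >= 0 && predicate <= 7); /* other values are reserved */
    switch (predicate) {
    case 0:  result = (dest == src);  break; /* EQ      */
    case 1:  result = (dest <  src);  break; /* LT      */
    case 2:  result = (dest <= src);  break; /* LE      */
    case 3:  result = unordered;      break; /* UNORD   */
    case 4:  result = !(dest == src); break; /* NEQ     */
    case 5:  result = !(dest <  src); break; /* NLT     */
    case 6:  result = !(dest <= src); break; /* NLE     */
    default: result = !unordered;     break; /* ORD (7) */
    }
    return result ? UINT64_C(0xFFFFFFFFFFFFFFFF) : UINT64_C(0);
}

/* Greater-than by software emulation: swap the operands and use LT. */
static uint64_t cmpgt_lane_by_swap(double dest, double src)
{
    return cmppd_lane(src, dest, 1);
}

/* Greater-than by the inverse relationship: NLE on the original operands.
   This differs from the swapped form when either operand is a NaN. */
static uint64_t cmpgt_lane_by_inverse(double dest, double src)
{
    return cmppd_lane(dest, src, 6);
}
```

In this model the two approaches agree for ordered operands, but for a NaN operand the swapped LT form yields an all-0s mask while the NLE form yields all 1s, so the choice between them depends on how unordered inputs should be treated.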
CMPPS—Compare Packed Single-Precision Floating-Point Values

Opcode         Instruction                    Op/En    64-Bit Mode    Compat/Leg Mode    Description
0F C2 /r ib    CMPPS xmm1, xmm2/m128, imm8    A        Valid          Valid              Compare packed single-precision floating-point values in xmm2/mem and xmm1 using imm8 as comparison predicate.

Instruction Operand Encoding

Op/En    Operand 1           Operand 2        Operand 3    Operand 4
A        ModRM:reg (r, w)    ModRM:r/m (r)    imm8         NA

Description

Performs a SIMD compare of the four packed single-precision floating-point values in the source operand (second operand) and the destination operand (first operand) and returns the results of the comparison to the destination operand.
The comparison predicate operand (third operand) specifies the type of comparison performed on each of the pairs of packed values. The result of each comparison is a doubleword mask of all 1s (comparison true) or all 0s (comparison false).

The source operand can be an XMM register or a 128-bit memory location. The destination operand is an XMM register.
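The doubleword mask results can be observed from C with the _mm_cmplt_ps intrinsic, which corresponds to CMPPS with the less-than predicate. The following is a minimal sketch, assuming an x86 compiler that provides <xmmintrin.h>; the helper name cmplt_masks is hypothetical and introduced only for illustration.

```c
#include <stdint.h>
#include <string.h>
#include <xmmintrin.h> /* SSE intrinsics: __m128, _mm_cmplt_ps, ... */

/* Hypothetical helper: compares a[i] < b[i] for i = 0..3 and writes the
   four doubleword masks (all 1s or all 0s) to out[]. */
static void cmplt_masks(const float a[4], const float b[4], uint32_t out[4])
{
    __m128 va = _mm_loadu_ps(a);        /* unaligned loads, so the arrays */
    __m128 vb = _mm_loadu_ps(b);        /* need no 16-byte alignment      */
    __m128 mask = _mm_cmplt_ps(va, vb); /* CMPPS with predicate 1 (LT)    */
    float raw[4];

    _mm_storeu_ps(raw, mask);           /* store the mask bits unchanged  */
    memcpy(out, raw, sizeof raw);       /* reinterpret them as integers   */
}
```

For the inputs {1, 5, 3, 0} and {2, 4, 3, 1}, the first and last lanes compare true and receive FFFFFFFFH, while the middle two lanes receive 00000000H. When only a per-lane true/false summary is needed, _mm_movemask_ps can condense the four masks into a 4-bit integer.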
The comparison predicate operand is an 8-bit immediate, bits 0 through 2 of which define the type of comparison to be made (see Table 3-7). Bits 3 through 7 of the immediate are reserved.

The unordered relationship is true when at least one of the two source operands being compared is a NaN; the ordered relationship is true when neither source operand is a NaN.

A subsequent computational instruction that uses the mask result in the destination operand as an input operand will not generate a fault, because a mask of all 0s corresponds to a floating-point value of +0.0 and a mask of all 1s corresponds to a QNaN.

Some of the comparisons listed in Table 3-7 (such as the greater-than, greater-than-or-equal, not-greater-than, and not-greater-than-or-equal relations) can be made only through software emulation.
For these comparisons the program must swap the operands (copying registers when necessary to protect the data that will now be in the destination), and then perform the compare using a different predicate. The predicate to be used for these emulations is listed in Table 3-7 under the heading Emulation.

Compilers and assemblers may implement the following two-operand pseudo-ops in addition to the three-operand CMPPS instruction. See Table 3-9.

In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).
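The operand swap described above for the greater-than relation can be sketched with intrinsics, where the compiler takes care of any register copy needed to leave the inputs intact. This is an illustrative sketch under stated assumptions, not a prescribed implementation: the helper names cmpgt_ps_by_swap and select_greater are hypothetical, and the bitwise select is just one possible use of the resulting mask.

```c
#include <xmmintrin.h> /* SSE intrinsics */

/* Greater-than for packed singles by software emulation:
   swap the operands and use the LT predicate (a > b  <=>  b < a). */
static __m128 cmpgt_ps_by_swap(__m128 a, __m128 b)
{
    return _mm_cmplt_ps(b, a);
}

/* Hypothetical helper: per lane, out[i] = a[i] if a[i] > b[i], else b[i].
   The all-1s / all-0s mask is consumed by bitwise AND, AND-NOT, and OR. */
static void select_greater(const float a[4], const float b[4], float out[4])
{
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    __m128 mask = cmpgt_ps_by_swap(va, vb);
    __m128 picked = _mm_or_ps(_mm_and_ps(mask, va),     /* mask & a    */
                              _mm_andnot_ps(mask, vb)); /* (~mask) & b */

    _mm_storeu_ps(out, picked);
}
```

Because the swapped LT comparison is false whenever either operand is a NaN, a lane with an unordered pair falls through to the value from b in this sketch.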
Table 3-9. Pseudo-Ops and CMPPS

Pseudo-Op                    Implementation
CMPEQPS xmm1, xmm2           CMPPS xmm1, xmm2, 0
CMPLTPS xmm1, xmm2           CMPPS xmm1, xmm2, 1
CMPLEPS xmm1, xmm2           CMPPS xmm1, xmm2, 2
CMPUNORDPS xmm1, xmm2        CMPPS xmm1, xmm2, 3
CMPNEQPS xmm1, xmm2          CMPPS xmm1, xmm2, 4
CMPNLTPS xmm1, xmm2          CMPPS xmm1, xmm2, 5
CMPNLEPS xmm1, xmm2          CMPPS xmm1, xmm2, 6
CMPORDPS xmm1, xmm2          CMPPS xmm1, xmm2, 7

The greater-than relations not implemented by the processor require more than one instruction to emulate in software and therefore should not be implemented as pseudo-ops. (For these, the programmer should reverse the operands of the corresponding less-than relations and use move instructions to ensure that the mask is moved to the correct destination register and that the source operand is left intact.)

Operation

CASE (COMPARISON PREDICATE) OF
    0: OP ← EQ;
    1: OP ← LT;
    2: OP ← LE;
    3: OP ← UNORD;
    4: OP ← NEQ;
    5: OP ← NLT;
    6: OP ← NLE;
    7: OP ← ORD;
ESAC;
CMP0 ← DEST[31:0] OP SRC[31:0];
CMP1 ← DEST[63:32] OP SRC[63:32];
CMP2 ← DEST[95:64] OP SRC[95:64];
CMP3 ← DEST[127:96] OP SRC[127:96];
IF CMP0 = TRUE
    THEN DEST[31:0] ← FFFFFFFFH;
    ELSE DEST[31:0] ← 00000000H; FI;
IF CMP1 = TRUE
    THEN DEST[63:32] ← FFFFFFFFH;
    ELSE DEST[63:32] ← 00000000H; FI;
IF CMP2 = TRUE
    THEN DEST[95:64] ← FFFFFFFFH;
    ELSE DEST[95:64] ← 00000000H; FI;
IF CMP3 = TRUE
    THEN DEST[127:96] ← FFFFFFFFH;
    ELSE DEST[127:96] ← 00000000H; FI;

Intel C/C++ Compiler Intrinsic Equivalents

CMPPS for equality                     __m128 _mm_cmpeq_ps(__m128 a, __m128 b)
CMPPS for less-than                    __m128 _mm_cmplt_ps(__m128 a, __m128 b)
CMPPS for less-than-or-equal           __m128 _mm_cmple_ps(__m128 a, __m128 b)
CMPPS for greater-than                 __m128 _mm_cmpgt_ps(__m128 a, __m128 b)
CMPPS for greater-than-or-equal        __m128 _mm_cmpge_ps(__m128 a, __m128 b)
CMPPS for inequality                   __m128 _mm_cmpneq_ps(__m128 a, __m128 b)
CMPPS for not-less-than                __m128 _mm_cmpnlt_ps(__m128 a, __m128 b)
CMPPS for not-greater-than             __m128 _mm_cmpngt_ps(__m128 a, __m128 b)
CMPPS for not-greater-than-or-equal    __m128 _mm_cmpnge_ps(__m128 a, __m128 b)
CMPPS for ordered                      __m128 _mm_cmpord_ps(__m128 a, __m128 b)
CMPPS for unordered                    __m128 _mm_cmpunord_ps(__m128 a, __m128 b)
CMPPS for not-less-than-or-equal       __m128 _mm_cmpnle_ps(__m128 a, __m128 b)

SIMD Floating-Point Exceptions

Invalid if SNaN operand and invalid if QNaN and predicate as listed in above table, Denormal.

Protected Mode Exceptions

#GP(0)             For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments.
                   If a memory operand is not aligned on a 16-byte boundary, regardless of segment.
#SS(0)             For an illegal address in the SS segment.
#PF(fault-code)    For a page fault.
#NM                If CR0.TS[bit 3] = 1.
#XM                If an unmasked SIMD floating-point exception and CR4.OSXMMEXCPT[bit 10] = 1.
#UD                If an unmasked SIMD floating-point exception and CR4.OSXMMEXCPT[bit 10] = 0.
                   If CR0.EM[bit 2] = 1.
                   If CR4.OSFXSR[bit 9] = 0.
                   If CPUID.01H:EDX.SSE[bit 25] = 0.
                   If the LOCK prefix is used.

Real-Address Mode Exceptions

#GP                If a memory operand is not aligned on a 16-byte boundary, regardless of segment.
                   If any part of the operand lies outside the effective address space from 0 to FFFFH.
#NM                If CR0.TS[bit 3] = 1.
#XM                If an unmasked SIMD floating-point exception and CR4.OSXMMEXCPT[bit 10] = 1.
#UD                If an unmasked SIMD floating-point exception and CR4.OSXMMEXCPT[bit 10] = 0.
                   If CR0.EM[bit 2] = 1.
                   If CR4.OSFXSR[bit 9] = 0.
                   If CPUID.01H:EDX.SSE[bit 25] = 0.
                   If the LOCK prefix is used.

Virtual-8086 Mode Exceptions

Same exceptions as in real address mode.
#PF(fault-code)    For a page fault.

Compatibility Mode Exceptions

Same exceptions as in protected mode.

64-Bit Mode Exceptions

#SS(0)             If a memory address referencing the SS segment is in a non-canonical form.
#GP(0)             If the memory address is in a non-canonical form.
                   If memory operand is not aligned on a 16-byte boundary, regardless of segment.
#PF(fault-code)    For a page fault.
#NM                If CR0.TS[bit 3] = 1.
#XM                If an unmasked SIMD floating-point exception and CR4.OSXMMEXCPT[bit 10] = 1.