Volume 2B Instruction Set Reference N-Z (794102), page 11
INSTRUCTION SET REFERENCE, N-Z

PCMPGTB/PCMPGTW/PCMPGTD—Compare Packed Signed Integers for Greater Than

PCMPGTD instruction with 128-bit operands:
    IF DEST[31:0] > SRC[31:0]
        THEN DEST[31:0] ← FFFFFFFFH;
        ELSE DEST[31:0] ← 0; FI;
    (* Continue comparison of 2nd and 3rd doublewords in DEST and SRC *)
    IF DEST[127:96] > SRC[127:96]
        THEN DEST[127:96] ← FFFFFFFFH;
        ELSE DEST[127:96] ← 0; FI;

Intel C/C++ Compiler Intrinsic Equivalents

PCMPGTB    __m64 _mm_cmpgt_pi8 (__m64 m1, __m64 m2)
PCMPGTW    __m64 _mm_cmpgt_pi16 (__m64 m1, __m64 m2)
PCMPGTD    __m64 _mm_cmpgt_pi32 (__m64 m1, __m64 m2)
PCMPGTB    __m128i _mm_cmpgt_epi8 (__m128i a, __m128i b)
PCMPGTW    __m128i _mm_cmpgt_epi16 (__m128i a, __m128i b)
PCMPGTD    __m128i _mm_cmpgt_epi32 (__m128i a, __m128i b)

Flags Affected

None.

Numeric Exceptions

None.

Protected Mode Exceptions

#GP(0)            If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.
                  (128-bit operations only) If a memory operand is not aligned on a 16-byte boundary, regardless of segment.
#SS(0)            If a memory operand effective address is outside the SS segment limit.
#UD               If CR0.EM[bit 2] = 1.
                  128-bit operations will generate #UD only if CR4.OSFXSR[bit 9] = 0. Execution of 128-bit instructions on a non-SSE2 capable processor (one that is MMX technology capable) will result in the instruction operating on the mm registers, not #UD.
                  If the LOCK prefix is used.
#NM               If CR0.TS[bit 3] = 1.
#MF               (64-bit operations only) If there is a pending x87 FPU exception.
#PF(fault-code)   If a page fault occurs.
#AC(0)            (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.

Real-Address Mode Exceptions

#GP(0)            (128-bit operations only) If a memory operand is not aligned on a 16-byte boundary, regardless of segment.
                  If any part of the operand lies outside of the effective address space from 0 to FFFFH.
#UD               If CR0.EM[bit 2] = 1.
                  128-bit operations will generate #UD only if CR4.OSFXSR[bit 9] = 0. Execution of 128-bit instructions on a non-SSE2 capable processor (one that is MMX technology capable) will result in the instruction operating on the mm registers, not #UD.
                  If the LOCK prefix is used.
#NM               If CR0.TS[bit 3] = 1.
#MF               (64-bit operations only) If there is a pending x87 FPU exception.

Virtual-8086 Mode Exceptions

Same exceptions as in real-address mode.
#PF(fault-code)   For a page fault.
#AC(0)            (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made.

Compatibility Mode Exceptions

Same as for protected mode exceptions.

64-Bit Mode Exceptions

#SS(0)            If a memory address referencing the SS segment is in a non-canonical form.
#GP(0)            If the memory address is in a non-canonical form.
                  (128-bit operations only) If memory operand is not aligned on a 16-byte boundary, regardless of segment.
#UD               If CR0.EM[bit 2] = 1.
                  (128-bit operations only) If CR4.OSFXSR[bit 9] = 0.
                  (128-bit operations only) If CPUID.01H:EDX.SSE2[bit 26] = 0.
                  If the LOCK prefix is used.
#NM               If CR0.TS[bit 3] = 1.
#MF               (64-bit operations only) If there is a pending x87 FPU exception.
#PF(fault-code)   If a page fault occurs.
#AC(0)            (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.

PEXTRW—Extract Word

Opcode                 Instruction                64-Bit Mode   Compat/Leg Mode   Description
0F C5 /r ib            PEXTRW r32, mm, imm8       Valid         Valid             Extract the word specified by imm8 from mm and move it to r32, bits 15-0. Zero-extend the result.
REX.W + 0F C5 /r ib    PEXTRW r64, mm, imm8       Valid         N.E.              Extract the word specified by imm8 from mm and move it to r64, bits 15-0. Zero-extend the result.
66 0F C5 /r ib         PEXTRW r32, xmm, imm8      Valid         Valid             Extract the word specified by imm8 from xmm and move it to r32, bits 15-0. Zero-extend the result.
66 REX.W 0F C5 /r ib   PEXTRW r64, xmm, imm8      Valid         N.E.              Extract the word specified by imm8 from xmm and move it to r64, bits 15-0. Zero-extend the result.

Description

Copies the word in the source operand (second operand) specified by the count operand (third operand) to the destination operand (first operand).
The source operand can be an MMX technology register or an XMM register. The destination operand is the low word of a general-purpose register. The count operand is an 8-bit immediate. When specifying a word location in an MMX technology register, the 2 least-significant bits of the count operand specify the location; for an XMM register, the 3 least-significant bits specify the location.
The high word of the destination operand is cleared (set to all 0s).

In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15, R8-15). Use of REX.W permits the use of 64-bit general purpose registers.

Operation

IF (64-Bit Mode and REX.W used and 64-bit register selected)
    THEN
        FOR (PEXTRW instruction with 64-bit source operand)
            { SEL ← COUNT AND 3H;
              TEMP ← (SRC >> (SEL * 16)) AND FFFFH;
              r64[15:0] ← TEMP[15:0];
              r64[63:16] ← ZERO_FILL; };
        FOR (PEXTRW instruction with 128-bit source operand)
            { SEL ← COUNT AND 7H;
              TEMP ← (SRC >> (SEL * 16)) AND FFFFH;
              r64[15:0] ← TEMP[15:0];
              r64[63:16] ← ZERO_FILL; };
    ELSE
        FOR (PEXTRW instruction with 64-bit source operand)
            { SEL ← COUNT AND 3H;
              TEMP ← (SRC >> (SEL * 16)) AND FFFFH;
              r32[15:0] ← TEMP[15:0];
              r32[31:16] ← ZERO_FILL; };
        FOR (PEXTRW instruction with 128-bit source operand)
            { SEL ← COUNT AND 7H;
              TEMP ← (SRC >> (SEL * 16)) AND FFFFH;
              r32[15:0] ← TEMP[15:0];
              r32[31:16] ← ZERO_FILL; };
FI;

Intel C/C++ Compiler Intrinsic Equivalent

PEXTRW    int _mm_extract_pi16 (__m64 a, int n)
PEXTRW    int _mm_extract_epi16 (__m128i a, int imm)

Flags Affected

None.

Numeric Exceptions

None.

Protected Mode Exceptions

#GP(0)            If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.
#SS(0)            If a memory operand effective address is outside the SS segment limit.
#UD               If CR0.EM[bit 2] = 1.
                  (128-bit operations only) If CR4.OSFXSR[bit 9] = 0.
                  (128-bit operations only) If CPUID.01H:EDX.SSE2[bit 26] = 0.
                  If the LOCK prefix is used.
#NM               If CR0.TS[bit 3] = 1.
#MF               (64-bit operations only) If there is a pending x87 FPU exception.
#PF(fault-code)   If a page fault occurs.
#AC(0)            (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.

Real-Address Mode Exceptions

#GP(0)            If any part of the operand lies outside of the effective address space from 0 to FFFFH.
#UD               If CR0.EM[bit 2] = 1.
                  (128-bit operations only) If CR4.OSFXSR[bit 9] = 0.
                  (128-bit operations only) If CPUID.01H:EDX.SSE2[bit 26] = 0.
                  If the LOCK prefix is used.
#NM               If CR0.TS[bit 3] = 1.
#MF               (64-bit operations only) If there is a pending x87 FPU exception.

Virtual-8086 Mode Exceptions

Same exceptions as in real-address mode.
#PF(fault-code)   For a page fault.
#AC(0)            (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made.

Compatibility Mode Exceptions

Same as for protected mode exceptions.

64-Bit Mode Exceptions

#UD               If CR0.EM[bit 2] = 1.
                  (128-bit operations only) If CR4.OSFXSR[bit 9] = 0.
                  (128-bit operations only) If CPUID.01H:EDX.SSE2[bit 26] = 0.
                  If the LOCK prefix is used.
#NM               If CR0.TS[bit 3] = 1.
#MF               (64-bit operations only) If there is a pending x87 FPU exception.
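The word selection in the PEXTRW Operation pseudocode can be modeled in portable C. The following is a minimal sketch, not the processor's implementation: the function names pextrw_mm_model, pextrw_xmm_model, and word_of are hypothetical, and representing the 128-bit source as two 64-bit halves is an assumption made only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: return word idx (0 = bits 15:0) of a 128-bit value
   passed as two 64-bit halves (lo = bits 63:0, hi = bits 127:64). */
static uint16_t word_of(uint64_t lo, uint64_t hi, unsigned idx)
{
    assert(idx < 8u);                       /* callers mask the count first */
    uint64_t half = (idx < 4u) ? lo : hi;   /* pick the half holding the word */
    return (uint16_t)(half >> ((idx & 3u) * 16));
}

/* Model of PEXTRW with a 64-bit (MMX) source: SEL = COUNT AND 3H. */
uint32_t pextrw_mm_model(uint64_t src, uint8_t count)
{
    unsigned sel = count & 0x3u;
    return (uint32_t)word_of(src, 0, sel);  /* upper destination bits are zero */
}

/* Model of PEXTRW with a 128-bit (XMM) source: SEL = COUNT AND 7H. */
uint32_t pextrw_xmm_model(uint64_t src_lo, uint64_t src_hi, uint8_t count)
{
    unsigned sel = count & 0x7u;
    return (uint32_t)word_of(src_lo, src_hi, sel);
}
```

Because the count is masked before use, a count of 6 with an MMX source selects the same word as a count of 2, which mirrors the statement that only the 2 least-significant bits are used in that case.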
PHADDW/PHADDD — Packed Horizontal Add

Opcode             Instruction                 64-Bit Mode   Compat/Leg Mode   Description
0F 38 01 /r        PHADDW mm1, mm2/m64         Valid         Valid             Add 16-bit signed integers horizontally, pack to MM1.
66 0F 38 01 /r     PHADDW xmm1, xmm2/m128      Valid         Valid             Add 16-bit signed integers horizontally, pack to XMM1.
0F 38 02 /r        PHADDD mm1, mm2/m64         Valid         Valid             Add 32-bit signed integers horizontally, pack to MM1.
66 0F 38 02 /r     PHADDD xmm1, xmm2/m128      Valid         Valid             Add 32-bit signed integers horizontally, pack to XMM1.

Description

PHADDW adds two adjacent 16-bit signed integers horizontally from the source and destination operands and packs the 16-bit signed results to the destination operand (first operand).
PHADDD adds two adjacent 32-bit signed integers horizontally from the source and destination operands and packs the 32-bit signed results to the destination operand (first operand). Both operands can be MMX or XMM registers. When the source operand is a 128-bit memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated.

In 64-bit mode, use the REX prefix to access additional registers.

Operation

PHADDW with 64-bit operands:
    mm1[15-0] = mm1[31-16] + mm1[15-0];
    mm1[31-16] = mm1[63-48] + mm1[47-32];
    mm1[47-32] = mm2/m64[31-16] + mm2/m64[15-0];
    mm1[63-48] = mm2/m64[63-48] + mm2/m64[47-32];

PHADDW with 128-bit operands:
    xmm1[15-0] = xmm1[31-16] + xmm1[15-0];
    xmm1[31-16] = xmm1[63-48] + xmm1[47-32];
    xmm1[47-32] = xmm1[95-80] + xmm1[79-64];
    xmm1[63-48] = xmm1[127-112] + xmm1[111-96];
    xmm1[79-64] = xmm2/m128[31-16] + xmm2/m128[15-0];
    xmm1[95-80] = xmm2/m128[63-48] + xmm2/m128[47-32];
    xmm1[111-96] = xmm2/m128[95-80] + xmm2/m128[79-64];
    xmm1[127-112] = xmm2/m128[127-112] + xmm2/m128[111-96];
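As a rough illustration of the 64-bit PHADDW pseudocode above, the following C sketch computes all four result words from the original operand values and then packs them. The names phaddw_mm_model, word_at, and pair_sum are hypothetical, and the sketch assumes each 16-bit sum wraps modulo 2^16 (PHADDSW is the saturating variant).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: word idx (0 = bits 15:0) of a 64-bit operand. */
static uint16_t word_at(uint64_t v, unsigned idx)
{
    assert(idx < 4u);
    return (uint16_t)(v >> (idx * 16));
}

/* Sum of the adjacent word pair (2*pair, 2*pair + 1), truncated to 16 bits. */
static uint16_t pair_sum(uint64_t v, unsigned pair)
{
    return (uint16_t)(word_at(v, 2u * pair) + word_at(v, 2u * pair + 1u));
}

/* Model of PHADDW mm1, mm2/m64: the low two result words come from the
   destination operand, the high two from the source operand. */
uint64_t phaddw_mm_model(uint64_t dest, uint64_t src)
{
    uint64_t r = 0;
    r |= (uint64_t)pair_sum(dest, 0);        /* result bits 15-0  */
    r |= (uint64_t)pair_sum(dest, 1) << 16;  /* result bits 31-16 */
    r |= (uint64_t)pair_sum(src, 0) << 32;   /* result bits 47-32 */
    r |= (uint64_t)pair_sum(src, 1) << 48;   /* result bits 63-48 */
    return r;
}
```

Building the result in a separate variable keeps every sum based on the original operand values, so the model also behaves sensibly when the same value is passed as both destination and source.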
PHADDD with 64-bit operands:
    mm1[31-0] = mm1[63-32] + mm1[31-0];
    mm1[63-32] = mm2/m64[63-32] + mm2/m64[31-0];

PHADDD with 128-bit operands:
    xmm1[31-0] = xmm1[63-32] + xmm1[31-0];
    xmm1[63-32] = xmm1[127-96] + xmm1[95-64];
    xmm1[95-64] = xmm2/m128[63-32] + xmm2/m128[31-0];
    xmm1[127-96] = xmm2/m128[127-96] + xmm2/m128[95-64];

Intel C/C++ Compiler Intrinsic Equivalents

PHADDW    __m64 _mm_hadd_pi16 (__m64 a, __m64 b)
PHADDW    __m128i _mm_hadd_epi16 (__m128i a, __m128i b)
PHADDD    __m64 _mm_hadd_pi32 (__m64 a, __m64 b)
PHADDD    __m128i _mm_hadd_epi32 (__m128i a, __m128i b)

Protected Mode Exceptions

#GP(0)            If a memory operand effective address is outside the CS, DS, ES, FS or GS segments.
                  (128-bit operations only) If not aligned on 16-byte boundary, regardless of segment.
#SS(0)            If a memory operand effective address is outside the SS segment limit.
#PF(fault-code)   If a page fault occurs.
#UD               If CR0.EM(bit 2) = 1.
                  (128-bit operations only) If CR4.OSFXSR(bit 9) = 0.
                  If CPUID.SSSE3(ECX bit 9) = 0.
                  If the LOCK prefix is used.
#NM               If TS bit in CR0 is set.
#MF               (64-bit operations only) If there is a pending x87 FPU exception.
#AC(0)            (64-bit operations only) If alignment checking is enabled and unaligned memory reference is made while the current privilege level is 3.

Real Mode Exceptions

#GP(0)            If any part of the operand lies outside of the effective address space from 0 to 0FFFFH.
                  (128-bit operations only) If not aligned on 16-byte boundary, regardless of segment.