Volume 2B Instruction Set Reference N-Z (794102), page 16
INSTRUCTION SET REFERENCE, N-Z

PMINUB—Minimum of Packed Unsigned Byte Integers

#UD              If CR0.EM[bit 2] = 1.
                 (128-bit operations only) If CR4.OSFXSR[bit 9] = 0.
                 (128-bit operations only) If CPUID.01H:EDX.SSE2[bit 26] = 0.
                 If the LOCK prefix is used.
#NM              If CR0.TS[bit 3] = 1.
#MF              (64-bit operations only) If there is a pending x87 FPU exception.

Virtual-8086 Mode Exceptions
Same exceptions as in real address mode.
#PF(fault-code)  For a page fault.
#AC(0)           (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made.

Compatibility Mode Exceptions
Same as for protected mode exceptions.

64-Bit Mode Exceptions
#SS(0)           If a memory address referencing the SS segment is in a non-canonical form.
#GP(0)           If the memory address is in a non-canonical form.
                 (128-bit operations only) If memory operand is not aligned on a 16-byte boundary, regardless of segment.
#UD              If CR0.EM[bit 2] = 1.
                 (128-bit operations only) If CR4.OSFXSR[bit 9] = 0.
                 (128-bit operations only) If CPUID.01H:EDX.SSE2[bit 26] = 0.
                 If the LOCK prefix is used.
#NM              If CR0.TS[bit 3] = 1.
#MF              (64-bit operations only) If there is a pending x87 FPU exception.
#PF(fault-code)  If a page fault occurs.
#AC(0)           (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.

PMOVMSKB—Move Byte Mask

Opcode               Instruction          64-Bit Mode   Compat/Leg Mode   Description
0F D7 /r             PMOVMSKB r32, mm     Valid         Valid             Move a byte mask of mm to r32.
REX.W + 0F D7 /r     PMOVMSKB r64, mm     Valid         N.E.              Move a byte mask of mm to the lower 32 bits of r64 and zero-fill the upper 32 bits.
66 0F D7 /r          PMOVMSKB r32, xmm    Valid         Valid             Move a byte mask of xmm to r32.
66 REX.W 0F D7 /r    PMOVMSKB r64, xmm    Valid         N.E.              Move a byte mask of xmm to the lower 32 bits of r64 and zero-fill the upper 32 bits.

Description
Creates a mask made up of the most significant bit of each byte of the source operand (second operand) and stores the result in the low byte or word of the destination operand (first operand).
The source operand is an MMX technology register or an XMM register; the destination operand is a general-purpose register. When operating on 64-bit operands, the byte mask is 8 bits; when operating on 128-bit operands, the byte mask is 16 bits.

In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15, R8-R15). Use of REX.W permits the use of 64-bit general-purpose registers.

Operation

PMOVMSKB instruction with 64-bit source operand and r32:
    r32[0] ← SRC[7];
    r32[1] ← SRC[15];
    (* Repeat operation for bytes 2 through 6 *)
    r32[7] ← SRC[63];
    r32[31:8] ← ZERO_FILL;

PMOVMSKB instruction with 128-bit source operand and r32:
    r32[0] ← SRC[7];
    r32[1] ← SRC[15];
    (* Repeat operation for bytes 2 through 14 *)
    r32[15] ← SRC[127];
    r32[31:16] ← ZERO_FILL;
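
The r32 forms above amount to collecting one bit per source byte. As an illustrative sketch only (it is not part of the reference text), the following scalar C model mirrors that pseudocode. The function name movemask_bytes and the byte-array representation of the source register are assumptions made for this example; src[0] stands for the least significant byte of SRC. The r64 forms differ only in the width of the zero-filled destination.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of PMOVMSKB (not an Intel-provided function).
 * Bit i of the result is the most significant bit of src[i]; every
 * higher result bit is zero, matching the ZERO_FILL step.
 * n is 8 for an MMX source and 16 for an XMM source. */
uint32_t movemask_bytes(const uint8_t *src, size_t n)
{
    uint32_t mask = 0;
    for (size_t i = 0; i < n && i < 32; i++) {
        /* src[i] >> 7 isolates bit 7 of byte i, i.e. SRC[8*i + 7]. */
        mask |= (uint32_t)(src[i] >> 7) << i;
    }
    return mask;
}
```

For example, with source bytes 80H, 7FH, FFH, 00H, 01H, 90H, 00H, 80H (least significant byte first) and n = 8, the model returns A5H, because bytes 0, 2, 5 and 7 have their top bit set.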

PMOVMSKB instruction with 64-bit source operand and r64:
    r64[0] ← SRC[7];
    r64[1] ← SRC[15];
    (* Repeat operation for bytes 2 through 6 *)
    r64[7] ← SRC[63];
    r64[63:8] ← ZERO_FILL;

PMOVMSKB instruction with 128-bit source operand and r64:
    r64[0] ← SRC[7];
    r64[1] ← SRC[15];
    (* Repeat operation for bytes 2 through 14 *)
    r64[15] ← SRC[127];
    r64[63:16] ← ZERO_FILL;

Intel C/C++ Compiler Intrinsic Equivalent
PMOVMSKB         int _mm_movemask_pi8(__m64 a)
PMOVMSKB         int _mm_movemask_epi8(__m128i a)

Flags Affected
None.

Numeric Exceptions
None.

Protected Mode Exceptions
#UD              If CR0.EM[bit 2] = 1.
                 (128-bit operations only) If CR4.OSFXSR[bit 9] = 0.
                 (128-bit operations only) If CPUID.01H:EDX.SSE2[bit 26] = 0.
                 If the LOCK prefix is used.
#NM              If CR0.TS[bit 3] = 1.
#MF              (64-bit operations only) If there is a pending x87 FPU exception.

Real-Address Mode Exceptions
Same exceptions as in protected mode.

Virtual-8086 Mode Exceptions
Same exceptions as in protected mode.

Compatibility Mode Exceptions
Same exceptions as in protected mode.

64-Bit Mode Exceptions
Same exceptions as in protected mode.

PMULHRSW — Packed Multiply High with Round and Scale

Opcode            Instruction                    64-Bit Mode   Compat/Leg Mode   Description
0F 38 0B /r       PMULHRSW mm1, mm2/m64          Valid         Valid             Multiply 16-bit signed words, scale and round signed doublewords, pack high 16 bits to MM1.
66 0F 38 0B /r    PMULHRSW xmm1, xmm2/m128       Valid         Valid             Multiply 16-bit signed words, scale and round signed doublewords, pack high 16 bits to XMM1.

Description
PMULHRSW multiplies vertically each signed 16-bit integer from the destination operand (first operand) with the corresponding signed 16-bit integer of the source operand (second operand), producing intermediate, signed 32-bit integers.
Each intermediate 32-bit integer is truncated to the 18 most significant bits. Rounding is always performed by adding 1 to the least significant bit of the 18-bit intermediate result. The final result is obtained by selecting the 16 bits immediately to the right of the most significant bit of each 18-bit intermediate result and packing them into the destination operand. Both operands can be MMX registers or XMM registers.

When the source operand is a 128-bit memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated.

In 64-bit mode, use the REX prefix to access additional registers.

Operation

PMULHRSW with 64-bit operands:
    temp0[31:0] = INT32 ((DEST[15:0] * SRC[15:0]) >> 14) + 1;
    temp1[31:0] = INT32 ((DEST[31:16] * SRC[31:16]) >> 14) + 1;
    temp2[31:0] = INT32 ((DEST[47:32] * SRC[47:32]) >> 14) + 1;
    temp3[31:0] = INT32 ((DEST[63:48] * SRC[63:48]) >> 14) + 1;
    DEST[15:0] = temp0[16:1];
    DEST[31:16] = temp1[16:1];
    DEST[47:32] = temp2[16:1];
    DEST[63:48] = temp3[16:1];

PMULHRSW with 128-bit operands:
    temp0[31:0] = INT32 ((DEST[15:0] * SRC[15:0]) >> 14) + 1;
    temp1[31:0] = INT32 ((DEST[31:16] * SRC[31:16]) >> 14) + 1;
    temp2[31:0] = INT32 ((DEST[47:32] * SRC[47:32]) >> 14) + 1;
    temp3[31:0] = INT32 ((DEST[63:48] * SRC[63:48]) >> 14) + 1;
    temp4[31:0] = INT32 ((DEST[79:64] * SRC[79:64]) >> 14) + 1;
    temp5[31:0] = INT32 ((DEST[95:80] * SRC[95:80]) >> 14) + 1;
    temp6[31:0] = INT32 ((DEST[111:96] * SRC[111:96]) >> 14) + 1;
    temp7[31:0] = INT32 ((DEST[127:112] * SRC[127:112]) >> 14) + 1;
    DEST[15:0] = temp0[16:1];
    DEST[31:16] = temp1[16:1];
    DEST[47:32] = temp2[16:1];
    DEST[63:48] = temp3[16:1];
    DEST[79:64] = temp4[16:1];
    DEST[95:80] = temp5[16:1];
    DEST[111:96] = temp6[16:1];
    DEST[127:112] = temp7[16:1];

Intel C/C++ Compiler Intrinsic Equivalents
PMULHRSW         __m64 _mm_mulhrs_pi16 (__m64 a, __m64 b)
PMULHRSW         __m128i _mm_mulhrs_epi16 (__m128i a, __m128i b)

Protected Mode Exceptions
#GP(0)           If a memory operand effective address is outside the CS, DS, ES, FS or GS segments.
                 (128-bit operations only) If not aligned on 16-byte boundary, regardless of segment.
#SS(0)           If a memory operand effective address is outside the SS segment limit.
#PF(fault-code)  If a page fault occurs.
#UD              If CR0.EM[bit 2] = 1.
                 (128-bit operations only) If CR4.OSFXSR[bit 9] = 0.
                 If CPUID.01H:ECX.SSSE3[bit 9] = 0.
                 If the LOCK prefix is used.
#NM              If CR0.TS[bit 3] = 1.
#MF              (64-bit operations only) If there is a pending x87 FPU exception.
#AC(0)           (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.

Real Mode Exceptions
#GP(0)           If any part of the operand lies outside of the effective address space from 0 to 0FFFFH.
                 (128-bit operations only) If not aligned on 16-byte boundary, regardless of segment.
#UD              If CR0.EM[bit 2] = 1.
                 (128-bit operations only) If CR4.OSFXSR[bit 9] = 0.
                 If CPUID.01H:ECX.SSSE3[bit 9] = 0.
                 If the LOCK prefix is used.
#NM              If CR0.TS[bit 3] = 1.
#MF              (64-bit operations only) If there is a pending x87 FPU exception.

Virtual 8086 Mode Exceptions
Same exceptions as in real address mode.
#PF(fault-code)  If a page fault occurs.
#AC(0)           (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made.

Compatibility Mode Exceptions
Same as for protected mode exceptions.

64-Bit Mode Exceptions
#SS(0)           If a memory address referencing the SS segment is in a non-canonical form.
#GP(0)           If the memory address is in a non-canonical form.
                 (128-bit operations only) If memory operand is not aligned on a 16-byte boundary, regardless of segment.
#UD              If CR0.EM[bit 2] = 1.
                 (128-bit operations only) If CR4.OSFXSR[bit 9] = 0.
                 If CPUID.01H:ECX.SSSE3[bit 9] = 0.
                 If the LOCK prefix is used.
#NM              If CR0.TS[bit 3] = 1.
#MF              (64-bit operations only) If there is a pending x87 FPU exception.
#PF(fault-code)  If a page fault occurs.
#AC(0)           (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.
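
As an illustrative sketch of the PMULHRSW Operation pseudocode (it is not part of the reference text), the per-lane arithmetic can be modelled in scalar C. The names mulhrs_lane and mulhrs_packed are hypothetical, chosen for this example. The sketch assumes that >> on a negative int32_t is an arithmetic shift and that out-of-range conversion to int16_t wraps modulo 2^16; both are implementation-defined in C, although mainstream compilers behave this way.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of one PMULHRSW lane:
 *   temp = ((a * b) >> 14) + 1;  result = temp[16:1]
 * Assumes arithmetic right shift of negative values (see lead-in). */
int16_t mulhrs_lane(int16_t a, int16_t b)
{
    int32_t product = (int32_t)a * (int32_t)b; /* magnitude <= 2^30, no overflow */
    int32_t temp = (product >> 14) + 1;        /* keep 18 MSBs, add rounding bit */
    return (int16_t)(temp >> 1);               /* select bits 16..1 */
}

/* Applies the lane model element by element; lanes is 4 for the
 * 64-bit form and 8 for the 128-bit form. dest is updated in place,
 * as with the instruction's destination operand. */
void mulhrs_packed(int16_t *dest, const int16_t *src, size_t lanes)
{
    for (size_t i = 0; i < lanes; i++) {
        dest[i] = mulhrs_lane(dest[i], src[i]);
    }
}
```

Read as Q15 fixed-point values, 16384 represents 0.5, so mulhrs_lane(16384, 16384) yields 8192 (0.25), and mulhrs_lane(32767, 32767) yields 32766.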

PMULHUW—Multiply Packed Unsigned Integers and Store High Result

Opcode         Instruction                  64-Bit Mode   Compat/Leg Mode   Description
0F E4 /r       PMULHUW mm1, mm2/m64         Valid         Valid             Multiply the packed unsigned word integers in mm1 register and mm2/m64, and store the high 16 bits of the results in mm1.
66 0F E4 /r    PMULHUW xmm1, xmm2/m128      Valid         Valid             Multiply the packed unsigned word integers in xmm1 and xmm2/m128, and store the high 16 bits of the results in xmm1.

Description
Performs a SIMD unsigned multiply of the packed unsigned word integers in the destination operand (first operand) and the source operand (second operand), and stores the high 16 bits of each 32-bit intermediate result in the destination operand. (Figure 4-3 shows this operation when using 64-bit operands.) The source operand can be an MMX technology register or a 64-bit memory location, or it can be an XMM register or a 128-bit memory location. The destination operand can be an MMX technology register or an XMM register.

In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).

    SRC     X3              X2              X1              X0
    DEST    Y3              Y2              Y1              Y0
    TEMP    Z3 = X3 ∗ Y3    Z2 = X2 ∗ Y2    Z1 = X1 ∗ Y1    Z0 = X0 ∗ Y0
    DEST    Z3[31:16]       Z2[31:16]       Z1[31:16]       Z0[31:16]

Figure 4-3.