Volume 2B Instruction Set Reference N-Z (794102), page 14
INSTRUCTION SET REFERENCE, N-Z

PMADDUBSW - Multiply and Add Packed Signed and Unsigned Bytes (continued)

#PF(fault-code)  If a page fault occurs.
#AC(0)           (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made.

Compatibility Mode Exceptions
Same as for protected mode exceptions.

64-Bit Mode Exceptions
#SS(0)           If a memory address referencing the SS segment is in a non-canonical form.
#GP(0)           If the memory address is in a non-canonical form.
                 (128-bit operations only) If memory operand is not aligned on a 16-byte boundary, regardless of segment.
#UD              If CR0.EM[bit 2] = 1.
                 (128-bit operations only) If CR4.OSFXSR[bit 9] = 0.
                 If CPUID.01H:ECX.SSSE3[bit 9] = 0.
                 If the LOCK prefix is used.
#NM              If CR0.TS[bit 3] = 1.
#MF              (64-bit operations only) If there is a pending x87 FPU exception.
#PF(fault-code)  If a page fault occurs.
#AC(0)           (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.

PMADDWD - Multiply and Add Packed Integers

Opcode        Instruction                64-Bit Mode  Compat/Leg Mode  Description
0F F5 /r      PMADDWD mm, mm/m64         Valid        Valid            Multiply the packed words in mm by the packed words in mm/m64, add adjacent doubleword results, and store in mm.
66 0F F5 /r   PMADDWD xmm1, xmm2/m128    Valid        Valid            Multiply the packed word integers in xmm1 by the packed word integers in xmm2/m128, add adjacent doubleword results, and store in xmm1.

Description
Multiplies the individual signed words of the destination operand (first operand) by the corresponding signed words of the source operand (second operand), producing temporary signed, doubleword results. The adjacent doubleword results are then summed and stored in the destination operand.
For example, the corresponding low-order words (15-0) and (31-16) in the source and destination operands are multiplied by one another and the doubleword results are added together and stored in the low doubleword of the destination register (31-0). The same operation is performed on the other pairs of adjacent words. (Figure 4-2 shows this operation when using 64-bit operands.) The source operand can be an MMX technology register or a 64-bit memory location, or it can be an XMM register or a 128-bit memory location. The destination operand can be an MMX technology register or an XMM register.

The PMADDWD instruction wraps around only in one situation: when the 2 pairs of words being operated on in a group are all 8000H. In this case, the result wraps around to 80000000H.

In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).
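
The multiply-and-add behavior described above, including the single wrap-around case, can be sketched in portable C. This is a scalar model for illustration only, not the instruction itself: the function names pmaddwd_lane and pmaddwd_128 are hypothetical, and results are returned as raw 32-bit patterns (an assumption made here) so that the wrap to 80000000H is well defined in C.

```c
#include <stdint.h>

/* Scalar model of one PMADDWD result doubleword (hypothetical helper).
   Each signed 16-bit by 16-bit product fits in 32 bits; the sum is formed
   in unsigned arithmetic so it wraps modulo 2^32 instead of overflowing. */
uint32_t pmaddwd_lane(int16_t dest_lo, int16_t src_lo,
                      int16_t dest_hi, int16_t src_hi)
{
    int32_t p_lo = (int32_t)dest_lo * (int32_t)src_lo;
    int32_t p_hi = (int32_t)dest_hi * (int32_t)src_hi;
    return (uint32_t)p_lo + (uint32_t)p_hi;
}

/* Scalar model of the 128-bit form: eight words in, four doublewords out.
   Word 2*i is the low word and word 2*i+1 the high word of doubleword i. */
void pmaddwd_128(const int16_t dest[8], const int16_t src[8], uint32_t out[4])
{
    for (int i = 0; i < 4; i++) {
        out[i] = pmaddwd_lane(dest[2 * i], src[2 * i],
                              dest[2 * i + 1], src[2 * i + 1]);
    }
}
```

With all four input words equal to 8000H (-32768), pmaddwd_lane returns 0x80000000, which is the wrap-around case named in the text; pmaddwd_lane(1, 2, 3, 4) returns 14.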

SRC     X3         X2         X1         X0
DEST    Y3         Y2         Y1         Y0
TEMP    X3 * Y3    X2 * Y2    X1 * Y1    X0 * Y0
DEST    (X3*Y3) + (X2*Y2)     (X1*Y1) + (X0*Y0)

Figure 4-2. PMADDWD Execution Model Using 64-bit Operands

Operation
PMADDWD instruction with 64-bit operands:
    DEST[31:0] ← (DEST[15:0] * SRC[15:0]) + (DEST[31:16] * SRC[31:16]);
    DEST[63:32] ← (DEST[47:32] * SRC[47:32]) + (DEST[63:48] * SRC[63:48]);
PMADDWD instruction with 128-bit operands:
    DEST[31:0] ← (DEST[15:0] * SRC[15:0]) + (DEST[31:16] * SRC[31:16]);
    DEST[63:32] ← (DEST[47:32] * SRC[47:32]) + (DEST[63:48] * SRC[63:48]);
    DEST[95:64] ← (DEST[79:64] * SRC[79:64]) + (DEST[95:80] * SRC[95:80]);
    DEST[127:96] ← (DEST[111:96] * SRC[111:96]) + (DEST[127:112] * SRC[127:112]);

Intel C/C++ Compiler Intrinsic Equivalent
PMADDWD    __m64 _mm_madd_pi16(__m64 m1, __m64 m2)
PMADDWD    __m128i _mm_madd_epi16(__m128i a, __m128i b)

Flags Affected
None.

Numeric Exceptions
None.

Protected Mode Exceptions
#GP(0)           If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.
                 (128-bit operations only) If a memory operand is not aligned on a 16-byte boundary, regardless of segment.
#SS(0)           If a memory operand effective address is outside the SS segment limit.
#UD              If CR0.EM[bit 2] = 1.
                 128-bit operations will generate #UD only if CR4.OSFXSR[bit 9] = 0. Execution of 128-bit instructions on a non-SSE2 capable processor (one that is MMX technology capable) will result in the instruction operating on the mm registers, not #UD.
                 If the LOCK prefix is used.
#NM              If CR0.TS[bit 3] = 1.
#MF              (64-bit operations only) If there is a pending x87 FPU exception.
#PF(fault-code)  If a page fault occurs.
#AC(0)           (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.

Real-Address Mode Exceptions
#GP(0)           (128-bit operations only) If a memory operand is not aligned on a 16-byte boundary, regardless of segment.
                 If any part of the operand lies outside of the effective address space from 0 to FFFFH.
#UD              If CR0.EM[bit 2] = 1.
                 128-bit operations will generate #UD only if CR4.OSFXSR[bit 9] = 0. Execution of 128-bit instructions on a non-SSE2 capable processor (one that is MMX technology capable) will result in the instruction operating on the mm registers, not #UD.
                 If the LOCK prefix is used.
#NM              If CR0.TS[bit 3] = 1.
#MF              (64-bit operations only) If there is a pending x87 FPU exception.

Virtual-8086 Mode Exceptions
Same exceptions as in real address mode.
#PF(fault-code)  For a page fault.
#AC(0)           (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made.

Compatibility Mode Exceptions
Same as for protected mode exceptions.

64-Bit Mode Exceptions
#SS(0)           If a memory address referencing the SS segment is in a non-canonical form.
#GP(0)           If the memory address is in a non-canonical form.
                 (128-bit operations only) If memory operand is not aligned on a 16-byte boundary, regardless of segment.
#UD              If CR0.EM[bit 2] = 1.
                 (128-bit operations only) If CR4.OSFXSR[bit 9] = 0.
                 (128-bit operations only) If CPUID.01H:EDX.SSE2[bit 26] = 0.
                 If the LOCK prefix is used.
#NM              If CR0.TS[bit 3] = 1.
#MF              (64-bit operations only) If there is a pending x87 FPU exception.
#PF(fault-code)  If a page fault occurs.
#AC(0)           (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.
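
Two of the conditions listed above for the 128-bit form can be written as simple predicates: the SSE2 feature flag (CPUID.01H:EDX bit 26) and 16-byte alignment of a memory operand. The sketch below is illustrative only. The names has_sse2 and is_aligned_16 are hypothetical, and the EDX value is taken as a parameter rather than read from the processor, since how CPUID is invoked depends on the compiler.

```c
#include <stdint.h>

/* Hypothetical helper: test CPUID.01H:EDX.SSE2[bit 26] in an EDX value
   that the caller has already obtained from CPUID leaf 01H. */
int has_sse2(uint32_t cpuid_01h_edx)
{
    return (int)((cpuid_01h_edx >> 26) & 1u);
}

/* Hypothetical helper: nonzero if the address lies on a 16-byte boundary,
   which the 128-bit memory operand form requires to avoid #GP(0). */
int is_aligned_16(const void *p)
{
    return ((uintptr_t)p & 0xFu) == 0;
}
```

A caller would typically check both before choosing the 128-bit form with a memory operand, and fall back to the 64-bit form or a scalar path otherwise.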

PMAXSW - Maximum of Packed Signed Word Integers

Opcode        Instruction                64-Bit Mode  Compat/Leg Mode  Description
0F EE /r      PMAXSW mm1, mm2/m64        Valid        Valid            Compare signed word integers in mm2/m64 and mm1 and return maximum values.
66 0F EE /r   PMAXSW xmm1, xmm2/m128     Valid        Valid            Compare signed word integers in xmm2/m128 and xmm1 and return maximum values.

Description
Performs a SIMD compare of the packed signed word integers in the destination operand (first operand) and the source operand (second operand), and returns the maximum value for each pair of word integers to the destination operand. The source operand can be an MMX technology register or a 64-bit memory location, or it can be an XMM register or a 128-bit memory location.
The destination operand can be an MMX technology register or an XMM register.

In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).

Operation
PMAXSW instruction for 64-bit operands:
    IF (DEST[15:0] > SRC[15:0]) THEN
        DEST[15:0] ← DEST[15:0];
    ELSE
        DEST[15:0] ← SRC[15:0]; FI;
    (* Repeat operation for 2nd and 3rd words in source and destination operands *)
    IF (DEST[63:48] > SRC[63:48]) THEN
        DEST[63:48] ← DEST[63:48];
    ELSE
        DEST[63:48] ← SRC[63:48]; FI;
PMAXSW instruction for 128-bit operands:
    IF (DEST[15:0] > SRC[15:0]) THEN
        DEST[15:0] ← DEST[15:0];
    ELSE
        DEST[15:0] ← SRC[15:0]; FI;
    (* Repeat operation for 2nd through 7th words in source and destination operands *)
    IF (DEST[127:112] > SRC[127:112]) THEN
        DEST[127:112] ← DEST[127:112];
    ELSE
        DEST[127:112] ← SRC[127:112]; FI;

Intel C/C++ Compiler Intrinsic Equivalent
PMAXSW     __m64 _mm_max_pi16(__m64 a, __m64 b)
PMAXSW     __m128i _mm_max_epi16(__m128i a, __m128i b)

Flags Affected
None.

Numeric Exceptions
None.

Protected Mode Exceptions
#GP(0)           If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.
                 (128-bit operations only) If a memory operand is not aligned on a 16-byte boundary, regardless of segment.
#SS(0)           If a memory operand effective address is outside the SS segment limit.
#UD              If CR0.EM[bit 2] = 1.
                 (128-bit operations only) If CR4.OSFXSR[bit 9] = 0.
                 (128-bit operations only) If CPUID.01H:EDX.SSE2[bit 26] = 0.
                 If the LOCK prefix is used.
#NM              If CR0.TS[bit 3] = 1.
#MF              (64-bit operations only) If there is a pending x87 FPU exception.
#PF(fault-code)  If a page fault occurs.
#AC(0)           (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.

Real-Address Mode Exceptions
#GP(0)           (128-bit operations only) If a memory operand is not aligned on a 16-byte boundary, regardless of segment.
                 If any part of the operand lies outside of the effective address space from 0 to FFFFH.
#UD              If CR0.EM[bit 2] = 1.
                 (128-bit operations only) If CR4.OSFXSR[bit 9] = 0.
                 (128-bit operations only) If CPUID.01H:EDX.SSE2[bit 26] = 0.
                 If the LOCK prefix is used.
#NM              If CR0.TS[bit 3] = 1.
#MF              (64-bit operations only) If there is a pending x87 FPU exception.

Virtual-8086 Mode Exceptions
Same exceptions as in real address mode.
#PF(fault-code)  For a page fault.
#AC(0)           (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made.

Compatibility Mode Exceptions
Same as for protected mode exceptions.

64-Bit Mode Exceptions
#SS(0)           If a memory address referencing the SS segment is in a non-canonical form.
#GP(0)           If the memory address is in a non-canonical form.
                 (128-bit operations only) If memory operand is not aligned on a 16-byte boundary, regardless of segment.
#UD              If CR0.EM[bit 2] = 1.
                 (128-bit operations only) If CR4.OSFXSR[bit 9] = 0.
                 (128-bit operations only) If CPUID.01H:EDX.SSE2[bit 26] = 0.
                 If the LOCK prefix is used.
#NM              If CR0.TS[bit 3] = 1.
#MF              (64-bit operations only) If there is a pending x87 FPU exception.
#PF(fault-code)  If a page fault occurs.
#AC(0)           (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.
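
The per-word maximum that PMAXSW computes can be modeled in portable C as follows. This is a minimal scalar sketch, not the instruction itself; the function name pmaxsw and the lane-count parameter n are assumptions introduced for illustration (n = 4 models the 64-bit form, n = 8 the 128-bit form).

```c
#include <stdint.h>

/* Scalar model of PMAXSW (hypothetical helper): for each of n word lanes,
   leave the larger of the signed destination and source values in dest. */
void pmaxsw(int16_t dest[], const int16_t src[], int n)
{
    for (int i = 0; i < n; i++) {
        if (src[i] > dest[i]) {
            dest[i] = src[i];
        }
    }
}
```

Because the comparison is signed, a word holding FFFFH is treated as -1 and loses to any non-negative value in the other operand.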

PMAXUB - Maximum of Packed Unsigned Byte Integers

Opcode        Instruction                64-Bit Mode  Compat/Leg Mode  Description
0F DE /r      PMAXUB mm1, mm2/m64        Valid        Valid            Compare unsigned byte integers in mm2/m64 and mm1 and return maximum values.
66 0F DE /r   PMAXUB xmm1, xmm2/m128     Valid        Valid            Compare unsigned byte integers in xmm2/m128 and xmm1 and return maximum values.

Description
Performs a SIMD compare of the packed unsigned byte integers in the destination operand (first operand) and the source operand (second operand), and returns the maximum value for each pair of byte integers to the destination operand.
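
The unsigned byte comparison described above can be sketched the same way in portable C. The function name pmaxub and the lane-count parameter n are hypothetical (n = 8 would model the 64-bit form, n = 16 the 128-bit form); this is a scalar illustration, not the instruction itself.

```c
#include <stdint.h>

/* Scalar model of PMAXUB (hypothetical helper): for each of n byte lanes,
   leave the larger of the unsigned destination and source values in dest. */
void pmaxub(uint8_t dest[], const uint8_t src[], int n)
{
    for (int i = 0; i < n; i++) {
        if (src[i] > dest[i]) {
            dest[i] = src[i];
        }
    }
}
```

Here a byte holding FFH compares as 255 and therefore wins over 01H, whereas a signed comparison would treat it as -1.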