Volume 2B Instruction Set Reference N-Z (794102), page 7
PADDSB/PADDSW: Add Packed Signed Integers with Signed Saturation

                 Execution of 128-bit instructions on a non-SSE2 capable processor (one that is MMX technology capable) will result in the instruction operating on the mm registers, not #UD.
                 If the LOCK prefix is used.
#NM              If CR0.TS[bit 3] = 1.
#MF              (64-bit operations only) If there is a pending x87 FPU exception.
#PF(fault-code)  If a page fault occurs.
#AC(0)           (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.

Real-Address Mode Exceptions
#GP(0)           (128-bit operations only) If a memory operand is not aligned on a 16-byte boundary, regardless of segment.
                 If any part of the operand lies outside of the effective address space from 0 to FFFFH.
#UD              If CR0.EM[bit 2] = 1.
                 128-bit operations will generate #UD only if CR4.OSFXSR[bit 9] = 0. Execution of 128-bit instructions on a non-SSE2 capable processor (one that is MMX technology capable) will result in the instruction operating on the mm registers, not #UD.
                 If the LOCK prefix is used.
#NM              If CR0.TS[bit 3] = 1.
#MF              (64-bit operations only) If there is a pending x87 FPU exception.

Virtual-8086 Mode Exceptions
Same exceptions as in real address mode.
#PF(fault-code)  For a page fault.
#AC(0)           (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made.

Compatibility Mode Exceptions
Same as for protected mode exceptions.

64-Bit Mode Exceptions
#SS(0)           If a memory address referencing the SS segment is in a non-canonical form.
#GP(0)           If the memory address is in a non-canonical form.
                 (128-bit operations only) If memory operand is not aligned on a 16-byte boundary, regardless of segment.
#UD              If CR0.EM[bit 2] = 1.
                 (128-bit operations only) If CR4.OSFXSR[bit 9] = 0.
                 (128-bit operations only) If CPUID.01H:EDX.SSE2[bit 26] = 0.
                 If the LOCK prefix is used.
#NM              If CR0.TS[bit 3] = 1.
#MF              (64-bit operations only) If there is a pending x87 FPU exception.
#PF(fault-code)  If a page fault occurs.
#AC(0)           (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.

PADDUSB/PADDUSW: Add Packed Unsigned Integers with Unsigned Saturation

Opcode       Instruction               64-Bit Mode  Compat/Leg Mode  Description
0F DC /r     PADDUSB mm, mm/m64        Valid        Valid            Add packed unsigned byte integers from mm/m64 and mm and saturate the results.
66 0F DC /r  PADDUSB xmm1, xmm2/m128   Valid        Valid            Add packed unsigned byte integers from xmm2/m128 and xmm1 and saturate the results.
0F DD /r     PADDUSW mm, mm/m64        Valid        Valid            Add packed unsigned word integers from mm/m64 and mm and saturate the results.
66 0F DD /r  PADDUSW xmm1, xmm2/m128   Valid        Valid            Add packed unsigned word integers from xmm2/m128 to xmm1 and saturate the results.

Description
Performs a SIMD add of the packed unsigned integers from the source operand (second operand) and the destination operand (first operand), and stores the packed integer results in the destination operand.
See Figure 9-4 in the Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 1, for an illustration of a SIMD operation. Overflow is handled with unsigned saturation, as described in the following paragraphs.

These instructions can operate on either 64-bit or 128-bit operands. When operating on 64-bit operands, the destination operand must be an MMX technology register and the source operand can be either an MMX technology register or a 64-bit memory location. When operating on 128-bit operands, the destination operand must be an XMM register and the source operand can be either an XMM register or a 128-bit memory location.

The PADDUSB instruction adds packed unsigned byte integers. When an individual byte result is beyond the range of an unsigned byte integer (that is, greater than FFH), the saturated value of FFH is written to the destination operand.

The PADDUSW instruction adds packed unsigned word integers. When an individual word result is beyond the range of an unsigned word integer (that is, greater than FFFFH), the saturated value of FFFFH is written to the destination operand.

In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).
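The unsigned saturation rule described for PADDUSB and PADDUSW can be sketched as a portable scalar model in C. This is only an illustration of the lane-by-lane arithmetic, not how the processor implements it. The function names (sat_add_u8, sat_add_u16, paddusb_model, paddusw_model) are hypothetical, and the operands are assumed to be passed as arrays with element 0 holding the least significant lane.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One byte lane: a sum greater than FFH is clamped to FFH. */
uint8_t sat_add_u8(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + (unsigned)b;
    return (uint8_t)(sum > 0xFFu ? 0xFFu : sum);
}

/* One word lane: a sum greater than FFFFH is clamped to FFFFH. */
uint16_t sat_add_u16(uint16_t a, uint16_t b)
{
    uint32_t sum = (uint32_t)a + (uint32_t)b;
    return (uint16_t)(sum > 0xFFFFu ? 0xFFFFu : sum);
}

/* Hypothetical model of PADDUSB: lanes is 8 for a 64-bit operand,
   16 for a 128-bit operand. dest is updated in place. */
void paddusb_model(uint8_t *dest, const uint8_t *src, size_t lanes)
{
    assert(lanes == 8 || lanes == 16);
    for (size_t i = 0; i < lanes; i++)
        dest[i] = sat_add_u8(dest[i], src[i]);
}

/* Hypothetical model of PADDUSW: lanes is 4 for a 64-bit operand,
   8 for a 128-bit operand. dest is updated in place. */
void paddusw_model(uint16_t *dest, const uint16_t *src, size_t lanes)
{
    assert(lanes == 4 || lanes == 8);
    for (size_t i = 0; i < lanes; i++)
        dest[i] = sat_add_u16(dest[i], src[i]);
}
```

Each lane is widened before the addition so that the carry out of the lane is visible to the comparison; adding in the lane's own width would wrap around instead of saturating.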
Operation
PADDUSB instruction with 64-bit operands:
    DEST[7:0] ← SaturateToUnsignedByte(DEST[7:0] + SRC[7:0]);
    (* Repeat add operation for 2nd through 7th bytes *)
    DEST[63:56] ← SaturateToUnsignedByte(DEST[63:56] + SRC[63:56]);

PADDUSB instruction with 128-bit operands:
    DEST[7:0] ← SaturateToUnsignedByte(DEST[7:0] + SRC[7:0]);
    (* Repeat add operation for 2nd through 15th bytes *)
    DEST[127:120] ← SaturateToUnsignedByte(DEST[127:120] + SRC[127:120]);

PADDUSW instruction with 64-bit operands:
    DEST[15:0] ← SaturateToUnsignedWord(DEST[15:0] + SRC[15:0]);
    (* Repeat add operation for 2nd and 3rd words *)
    DEST[63:48] ← SaturateToUnsignedWord(DEST[63:48] + SRC[63:48]);

PADDUSW instruction with 128-bit operands:
    DEST[15:0] ← SaturateToUnsignedWord(DEST[15:0] + SRC[15:0]);
    (* Repeat add operation for 2nd through 7th words *)
    DEST[127:112] ← SaturateToUnsignedWord(DEST[127:112] + SRC[127:112]);

Intel C/C++ Compiler Intrinsic Equivalents
PADDUSB    __m64 _mm_adds_pu8(__m64 m1, __m64 m2)
PADDUSW    __m64 _mm_adds_pu16(__m64 m1, __m64 m2)
PADDUSB    __m128i _mm_adds_epu8(__m128i a, __m128i b)
PADDUSW    __m128i _mm_adds_epu16(__m128i a, __m128i b)

Flags Affected
None.

Numeric Exceptions
None.

Protected Mode Exceptions
#GP(0)           If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.
                 (128-bit operations only) If a memory operand is not aligned on a 16-byte boundary, regardless of segment.
#SS(0)           If a memory operand effective address is outside the SS segment limit.
#UD              If CR0.EM[bit 2] = 1.
                 128-bit operations will generate #UD only if CR4.OSFXSR[bit 9] = 0. Execution of 128-bit instructions on a non-SSE2 capable processor (one that is MMX technology capable) will result in the instruction operating on the mm registers, not #UD.
                 If the LOCK prefix is used.
#NM              If CR0.TS[bit 3] = 1.
#MF              (64-bit operations only) If there is a pending x87 FPU exception.
#PF(fault-code)  If a page fault occurs.
#AC(0)           (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.

Real-Address Mode Exceptions
#GP(0)           (128-bit operations only) If a memory operand is not aligned on a 16-byte boundary, regardless of segment.
                 If any part of the operand lies outside of the effective address space from 0 to FFFFH.
#UD              If CR0.EM[bit 2] = 1.
                 128-bit operations will generate #UD only if CR4.OSFXSR[bit 9] = 0.
                 Execution of 128-bit instructions on a non-SSE2 capable processor (one that is MMX technology capable) will result in the instruction operating on the mm registers, not #UD.
                 If the LOCK prefix is used.
#NM              If CR0.TS[bit 3] = 1.
#MF              (64-bit operations only) If there is a pending x87 FPU exception.

Virtual-8086 Mode Exceptions
Same exceptions as in real address mode.
#PF(fault-code)  For a page fault.
#AC(0)           (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made.

Compatibility Mode Exceptions
Same as for protected mode exceptions.

64-Bit Mode Exceptions
#SS(0)           If a memory address referencing the SS segment is in a non-canonical form.
#GP(0)           If the memory address is in a non-canonical form.
                 (128-bit operations only) If memory operand is not aligned on a 16-byte boundary, regardless of segment.
#UD              If CR0.EM[bit 2] = 1.
                 (128-bit operations only) If CR4.OSFXSR[bit 9] = 0.
                 (128-bit operations only) If CPUID.01H:EDX.SSE2[bit 26] = 0.
                 If the LOCK prefix is used.
#NM              If CR0.TS[bit 3] = 1.
#MF              (64-bit operations only) If there is a pending x87 FPU exception.
#PF(fault-code)  If a page fault occurs.
#AC(0)           (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.
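The 16-byte alignment condition that appears in the #GP(0) rows above can be expressed as a simple address test. The following C sketch covers that one condition only and ignores segment limits, canonical-form checks, and the separate #AC(0) rule for 64-bit operations. The helper names is_aligned_16 and alignment_gp_expected are hypothetical, and addresses are modeled as plain 64-bit integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when the address is a multiple of 16, which is the alignment
   required of a 128-bit memory operand. */
bool is_aligned_16(uint64_t addr)
{
    return (addr & 0xFu) == 0;
}

/* Hypothetical helper: reports whether the alignment condition listed
   under #GP(0) would apply. Only 128-bit operations are subject to it;
   64-bit (MMX) operations are not. */
bool alignment_gp_expected(uint64_t addr, unsigned operand_bits)
{
    assert(operand_bits == 64 || operand_bits == 128);
    return operand_bits == 128 && !is_aligned_16(addr);
}
```

A caller that allocates buffers for the 128-bit forms could run a test like this in debug builds before handing a pointer to code that uses aligned memory operands.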
PALIGNR: Packed Align Right

Opcode       Instruction                    64-Bit Mode  Compat/Leg Mode  Description
0F 3A 0F     PALIGNR mm1, mm2/m64, imm8     Valid        Valid            Concatenate destination and source operands, extract byte-aligned result shifted to the right by constant into mm1.
66 0F 3A 0F  PALIGNR xmm1, xmm2/m128, imm8  Valid        Valid            Concatenate destination and source operands, extract byte-aligned result shifted to the right by constant into xmm1.

Description
PALIGNR concatenates the destination operand (the first operand) and the source operand (the second operand) into an intermediate composite, shifts the composite at byte granularity to the right by a constant immediate, and extracts the right-aligned result into the destination. The first and the second operands can be MMX or XMM registers. The immediate value is considered unsigned. Immediate shift counts larger than 2L, where L is the operand length in bytes (i.e. 32 for 128-bit operands, or 16 for 64-bit operands), produce a zero result. When the source operand is a 128-bit memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated.

In 64-bit mode, use the REX prefix to access additional registers.

Operation
PALIGNR with 64-bit operands:
    temp1[127:0] = CONCATENATE(DEST,SRC)>>(imm8*8)
    DEST[63:0] = temp1[63:0]

PALIGNR with 128-bit operands:
    temp1[255:0] = CONCATENATE(DEST,SRC)>>(imm8*8)
    DEST[127:0] = temp1[127:0]

Intel C/C++ Compiler Intrinsic Equivalents
PALIGNR    __m64 _mm_alignr_pi8(__m64 a, __m64 b, int n)
PALIGNR    __m128i _mm_alignr_epi8(__m128i a, __m128i b, int n)
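The PALIGNR operation above can be illustrated with a portable byte-array model in C. This is a sketch under stated assumptions rather than a description of the hardware: palignr_model is a hypothetical name, operands are arrays with element 0 as the least significant byte, and CONCATENATE(DEST,SRC) is read as placing DEST in the upper half of the composite and SRC in the lower half.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of PALIGNR. len is 8 for 64-bit operands and 16
   for 128-bit operands. dest is updated in place. */
void palignr_model(uint8_t *dest, const uint8_t *src, size_t len, unsigned imm8)
{
    uint8_t composite[32];
    uint8_t result[16];

    assert(len == 8 || len == 16);
    assert(imm8 <= 255u); /* the immediate is an unsigned 8-bit value */

    /* CONCATENATE(DEST, SRC): SRC fills the low half, DEST the high half. */
    memcpy(composite, src, len);
    memcpy(composite + len, dest, len);

    /* Shift right by imm8 bytes and keep the low len bytes. Bytes taken
       from beyond the end of the composite are zero. */
    for (size_t i = 0; i < len; i++) {
        size_t from = i + imm8;
        result[i] = (from < 2 * len) ? composite[from] : 0;
    }
    memcpy(dest, result, len);
}
```

With an immediate of 0 the result is a copy of the source operand, and with an immediate equal to the operand length in bytes the destination is left unchanged; values in between take the upper bytes of the source followed by the lower bytes of the destination.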