Volume 2B Instruction Set Reference N-Z (794102), page 5
INSTRUCTION SET REFERENCE, N-Z

PACKSSWB/PACKSSDW—Pack with Signed Saturation

#PF(fault-code)   If a page fault occurs.
#AC(0)            (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.

Real-Address Mode Exceptions

#GP(0)            (128-bit operations only) If a memory operand is not aligned on a 16-byte boundary, regardless of segment.
                  If any part of the operand lies outside of the effective address space from 0 to FFFFH.
#UD               If CR0.EM[bit 2] = 1.
                  (128-bit operations only) If CR4.OSFXSR[bit 9] = 0.
                  If the LOCK prefix is used.
#NM               If CR0.TS[bit 3] = 1.
#MF               (64-bit operations only) If there is a pending x87 FPU exception.

Virtual-8086 Mode Exceptions

Same exceptions as in real address mode.
#PF(fault-code)   For a page fault.
#AC(0)            (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made.

Compatibility Mode Exceptions

Same as for protected mode exceptions.

64-Bit Mode Exceptions

#SS(0)            If a memory address referencing the SS segment is in a non-canonical form.
#GP(0)            If the memory address is in a non-canonical form.
                  (128-bit operations only) If memory operand is not aligned on a 16-byte boundary, regardless of segment.
#UD               If CR0.EM[bit 2] = 1.
                  (128-bit operations only) If CR4.OSFXSR[bit 9] = 0.
                  (128-bit operations only) If CPUID.01H:EDX.SSE2[bit 26] = 0.
                  If the LOCK prefix is used.
#NM               If CR0.TS[bit 3] = 1.
#MF               (64-bit operations only) If there is a pending x87 FPU exception.
#PF(fault-code)   If a page fault occurs.
#AC(0)            (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.

PACKUSWB—Pack with Unsigned Saturation

Opcode        Instruction                 64-Bit Mode   Compat/Leg Mode   Description
0F 67 /r      PACKUSWB mm, mm/m64         Valid         Valid             Converts 4 signed word integers from mm and 4 signed word integers from mm/m64 into 8 unsigned byte integers in mm using unsigned saturation.
66 0F 67 /r   PACKUSWB xmm1, xmm2/m128    Valid         Valid             Converts 8 signed word integers from xmm1 and 8 signed word integers from xmm2/m128 into 16 unsigned byte integers in xmm1 using unsigned saturation.

Description

Converts 4 or 8 signed word integers from the destination operand (first operand) and 4 or 8 signed word integers from the source operand (second operand) into 8 or 16 unsigned byte integers and stores the result in the destination operand.
(See Figure 4-1 for an example of the packing operation.) If a signed word integer value is beyond the range of an unsigned byte integer (that is, greater than FFH or less than 00H), the saturated unsigned byte integer value of FFH or 00H, respectively, is stored in the destination.

The PACKUSWB instruction operates on either 64-bit or 128-bit operands. When operating on 64-bit operands, the destination operand must be an MMX technology register and the source operand can be either an MMX technology register or a 64-bit memory location.
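
The saturation rule described above can be modeled in portable scalar C. This is only a sketch for illustration, not how the processor or any compiler implements the instruction. The function names `saturate_word_to_ubyte` and `packuswb64_model` are hypothetical, and the byte ordering (destination words in the low half of the result, source words in the high half) follows the instruction's Operation pseudocode for 64-bit operands.

```c
#include <stdint.h>

/* Hypothetical helper: clamp one signed 16-bit word to the unsigned
   byte range, mirroring SaturateSignedWordToUnsignedByte. */
static uint8_t saturate_word_to_ubyte(int16_t w)
{
    if (w < 0) {
        return 0x00;            /* less than 00H saturates to 00H */
    }
    if (w > 0xFF) {
        return 0xFF;            /* greater than FFH saturates to FFH */
    }
    return (uint8_t)w;          /* in range: value is unchanged */
}

/* Hypothetical scalar model of PACKUSWB with 64-bit operands.
   dest[0..3] and src[0..3] hold the four words of each operand,
   lowest word first; out[0..7] receives the packed bytes. */
static void packuswb64_model(const int16_t dest[4], const int16_t src[4],
                             uint8_t out[8])
{
    for (int i = 0; i < 4; i++) {
        out[i]     = saturate_word_to_ubyte(dest[i]); /* low four bytes  */
        out[i + 4] = saturate_word_to_ubyte(src[i]);  /* high four bytes */
    }
}
```

With this model, a word of -5 packs to 00H, a word of 300 packs to FFH, and a word of 200 packs to C8H unchanged.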
When operating on 128-bit operands, the destination operand must be an XMM register and the source operand can be either an XMM register or a 128-bit memory location.

In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).

Operation

PACKUSWB instruction with 64-bit operands:
    DEST[7:0] ← SaturateSignedWordToUnsignedByte (DEST[15:0]);
    DEST[15:8] ← SaturateSignedWordToUnsignedByte (DEST[31:16]);
    DEST[23:16] ← SaturateSignedWordToUnsignedByte (DEST[47:32]);
    DEST[31:24] ← SaturateSignedWordToUnsignedByte (DEST[63:48]);
    DEST[39:32] ← SaturateSignedWordToUnsignedByte (SRC[15:0]);
    DEST[47:40] ← SaturateSignedWordToUnsignedByte (SRC[31:16]);
    DEST[55:48] ← SaturateSignedWordToUnsignedByte (SRC[47:32]);
    DEST[63:56] ← SaturateSignedWordToUnsignedByte (SRC[63:48]);

PACKUSWB instruction with 128-bit operands:
    DEST[7:0] ← SaturateSignedWordToUnsignedByte (DEST[15:0]);
    DEST[15:8] ← SaturateSignedWordToUnsignedByte (DEST[31:16]);
    DEST[23:16] ← SaturateSignedWordToUnsignedByte (DEST[47:32]);
    DEST[31:24] ← SaturateSignedWordToUnsignedByte (DEST[63:48]);
    DEST[39:32] ← SaturateSignedWordToUnsignedByte (DEST[79:64]);
    DEST[47:40] ← SaturateSignedWordToUnsignedByte (DEST[95:80]);
    DEST[55:48] ← SaturateSignedWordToUnsignedByte (DEST[111:96]);
    DEST[63:56] ← SaturateSignedWordToUnsignedByte (DEST[127:112]);
    DEST[71:64] ← SaturateSignedWordToUnsignedByte (SRC[15:0]);
    DEST[79:72] ← SaturateSignedWordToUnsignedByte (SRC[31:16]);
    DEST[87:80] ← SaturateSignedWordToUnsignedByte (SRC[47:32]);
    DEST[95:88] ← SaturateSignedWordToUnsignedByte (SRC[63:48]);
    DEST[103:96] ← SaturateSignedWordToUnsignedByte (SRC[79:64]);
    DEST[111:104] ← SaturateSignedWordToUnsignedByte (SRC[95:80]);
    DEST[119:112] ← SaturateSignedWordToUnsignedByte (SRC[111:96]);
    DEST[127:120] ← SaturateSignedWordToUnsignedByte (SRC[127:112]);

Intel C/C++ Compiler Intrinsic Equivalent

PACKUSWB    __m64 _mm_packs_pu16(__m64 m1, __m64 m2)

Flags Affected

None.

Protected Mode Exceptions

#GP(0)            If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.
                  (128-bit operations only) If a memory operand is not aligned on a 16-byte boundary, regardless of segment.
#SS(0)            If a memory operand effective address is outside the SS segment limit.
#UD               If CR0.EM[bit 2] = 1.
                  128-bit operations will generate #UD only if CR4.OSFXSR[bit 9] = 0.
                  Execution of 128-bit instructions on a non-SSE2 capable processor (one that is MMX technology capable) will result in the instruction operating on the mm registers, not #UD.
                  If the LOCK prefix is used.
#NM               If CR0.TS[bit 3] = 1.
#MF               (64-bit operations only) If there is a pending x87 FPU exception.
#PF(fault-code)   If a page fault occurs.
#AC(0)            (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.

Real-Address Mode Exceptions

#GP(0)            (128-bit operations only) If a memory operand is not aligned on a 16-byte boundary, regardless of segment.
                  If any part of the operand lies outside of the effective address space from 0 to FFFFH.
#UD               If CR0.EM[bit 2] = 1.
                  128-bit operations will generate #UD only if CR4.OSFXSR[bit 9] = 0.
                  Execution of 128-bit instructions on a non-SSE2 capable processor (one that is MMX technology capable) will result in the instruction operating on the mm registers, not #UD.
                  If the LOCK prefix is used.
#NM               If CR0.TS[bit 3] = 1.
#MF               (64-bit operations only) If there is a pending x87 FPU exception.

Virtual-8086 Mode Exceptions

Same exceptions as in real address mode.
#PF(fault-code)   For a page fault.
#AC(0)            (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made.

Compatibility Mode Exceptions

Same as for protected mode exceptions.

64-Bit Mode Exceptions

#SS(0)            If a memory address referencing the SS segment is in a non-canonical form.
#GP(0)            If the memory address is in a non-canonical form.
                  (128-bit operations only) If memory operand is not aligned on a 16-byte boundary, regardless of segment.
#UD               If CR0.EM[bit 2] = 1.
                  (128-bit operations only) If CR4.OSFXSR[bit 9] = 0.
                  (128-bit operations only) If CPUID.01H:EDX.SSE2[bit 26] = 0.
                  If the LOCK prefix is used.
#NM               If CR0.TS[bit 3] = 1.
#MF               (64-bit operations only) If there is a pending x87 FPU exception.
#PF(fault-code)   If a page fault occurs.
#AC(0)            (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.

PADDB/PADDW/PADDD—Add Packed Integers

Opcode        Instruction              64-Bit Mode   Compat/Leg Mode   Description
0F FC /r      PADDB mm, mm/m64         Valid         Valid             Add packed byte integers from mm/m64 and mm.
66 0F FC /r   PADDB xmm1, xmm2/m128    Valid         Valid             Add packed byte integers from xmm2/m128 and xmm1.
0F FD /r      PADDW mm, mm/m64         Valid         Valid             Add packed word integers from mm/m64 and mm.
66 0F FD /r   PADDW xmm1, xmm2/m128    Valid         Valid             Add packed word integers from xmm2/m128 and xmm1.
0F FE /r      PADDD mm, mm/m64         Valid         Valid             Add packed doubleword integers from mm/m64 and mm.
66 0F FE /r   PADDD xmm1, xmm2/m128    Valid         Valid             Add packed doubleword integers from xmm2/m128 and xmm1.

Description

Performs a SIMD add of the packed integers from the source operand (second operand) and the destination operand (first operand), and stores the packed integer results in the destination operand.
See Figure 9-4 in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1, for an illustration of a SIMD operation. Overflow is handled with wraparound, as described in the following paragraphs.

These instructions can operate on either 64-bit or 128-bit operands. When operating on 64-bit operands, the destination operand must be an MMX technology register and the source operand can be either an MMX technology register or a 64-bit memory location. When operating on 128-bit operands, the destination operand must be an XMM register and the source operand can be either an XMM register or a 128-bit memory location.

The PADDB instruction adds packed byte integers. When an individual result is too large to be represented in 8 bits (overflow), the result is wrapped around and the low 8 bits are written to the destination operand (that is, the carry is ignored).

The PADDW instruction adds packed word integers. When an individual result is too large to be represented in 16 bits (overflow), the result is wrapped around and the low 16 bits are written to the destination operand.

The PADDD instruction adds packed doubleword integers.
When an individual result is too large to be represented in 32 bits (overflow), the result is wrapped around and the low 32 bits are written to the destination operand.

Note that the PADDB, PADDW, and PADDD instructions can operate on either unsigned or signed (two's complement notation) packed integers; however, they do not set bits in the EFLAGS register to indicate overflow and/or a carry. To prevent undetected overflow conditions, software must control the ranges of values operated on.

In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).

Operation

PADDB instruction with 64-bit operands:
    DEST[7:0] ← DEST[7:0] + SRC[7:0];
    (* Repeat add operation for 2nd through 7th byte *)
    DEST[63:56] ← DEST[63:56] + SRC[63:56];

PADDB instruction with 128-bit operands:
    DEST[7:0] ← DEST[7:0] + SRC[7:0];
    (* Repeat add operation for 2nd through 15th byte *)
    DEST[127:120] ← DEST[127:120] + SRC[127:120];

PADDW instruction with 64-bit operands:
    DEST[15:0] ← DEST[15:0] + SRC[15:0];
    (* Repeat add operation for 2nd and 3rd word *)
    DEST[63:48] ← DEST[63:48] + SRC[63:48];

PADDW instruction with 128-bit operands:
    DEST[15:0] ← DEST[15:0] + SRC[15:0];
    (* Repeat add operation for 2nd through 7th word *)
    DEST[127:112] ← DEST[127:112] + SRC[127:112];

PADDD instruction with 64-bit operands:
    DEST[31:0] ← DEST[31:0] + SRC[31:0];
    DEST[63:32] ← DEST[63:32] + SRC[63:32];

PADDD instruction with 128-bit operands:
    DEST[31:0] ← DEST[31:0] + SRC[31:0];
    (* Repeat add operation for 2nd and 3rd doubleword *)
    DEST[127:96] ← DEST[127:96] + SRC[127:96];

Intel C/C++ Compiler Intrinsic Equivalents

PADDB    __m64 _mm_add_pi8(__m64 m1, __m64 m2)
PADDB    __m128i _mm_add_epi8(__m128i a, __m128i b)
PADDW    __m64 _mm_add_pi16(__m64 m1, __m64 m2)
PADDW    __m128i _mm_add_epi16(__m128i a, __m128i b)
PADDD    __m64 _mm_add_pi32(__m64 m1, __m64 m2)
PADDD    __m128i _mm_add_epi32(__m128i a, __m128i b)
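
The wraparound behavior described for PADDB can be illustrated with a scalar model in portable C. This is a sketch under stated assumptions rather than the actual hardware or intrinsic implementation. The names `paddb64_model` and `byte_as_signed` are hypothetical, and only the 64-bit byte form is modeled; the word and doubleword forms would follow the same pattern with wider lanes.

```c
#include <stdint.h>

/* Hypothetical scalar model of PADDB with 64-bit operands:
   eight independent byte additions, lowest byte first. */
static void paddb64_model(const uint8_t dest[8], const uint8_t src[8],
                          uint8_t out[8])
{
    for (int i = 0; i < 8; i++) {
        /* The operands are promoted to int, so the sum can reach 510.
           Converting back to uint8_t keeps only the low 8 bits, which
           discards the carry exactly as the description states. */
        out[i] = (uint8_t)(dest[i] + src[i]);
    }
}

/* Hypothetical helper: read a result byte as a two's complement
   signed value, to show that the same bits serve both views. */
static int byte_as_signed(uint8_t b)
{
    return (b < 128) ? (int)b : (int)b - 256;
}
```

In this model, FFH + 01H yields 00H with no overflow indication, and 7FH + 01H yields 80H, which reads as 128 unsigned or -128 signed. Since no flag records either event, the caller has to keep the input ranges small enough for the interpretation it intends.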