Volume 2B Instruction Set Reference N-Z (794102), page 29
When an individual word result is less than zero, the saturated value of 0000H is written to the destination operand.

In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).

PSUBUSB/PSUBUSW - Subtract Packed Unsigned Integers with Unsigned Saturation (Vol. 2B, INSTRUCTION SET REFERENCE, N-Z)

Operation

PSUBUSB instruction with 64-bit operands:
    DEST[7:0] ← SaturateToUnsignedByte(DEST[7:0] − SRC[7:0]);
    (* Repeat subtract operation for 2nd through 7th bytes *)
    DEST[63:56] ← SaturateToUnsignedByte(DEST[63:56] − SRC[63:56]);

PSUBUSB instruction with 128-bit operands:
    DEST[7:0] ← SaturateToUnsignedByte(DEST[7:0] − SRC[7:0]);
    (* Repeat subtract operation for 2nd through 15th bytes *)
    DEST[127:120] ← SaturateToUnsignedByte(DEST[127:120] − SRC[127:120]);

PSUBUSW instruction with 64-bit operands:
    DEST[15:0] ← SaturateToUnsignedWord(DEST[15:0] − SRC[15:0]);
    (* Repeat subtract operation for 2nd and 3rd words *)
    DEST[63:48] ← SaturateToUnsignedWord(DEST[63:48] − SRC[63:48]);

PSUBUSW instruction with 128-bit operands:
    DEST[15:0] ← SaturateToUnsignedWord(DEST[15:0] − SRC[15:0]);
    (* Repeat subtract operation for 2nd through 7th words *)
    DEST[127:112] ← SaturateToUnsignedWord(DEST[127:112] − SRC[127:112]);

Intel C/C++ Compiler Intrinsic Equivalents

PSUBUSB    __m64 _mm_subs_pu8(__m64 m1, __m64 m2)
PSUBUSB    __m128i _mm_subs_epu8(__m128i m1, __m128i m2)
PSUBUSW    __m64 _mm_subs_pu16(__m64 m1, __m64 m2)
PSUBUSW    __m128i _mm_subs_epu16(__m128i m1, __m128i m2)

Flags Affected

None.

Numeric Exceptions

None.

Protected Mode Exceptions

#GP(0)
    If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.
    (128-bit operations only) If a memory operand is not aligned on a 16-byte boundary, regardless of segment.
#SS(0)
    If a memory operand effective address is outside the SS segment limit.
#UD
    If CR0.EM[bit 2] = 1.
    (128-bit operations only) If CR4.OSFXSR[bit 9] = 0.
    (128-bit operations only) If CPUID.01H:EDX.SSE2[bit 26] = 0.
    If the LOCK prefix is used.
#NM
    If CR0.TS[bit 3] = 1.
#MF
    (64-bit operations only) If there is a pending x87 FPU exception.
#PF(fault-code)
    If a page fault occurs.
#AC(0)
    (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.

Real-Address Mode Exceptions

#GP(0)
    (128-bit operations only) If a memory operand is not aligned on a 16-byte boundary, regardless of segment.
    If any part of the operand lies outside of the effective address space from 0 to FFFFH.
#UD
    If CR0.EM[bit 2] = 1.
    (128-bit operations only) If CR4.OSFXSR[bit 9] = 0.
    (128-bit operations only) If CPUID.01H:EDX.SSE2[bit 26] = 0.
    If the LOCK prefix is used.
#NM
    If CR0.TS[bit 3] = 1.
#MF
    (64-bit operations only) If there is a pending x87 FPU exception.

Virtual-8086 Mode Exceptions

Same exceptions as in real-address mode.
#PF(fault-code)
    For a page fault.
#AC(0)
    (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made.

Compatibility Mode Exceptions

Same exceptions as in protected mode.

64-Bit Mode Exceptions

#SS(0)
    If a memory address referencing the SS segment is in a non-canonical form.
#GP(0)
    If the memory address is in a non-canonical form.
    (128-bit operations only) If a memory operand is not aligned on a 16-byte boundary, regardless of segment.
#UD
    If CR0.EM[bit 2] = 1.
    (128-bit operations only) If CR4.OSFXSR[bit 9] = 0.
    (128-bit operations only) If CPUID.01H:EDX.SSE2[bit 26] = 0.
    If the LOCK prefix is used.
#NM
    If CR0.TS[bit 3] = 1.
#MF
    (64-bit operations only) If there is a pending x87 FPU exception.
#PF(fault-code)
    If a page fault occurs.
#AC(0)
    (64-bit operations only) If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.

PUNPCKHBW/PUNPCKHWD/PUNPCKHDQ/PUNPCKHQDQ - Unpack High Data

Opcode         Instruction                    64-Bit Mode   Compat/Leg Mode   Description
0F 68 /r       PUNPCKHBW mm, mm/m64           Valid         Valid             Unpack and interleave high-order bytes from mm and mm/m64 into mm.
66 0F 68 /r    PUNPCKHBW xmm1, xmm2/m128      Valid         Valid             Unpack and interleave high-order bytes from xmm1 and xmm2/m128 into xmm1.
0F 69 /r       PUNPCKHWD mm, mm/m64           Valid         Valid             Unpack and interleave high-order words from mm and mm/m64 into mm.
66 0F 69 /r    PUNPCKHWD xmm1, xmm2/m128      Valid         Valid             Unpack and interleave high-order words from xmm1 and xmm2/m128 into xmm1.
0F 6A /r       PUNPCKHDQ mm, mm/m64           Valid         Valid             Unpack and interleave high-order doublewords from mm and mm/m64 into mm.
66 0F 6A /r    PUNPCKHDQ xmm1, xmm2/m128      Valid         Valid             Unpack and interleave high-order doublewords from xmm1 and xmm2/m128 into xmm1.
66 0F 6D /r    PUNPCKHQDQ xmm1, xmm2/m128     Valid         Valid             Unpack and interleave high-order quadwords from xmm1 and xmm2/m128 into xmm1.

Description

Unpacks and interleaves the high-order data elements (bytes, words, doublewords, or quadwords) of the destination operand (first operand) and source operand (second operand) into the destination operand.
Figure 4-11 shows the unpack operation for bytes in 64-bit operands. The low-order data elements are ignored.

SRC:            Y7 Y6 Y5 Y4 Y3 Y2 Y1 Y0
DEST (before):  X7 X6 X5 X4 X3 X2 X1 X0
DEST (after):   Y7 X7 Y6 X6 Y5 X5 Y4 X4

Figure 4-11. PUNPCKHBW Instruction Operation Using 64-bit Operands

The source operand can be an MMX technology register or a 64-bit memory location, or it can be an XMM register or a 128-bit memory location. The destination operand can be an MMX technology register or an XMM register. When the source data comes from a 64-bit memory operand, the full 64-bit operand is accessed from memory, but the instruction uses only the high-order 32 bits.
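The byte interleave in Figure 4-11 can be modeled in portable C. The following is only an illustrative sketch of the 64-bit PUNPCKHBW data movement, not the hardware implementation: the function name punpckhbw64 is hypothetical, and the MMX registers are assumed to be represented as plain uint64_t values.

```c
#include <stdint.h>

/* Hypothetical scalar model of PUNPCKHBW with 64-bit operands.
 * Only bits 63:32 of each operand are read; the low-order halves
 * are ignored, as stated in the description. */
uint64_t punpckhbw64(uint64_t dest, uint64_t src)
{
    uint64_t result = 0;
    for (unsigned i = 0; i < 4; i++) {
        /* i-th byte of the high-order half of each operand */
        uint64_t d = (dest >> (32 + 8 * i)) & 0xFFu;
        uint64_t s = (src  >> (32 + 8 * i)) & 0xFFu;
        /* The destination byte lands in the even result byte,
         * the source byte in the odd result byte above it. */
        result |= d << (16 * i);
        result |= s << (16 * i + 8);
    }
    return result;
}
```

With dest = 0x1716151413121110 (bytes X7..X0) and src = 0x2726252423222120 (bytes Y7..Y0), this model returns 0x2717261625152414, which is the Y7 X7 Y6 X6 Y5 X5 Y4 X4 ordering shown in the figure.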
When the source data comes from a 128-bit memory operand, an implementation may fetch only the appropriate 64 bits; however, alignment to a 16-byte boundary and normal segment checking will still be enforced.

The PUNPCKHBW instruction interleaves the high-order bytes of the source and destination operands, the PUNPCKHWD instruction interleaves the high-order words of the source and destination operands, the PUNPCKHDQ instruction interleaves the high-order doubleword (or doublewords) of the source and destination operands, and the PUNPCKHQDQ instruction interleaves the high-order quadwords of the source and destination operands.

These instructions can be used to convert bytes to words, words to doublewords, doublewords to quadwords, and quadwords to double quadwords, respectively, by placing all 0s in the source operand.
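The effect of placing all 0s in the source operand can be sketched with a generic scalar model for 64-bit operands. This is an assumption-laden illustration rather than a definitive implementation: the function name unpack_high64 and its elem_bits parameter are hypothetical, and elem_bits is assumed to be 8, 16, or 32 (standing in for PUNPCKHBW, PUNPCKHWD, and PUNPCKHDQ respectively).

```c
#include <stdint.h>

/* Hypothetical model of the 64-bit unpack-high forms.
 * elem_bits is assumed to be 8, 16, or 32. */
uint64_t unpack_high64(uint64_t dest, uint64_t src, unsigned elem_bits)
{
    const uint64_t mask = (1ULL << elem_bits) - 1; /* one element */
    const unsigned count = 32 / elem_bits;         /* elements in the high half */
    uint64_t result = 0;

    for (unsigned i = 0; i < count; i++) {
        unsigned from = 32 + i * elem_bits;  /* position in the high-order half */
        unsigned to   = 2 * i * elem_bits;   /* position of the interleaved pair */
        uint64_t d = (dest >> from) & mask;
        uint64_t s = (src  >> from) & mask;
        result |= d << to;                   /* destination element goes low */
        result |= s << (to + elem_bits);     /* source element goes high */
    }
    return result;
}
```

Calling unpack_high64(0xBEEFCAFE11112222, 0, 16) in this model yields 0x0000BEEF0000CAFE: each high-order word of the destination is widened to a doubleword whose upper half comes from the zeroed source.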
If the source operand contains all 0s, the result (stored in the destination operand) contains zero extensions of the high-order data elements from the original value in the destination operand. For example, with the PUNPCKHBW instruction the high-order bytes are zero extended (that is, unpacked into unsigned word integers), and with the PUNPCKHWD instruction, the high-order words are zero extended (unpacked into unsigned doubleword integers).

In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).

Operation

PUNPCKHBW instruction with 64-bit operands:
    DEST[7:0] ← DEST[39:32];
    DEST[15:8] ← SRC[39:32];
    DEST[23:16] ← DEST[47:40];
    DEST[31:24] ← SRC[47:40];
    DEST[39:32] ← DEST[55:48];
    DEST[47:40] ← SRC[55:48];
    DEST[55:48] ← DEST[63:56];
    DEST[63:56] ← SRC[63:56];

PUNPCKHWD instruction with 64-bit operands:
    DEST[15:0] ← DEST[47:32];
    DEST[31:16] ← SRC[47:32];
    DEST[47:32] ← DEST[63:48];
    DEST[63:48] ← SRC[63:48];

PUNPCKHDQ instruction with 64-bit operands:
    DEST[31:0] ← DEST[63:32];
    DEST[63:32] ← SRC[63:32];

PUNPCKHBW instruction with 128-bit operands:
    DEST[7:0] ← DEST[71:64];
    DEST[15:8] ← SRC[71:64];
    DEST[23:16] ← DEST[79:72];
    DEST[31:24] ← SRC[79:72];
    DEST[39:32] ← DEST[87:80];
    DEST[47:40] ← SRC[87:80];
    DEST[55:48] ← DEST[95:88];
    DEST[63:56] ← SRC[95:88];
    DEST[71:64] ← DEST[103:96];
    DEST[79:72] ← SRC[103:96];
    DEST[87:80] ← DEST[111:104];
    DEST[95:88] ← SRC[111:104];
    DEST[103:96] ← DEST[119:112];
    DEST[111:104] ← SRC[119:112];
    DEST[119:112] ← DEST[127:120];
    DEST[127:120] ← SRC[127:120];

PUNPCKHWD instruction with 128-bit operands:
    DEST[15:0] ← DEST[79:64];
    DEST[31:16] ← SRC[79:64];
    DEST[47:32] ← DEST[95:80];
    DEST[63:48] ← SRC[95:80];
    DEST[79:64] ← DEST[111:96];
    DEST[95:80] ← SRC[111:96];
    DEST[111:96] ← DEST[127:112];
    DEST[127:112] ← SRC[127:112];

PUNPCKHDQ instruction with 128-bit operands:
    DEST[31:0] ← DEST[95:64];
    DEST[63:32] ← SRC[95:64];