Volume 4 128-Bit Media Instructions (794098), page 33
AMD64 Technology, 26568 Rev. 3.09, July 2007

PACKSSWB    Pack with Saturation Signed Word to Byte

Converts each 16-bit signed integer in the first and second source operands to an 8-bit signed integer and packs the converted values into bytes in the destination (first source). The first source/destination operand is an XMM register and the second source operand is another XMM register or 128-bit memory location.

Converted values from the first source operand are packed into the low-order bytes of the destination, and the converted values from the second source operand are packed into the high-order bytes of the destination.

The PACKSSWB instruction is an SSE2 instruction. The presence of this instruction set is indicated by a CPUID feature bit. (See “CPUID” in Volume 3.)

Mnemonic                      Opcode         Description
PACKSSWB xmm1, xmm2/mem128    66 0F 63 /r    Packs 16-bit signed integers in an XMM register and another XMM register or 128-bit memory location into 8-bit signed integers in an XMM register.

[Figure packsswb-128.eps: the eight 16-bit words of xmm1 and the eight 16-bit words of xmm2/mem128 are each converted; the results from xmm1 occupy destination bits 63-0 and the results from xmm2/mem128 occupy destination bits 127-64.]

For each packed value in the destination, if the value is larger than the largest signed 8-bit integer, it is saturated to 7Fh, and if the value is smaller than the smallest signed 8-bit integer, it is saturated to 80h.

Related Instructions
PACKSSDW, PACKUSWB

rFLAGS Affected
None

MXCSR Flags Affected
None

Exceptions

Exception                    Real  Virtual-8086  Protected  Cause of Exception
Invalid opcode, #UD          X     X             X          The SSE2 instructions are not supported, as indicated by EDX bit 26 of CPUID function 0000_0001h.
                             X     X             X          The emulate bit (EM) of CR0 was set to 1.
                             X     X             X          The operating-system FXSAVE/FXRSTOR support bit (OSFXSR) of CR4 is cleared to 0.
Device not available, #NM    X     X             X          The task-switch bit (TS) of CR0 was set to 1.
Stack, #SS                   X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP      X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                                 X          A null data segment was used to reference memory.
                             X     X             X          The memory operand was not aligned on a 16-byte boundary while MXCSR.MM was cleared to 0.
Page fault, #PF                    X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC               X             X          An unaligned memory reference was performed while alignment checking was enabled and MXCSR.MM was set to 1.
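The packing and signed saturation described for PACKSSWB can be illustrated with a small scalar model. This is only a sketch of the documented semantics, not how the hardware works: the function names `saturate_signed_byte` and `packsswb` are hypothetical, and representing each operand as a Python list with index 0 as the least-significant word is an assumption made for readability.

```python
def saturate_signed_byte(value):
    """Clamp a signed 16-bit integer to the signed 8-bit range (80h..7Fh)."""
    return max(-128, min(127, value))


def packsswb(src1_words, src2_words):
    """Scalar model of PACKSSWB: pack 8 + 8 signed words into 16 signed bytes.

    Assumption: index 0 is the least-significant element of each operand.
    """
    if len(src1_words) != 8 or len(src2_words) != 8:
        raise ValueError("each operand must hold eight 16-bit words")
    low = [saturate_signed_byte(w) for w in src1_words]   # destination bits 63-0
    high = [saturate_signed_byte(w) for w in src2_words]  # destination bits 127-64
    return low + high


# Example operands chosen to hit both saturation limits.
xmm1 = [0, 127, 128, -128, -129, 32767, -32768, 5]
xmm2 = [1, -1, 300, -300, 64, -64, 127, -128]
result = packsswb(xmm1, xmm2)
print([format(b & 0xFF, "02X") for b in result])
```

Values already inside the range -128 to 127 pass through unchanged; only out-of-range words are replaced by 7Fh or 80h.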
PACKUSWB    Pack with Saturation Signed Word to Unsigned Byte

Converts each 16-bit signed integer in the first and second source operands to an 8-bit unsigned integer and packs the converted values into bytes in the destination (first source). The first source/destination operand is an XMM register and the second source operand is another XMM register or 128-bit memory location.

Converted values from the first source operand are packed into the low-order bytes of the destination, and the converted values from the second source operand are packed into the high-order bytes of the destination.

The PACKUSWB instruction is an SSE2 instruction. The presence of this instruction set is indicated by a CPUID feature bit. (See “CPUID” in Volume 3.)

Mnemonic                      Opcode         Description
PACKUSWB xmm1, xmm2/mem128    66 0F 67 /r    Packs 16-bit signed integers in an XMM register and another XMM register or 128-bit memory location into 8-bit unsigned integers in an XMM register.

[Figure packuswb-128.eps: the eight 16-bit words of xmm1 and the eight 16-bit words of xmm2/mem128 are each converted; the results from xmm1 occupy destination bits 63-0 and the results from xmm2/mem128 occupy destination bits 127-64.]

For each packed value in the destination, if the value is larger than the largest unsigned 8-bit integer, it is saturated to FFh, and if the value is smaller than the smallest unsigned 8-bit integer, it is saturated to 00h.

Related Instructions
PACKSSDW, PACKSSWB

rFLAGS Affected
None

MXCSR Flags Affected
None

Exceptions

Exception                    Real  Virtual-8086  Protected  Cause of Exception
Invalid opcode, #UD          X     X             X          The SSE2 instructions are not supported, as indicated by EDX bit 26 of CPUID function 0000_0001h.
                             X     X             X          The emulate bit (EM) of CR0 was set to 1.
                             X     X             X          The operating-system FXSAVE/FXRSTOR support bit (OSFXSR) of CR4 is cleared to 0.
Device not available, #NM    X     X             X          The task-switch bit (TS) of CR0 was set to 1.
Stack, #SS                   X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP      X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                                 X          A null data segment was used to reference memory.
                             X     X             X          The memory operand was not aligned on a 16-byte boundary while MXCSR.MM was cleared to 0.
Page fault, #PF                    X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC               X             X          An unaligned memory reference was performed while alignment checking was enabled and MXCSR.MM was set to 1.
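To make the signed-to-unsigned saturation of PACKUSWB concrete, the sketch below models both operands as 128-bit Python integers and extracts the signed words by hand. It is an illustrative model only; the helper names `words_signed` and `packuswb` are hypothetical and not part of any library, and treating a register as a single 128-bit integer is an assumption of this sketch.

```python
def words_signed(value128):
    """Split a 128-bit integer into eight signed 16-bit words, low word first."""
    words = []
    for i in range(8):
        w = (value128 >> (16 * i)) & 0xFFFF
        # Reinterpret the 16-bit pattern as a two's-complement signed value.
        words.append(w - 0x10000 if w & 0x8000 else w)
    return words


def packuswb(xmm1, xmm2):
    """Scalar model of PACKUSWB on two 128-bit integers; returns 128 bits."""
    packed = 0
    # Words from xmm1 land in bytes 0-7, words from xmm2 in bytes 8-15.
    for i, w in enumerate(words_signed(xmm1) + words_signed(xmm2)):
        byte = max(0, min(255, w))  # negative -> 00h, above 255 -> FFh
        packed |= byte << (8 * i)
    return packed


# Words of xmm1, low to high: 0000h 0001h 0080h 7FFFh 8000h FFFFh 00FFh 0100h
xmm1 = 0x0100_00FF_FFFF_8000_7FFF_0080_0001_0000
xmm2 = int("0010" * 8, 16)  # eight words, each equal to 16
print(format(packuswb(xmm1, xmm2), "032X"))
```

Note that the word FFFFh is the signed value -1, so it saturates to 00h rather than FFh, whereas 0100h (256) saturates to FFh.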
PADDB    Packed Add Bytes

Adds each packed 8-bit integer value in the first source operand to the corresponding packed 8-bit integer in the second source operand and writes the integer result of each addition in the corresponding byte of the destination (first source). The first source/destination operand is an XMM register and the second source operand is another XMM register or 128-bit memory location.

The PADDB instruction is an SSE2 instruction. The presence of this instruction set is indicated by a CPUID feature bit. (See “CPUID” in Volume 3.)

Mnemonic                   Opcode         Description
PADDB xmm1, xmm2/mem128    66 0F FC /r    Adds packed byte integer values in an XMM register and another XMM register or 128-bit memory location and writes the result in the destination XMM register.

[Figure paddb-128.eps: each of the sixteen bytes of xmm1 is added to the corresponding byte of xmm2/mem128.]

This instruction operates on both signed and unsigned integers. If the result overflows, the carry is ignored (neither the overflow nor carry bit in rFLAGS is set), and only the low-order 8 bits of each result are written in the destination.

Related Instructions
PADDD, PADDQ, PADDSB, PADDSW, PADDUSB, PADDUSW, PADDW

rFLAGS Affected
None

MXCSR Flags Affected
None

Exceptions

Exception                    Real  Virtual-8086  Protected  Cause of Exception
Invalid opcode, #UD          X     X             X          The SSE2 instructions are not supported, as indicated by EDX bit 26 of CPUID function 0000_0001h.
                             X     X             X          The emulate bit (EM) of CR0 was set to 1.
                             X     X             X          The operating-system FXSAVE/FXRSTOR support bit (OSFXSR) of CR4 is cleared to 0.
Device not available, #NM    X     X             X          The task-switch bit (TS) of CR0 was set to 1.
Stack, #SS                   X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP      X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                                 X          A null data segment was used to reference memory.
                             X     X             X          The memory operand was not aligned on a 16-byte boundary while MXCSR.MM was cleared to 0.
Page fault, #PF                    X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC               X             X          An unaligned memory reference was performed while alignment checking was enabled and MXCSR.MM was set to 1.
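The wraparound behavior of PADDB, and why one instruction serves both signed and unsigned data, can be seen in a short scalar sketch. The names `paddb` and `as_signed` are hypothetical, and holding each operand as a list of sixteen raw byte values (0 to 255) is an assumption made for illustration.

```python
def paddb(a_bytes, b_bytes):
    """Scalar model of PADDB: lane-wise 8-bit addition, carry discarded."""
    if len(a_bytes) != 16 or len(b_bytes) != 16:
        raise ValueError("each operand must hold sixteen bytes")
    # Keeping only the low-order 8 bits of each sum drops the carry.
    return [(a + b) & 0xFF for a, b in zip(a_bytes, b_bytes)]


def as_signed(byte):
    """Reinterpret a raw byte (0..255) as a two's-complement signed value."""
    return byte - 0x100 if byte & 0x80 else byte


a = [0xFF, 0x7F, 0x80, 0x01] + [0] * 12
b = [0x01, 0x01, 0x80, 0x02] + [0] * 12
total = paddb(a, b)
for x, y, s in zip(a[:4], b[:4], total[:4]):
    print(f"{x:02X} + {y:02X} = {s:02X}  (signed view: {as_signed(s)})")
```

In the first lane, FFh + 01h yields 00h: read as unsigned this is 255 + 1 wrapping to 0, and read as signed it is -1 + 1 = 0, the same bit pattern either way. In the second lane, 7Fh + 01h yields 80h, which is -128 in the signed view, showing that no saturation is applied.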
PADDD    Packed Add Doublewords

Adds each packed 32-bit integer value in the first source operand to the corresponding packed 32-bit integer in the second source operand and writes the integer result of each addition in the corresponding doubleword of the destination (first source). The first source/destination operand is an XMM register and the second source operand is another XMM register or 128-bit memory location.

The PADDD instruction is an SSE2 instruction. The presence of this instruction set is indicated by a CPUID feature bit. (See “CPUID” in Volume 3.)

Mnemonic                   Opcode         Description
PADDD xmm1, xmm2/mem128    66 0F FE /r    Adds packed 32-bit integer values in an XMM register and another XMM register or 128-bit memory location and writes the result in the destination XMM register.

[Figure paddd-128.eps: each of the four doublewords of xmm1 (bits 127-96, 95-64, 63-32, 31-0) is added to the corresponding doubleword of xmm2/mem128.]

This instruction operates on both signed and unsigned integers. If the result overflows, the carry is ignored (neither the overflow nor carry bit in rFLAGS is set), and only the low-order 32 bits of each result are written in the destination.

Related Instructions
PADDB, PADDQ, PADDSB, PADDSW, PADDUSB, PADDUSW, PADDW

rFLAGS Affected
None

MXCSR Flags Affected
None
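One consequence of the PADDD overflow rule is that a carry out of one doubleword never reaches its neighbor, so the result differs from an ordinary 128-bit addition. The sketch below illustrates this under the assumption that each operand is held as a single 128-bit Python integer; the name `paddd` is hypothetical.

```python
MASK32 = 0xFFFFFFFF
MASK128 = (1 << 128) - 1


def paddd(xmm1, xmm2):
    """Scalar model of PADDD: four independent 32-bit additions."""
    result = 0
    for lane in range(4):
        shift = 32 * lane
        a = (xmm1 >> shift) & MASK32
        b = (xmm2 >> shift) & MASK32
        # Only the low-order 32 bits of each sum are kept.
        result |= ((a + b) & MASK32) << shift
    return result


xmm1 = 0x00000001_7FFFFFFF_00000000_FFFFFFFF
xmm2 = 0x00000002_00000001_00000000_00000001
packed_sum = paddd(xmm1, xmm2)
plain_sum = (xmm1 + xmm2) & MASK128  # ordinary 128-bit add, for contrast
print(format(packed_sum, "032X"))
print(format(plain_sum, "032X"))
```

The lowest lane overflows (FFFFFFFFh + 1) and becomes 0 in both results, but only the ordinary addition lets that carry increment the next doubleword.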