Volume 4: 128-Bit Media Instructions (794098), page 37
AMD64 Technology, 26568, Rev. 3.09, July 2007

PCMPGTD    Packed Compare Greater Than Signed Doublewords

Compares corresponding packed signed 32-bit values in the first and second source operands and writes the result of each comparison in the corresponding 32 bits of the destination (first source). For each pair of doublewords, if the value in the first source operand is greater than the value in the second source operand, the result is all 1s. If the value in the first source operand is less than or equal to the value in the second source operand, the result is all 0s. The first source/destination operand is an XMM register and the second source operand is another XMM register or 128-bit memory location.

The PCMPGTD instruction is an SSE2 instruction. The presence of this instruction set is indicated by a CPUID feature bit. (See “CPUID” in Volume 3.)

Mnemonic                     Opcode        Description
PCMPGTD xmm1, xmm2/mem128    66 0F 66 /r   Compares packed signed 32-bit values in an XMM register and an XMM register or 128-bit memory location.

[Figure pcmpgtd-128.eps: each of the four doublewords of xmm1 (bits 127–96, 95–64, 63–32, 31–0) is compared with the corresponding doubleword of xmm2/mem128; each result doubleword written to xmm1 is all 1s or all 0s.]

Related Instructions
PCMPEQB, PCMPEQD, PCMPEQW, PCMPGTB, PCMPGTW

rFLAGS Affected
None

MXCSR Flags Affected
None

Exceptions

Exception                   Real  Virtual-8086  Protected  Cause of Exception
Invalid opcode, #UD          X         X            X      The SSE2 instructions are not supported, as indicated by EDX bit 26 of CPUID function 0000_0001h.
                             X         X            X      The emulate bit (EM) of CR0 was set to 1.
                             X         X            X      The operating-system FXSAVE/FXRSTOR support bit (OSFXSR) of CR4 is cleared to 0.
Device not available, #NM    X         X            X      The task-switch bit (TS) of CR0 was set to 1.
Stack, #SS                   X         X            X      A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP      X         X            X      A memory address exceeded a data segment limit or was non-canonical.
                                                    X      A null data segment was used to reference memory.
                                       X            X      The memory operand was not aligned on a 16-byte boundary while MXCSR.MM was cleared to 0.
Page fault, #PF                        X            X      A page fault resulted from the execution of the instruction.
Alignment check, #AC                   X            X      An unaligned memory reference was performed while alignment checking was enabled with MXCSR.MM set to 1.
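
The PCMPGTD comparison described above can be sketched as a scalar reference model in C. This is a minimal sketch for illustration only, not the hardware implementation: the function name pcmpgtd_model and the array representation of an XMM register (element 0 holding bits 31–0) are assumptions introduced here. Compilers that provide <emmintrin.h> generally expose the instruction itself as the intrinsic _mm_cmpgt_epi32.

```c
#include <stdint.h>

/* Hypothetical scalar model of PCMPGTD.
   dst plays the role of xmm1 (first source and destination),
   src the role of xmm2/mem128. Element 0 models bits 31-0. */
void pcmpgtd_model(int32_t dst[4], const int32_t src[4])
{
    for (int i = 0; i < 4; i++) {
        /* Signed comparison. In two's complement, -1 is the
           all-1s pattern FFFF_FFFFh; 0 is the all-0s pattern. */
        dst[i] = (dst[i] > src[i]) ? -1 : 0;
    }
}
```

Because the comparison is signed, a doubleword such as 8000_0000h in the first source is treated as the most negative value and therefore never compares greater than a non-negative doubleword in the second source.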
PCMPGTW    Packed Compare Greater Than Signed Words

Compares corresponding packed signed 16-bit values in the first and second source operands and writes the result of each comparison in the corresponding 16 bits of the destination (first source). For each pair of words, if the value in the first source operand is greater than the value in the second source operand, the result is all 1s. If the value in the first source operand is less than or equal to the value in the second source operand, the result is all 0s. The first source/destination operand is an XMM register and the second source operand is another XMM register or 128-bit memory location.

The PCMPGTW instruction is an SSE2 instruction. The presence of this instruction set is indicated by a CPUID feature bit. (See “CPUID” in Volume 3.)

Mnemonic                     Opcode        Description
PCMPGTW xmm1, xmm2/mem128    66 0F 65 /r   Compares packed signed 16-bit values in an XMM register and an XMM register or 128-bit memory location.

[Figure pcmpgtw-128.eps: each of the eight words of xmm1 (bits 127–112 through 15–0) is compared with the corresponding word of xmm2/mem128; each result word written to xmm1 is all 1s or all 0s.]

Related Instructions
PCMPEQB, PCMPEQD, PCMPEQW, PCMPGTB, PCMPGTD

rFLAGS Affected
None

MXCSR Flags Affected
None

Exceptions

Exception                   Real  Virtual-8086  Protected  Cause of Exception
Invalid opcode, #UD          X         X            X      The SSE2 instructions are not supported, as indicated by EDX bit 26 of CPUID function 0000_0001h.
                             X         X            X      The emulate bit (EM) of CR0 was set to 1.
                             X         X            X      The operating-system FXSAVE/FXRSTOR support bit (OSFXSR) of CR4 is cleared to 0.
Device not available, #NM    X         X            X      The task-switch bit (TS) of CR0 was set to 1.
Stack, #SS                   X         X            X      A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP      X         X            X      A memory address exceeded a data segment limit or was non-canonical.
                                                    X      A null data segment was used to reference memory.
                                       X            X      The memory operand was not aligned on a 16-byte boundary while MXCSR.MM was cleared to 0.
Page fault, #PF                        X            X      A page fault resulted from the execution of the instruction.
Alignment check, #AC                   X            X      An unaligned memory reference was performed while alignment checking was enabled with MXCSR.MM set to 1.
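
The all-1s or all-0s result of PCMPGTW is convenient as a per-word mask. The following C sketch models the comparison and then, purely as an illustration, uses the mask to select the larger word of each pair without branching. Both function names (pcmpgtw_model, max_words_model) and the array representation of an XMM register are assumptions made for this example, and the mask-select use is one possible application rather than anything this reference prescribes.

```c
#include <stdint.h>

/* Hypothetical scalar model of PCMPGTW over eight signed words.
   dst is both first source and destination; element 0 models bits 15-0. */
void pcmpgtw_model(int16_t dst[8], const int16_t src[8])
{
    for (int i = 0; i < 8; i++)
        dst[i] = (int16_t)((dst[i] > src[i]) ? -1 : 0);
}

/* Illustrative helper (not part of the instruction): per-word signed
   maximum built from the comparison mask with AND / AND-NOT / OR. */
void max_words_model(int16_t out[8], const int16_t a[8], const int16_t b[8])
{
    int16_t mask[8];

    for (int i = 0; i < 8; i++)
        mask[i] = a[i];
    pcmpgtw_model(mask, b);            /* mask[i] is all 1s where a[i] > b[i] */

    for (int i = 0; i < 8; i++) {
        /* Keep a[i] where the mask is all 1s, b[i] where it is all 0s. */
        out[i] = (int16_t)((a[i] & mask[i]) | (b[i] & ~mask[i]));
    }
}
```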
PEXTRW    Extract Packed Word

Extracts a 16-bit value from an XMM register, as selected by the immediate byte operand (as shown in Table 1-2), and writes it to the low-order word of a 32-bit general-purpose register, with zero-extension to 32 bits.

The PEXTRW instruction is an SSE instruction. The presence of this instruction set is indicated by a CPUID feature bit. (See “CPUID” in Volume 3.)

Mnemonic                   Opcode           Description
PEXTRW reg32, xmm, imm8    66 0F C5 /r ib   Extracts a 16-bit value from an XMM register and writes it to the low-order 16 bits of a general-purpose register.

[Figure pextrw-128.eps: bits 2–0 of imm8 drive a multiplexer that selects one of the eight words of xmm; the selected word is written to bits 15–0 of reg32, and bits 31–16 of reg32 are filled with 0s.]

Table 1-2. Immediate-Byte Operand Encoding for 128-Bit PEXTRW

Immediate-Byte Bit Field   Value of Bit Field   Source Bits Extracted
2–0                        0                    15–0
                           1                    31–16
                           2                    47–32
                           3                    63–48
                           4                    79–64
                           5                    95–80
                           6                    111–96
                           7                    127–112

Related Instructions
PINSRW

rFLAGS Affected
None

MXCSR Flags Affected
None

Exceptions

Exception                   Real  Virtual-8086  Protected  Cause of Exception
Invalid opcode, #UD          X         X            X      The SSE instructions are not supported, as indicated by EDX bit 25 in CPUID function 0000_0001h.
                             X         X            X      The emulate bit (EM) of CR0 was set to 1.
                             X         X            X      The operating-system FXSAVE/FXRSTOR support bit (OSFXSR) of CR4 is cleared to 0.
Device not available, #NM    X         X            X      The task-switch bit (TS) of CR0 was set to 1.
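
The selection in Table 1-2 amounts to indexing the eight words of the source register with bits 2–0 of the immediate byte. A minimal C sketch of that behavior follows; the function name pextrw_model and the array representation of the XMM register (element 0 holding bits 15–0) are assumptions for illustration. Compilers that provide <emmintrin.h> generally expose the instruction as the intrinsic _mm_extract_epi16.

```c
#include <stdint.h>

/* Hypothetical scalar model of 128-bit PEXTRW.
   xmm[0] models bits 15-0, xmm[7] models bits 127-112. */
uint32_t pextrw_model(const uint16_t xmm[8], uint8_t imm8)
{
    unsigned sel = imm8 & 0x7u;      /* Table 1-2 uses only bits 2-0 of imm8 */
    return (uint32_t)xmm[sel];       /* zero-extended to 32 bits */
}
```

For example, with imm8 = 3 the model returns source bits 63–48 in the low word of the result, and the upper 16 bits of the result are 0.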
PINSRW    Packed Insert Word

Inserts a 16-bit value from the low-order word of a 32-bit general-purpose register or a 16-bit memory location into an XMM register. The location in the destination register is selected by the immediate byte operand, as shown in Table 1-3. The other words in the destination register operand are not modified.

The PINSRW instruction is an SSE instruction. The presence of this instruction set is indicated by a CPUID feature bit. (See “CPUID” in Volume 3.)

Mnemonic                         Opcode           Description
PINSRW xmm, reg32/mem16, imm8    66 0F C4 /r ib   Inserts a 16-bit value from a general-purpose register or memory location into an XMM register.

[Figure pinsrw-128.eps: bits 15–0 of reg32/mem16 are inserted into one of the eight words of xmm; imm8 selects the word position for the insert.]

Table 1-3. Immediate-Byte Operand Encoding for 128-Bit PINSRW

Immediate-Byte Bit Field   Value of Bit Field   Destination Bits Filled
2–0                        0                    15–0
                           1                    31–16
                           2                    47–32
                           3                    63–48
                           4                    79–64
                           5                    95–80
                           6                    111–96
                           7                    127–112

Related Instructions
PEXTRW

rFLAGS Affected
None

MXCSR Flags Affected
None

Exceptions

Exception                   Real  Virtual-8086  Protected  Cause of Exception
Invalid opcode, #UD          X         X            X      The SSE instructions are not supported, as indicated by EDX bit 25 in CPUID function 0000_0001h.
                             X         X            X      The emulate bit (EM) of CR0 was set to 1.
                             X         X            X      The operating-system FXSAVE/FXRSTOR support bit (OSFXSR) of CR4 is cleared to 0.
Device not available, #NM    X         X            X      The task-switch bit (TS) of CR0 was set to 1.
Stack, #SS                   X         X            X      A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP      X         X            X      A memory address exceeded a data segment limit or was non-canonical.
                                                    X      A null data segment was used to reference memory.
Page fault, #PF                        X            X      A page fault resulted from the execution of the instruction.
Alignment check, #AC                   X            X      An unaligned memory reference was performed while alignment checking was enabled.
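
The insertion selected by Table 1-3 can be sketched in C as a single indexed store. This is an illustrative model only: the function name pinsrw_model and the array representation of the XMM register (element 0 holding bits 15–0) are assumptions, and only the register-source form is modeled. Compilers that provide <emmintrin.h> generally expose the instruction as the intrinsic _mm_insert_epi16.

```c
#include <stdint.h>

/* Hypothetical scalar model of 128-bit PINSRW with a reg32 source.
   xmm[0] models bits 15-0, xmm[7] models bits 127-112. */
void pinsrw_model(uint16_t xmm[8], uint32_t reg32, uint8_t imm8)
{
    unsigned sel = imm8 & 0x7u;                /* Table 1-3 uses only bits 2-0 of imm8 */
    xmm[sel] = (uint16_t)(reg32 & 0xFFFFu);    /* low-order word of reg32; other words untouched */
}
```

Only the selected word changes; the upper 16 bits of the general-purpose register do not take part in the operation.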
PMADDWD    Packed Multiply Words and Add Doublewords

Multiplies each packed 16-bit signed value in the first source operand by the corresponding packed 16-bit signed value in the second source operand, adds the adjacent intermediate 32-bit results of each multiplication (for example, the multiplication results for the adjacent bit fields 63–48 and 47–32, and 31–16 and 15–0), and writes the 32-bit result of each addition in the corresponding doubleword of the destination (first source).
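
The multiply-and-add step described above can be sketched as a scalar model in C. The function name pmaddwd_model, the separate output array (in the instruction the destination is the first source register), and the array layout (element 0 holding the lowest-order bits) are assumptions made for illustration. The sketch also assumes that a sum too large for a signed doubleword is truncated to its low 32 bits, which the passage above does not state. Compilers that provide <emmintrin.h> generally expose the instruction as the intrinsic _mm_madd_epi16.

```c
#include <stdint.h>

/* Hypothetical scalar model of PMADDWD.
   a and b each hold eight signed words; out receives four signed
   doublewords. out[i] = a[2i]*b[2i] + a[2i+1]*b[2i+1]. */
void pmaddwd_model(int32_t out[4], const int16_t a[8], const int16_t b[8])
{
    for (int i = 0; i < 4; i++) {
        /* Widen to 64 bits so neither the products nor their sum overflow in C. */
        int64_t lo = (int64_t)a[2 * i]     * (int64_t)b[2 * i];
        int64_t hi = (int64_t)a[2 * i + 1] * (int64_t)b[2 * i + 1];

        /* Assumption: keep the low 32 bits of the sum. The conversion to
           int32_t relies on two's-complement behavior of the compiler. */
        out[i] = (int32_t)(uint32_t)(lo + hi);
    }
}
```

For example, words (1, 2) multiplied by (10, 20) yield the products 10 and 40, so the corresponding result doubleword is 50.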