Volume 4 128-Bit Media Instructions (794098), page 28
3.09—July 2007    AMD64 Technology

rFLAGS Affected
None

MXCSR Flags Affected
None

Exceptions

Exception                   Real  Virtual-8086  Protected  Cause of Exception
Invalid opcode, #UD         X     X             X          The SSE instructions are not supported, as indicated by EDX bit 25 of CPUID function 0000_0001h.
                            X     X             X          The emulate bit (EM) of CR0 was set to 1.
                            X     X             X          The operating-system FXSAVE/FXRSTOR support bit (OSFXSR) of the control register (CR4) was cleared to 0.
Device not available, #NM   X     X             X          The task-switch bit (TS) of CR0 was set to 1.
Stack, #SS                  X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP     X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                                X          A null data segment was used to reference memory.
                                                X          The destination operand was in a non-writable segment.
Page fault, #PF                   X             X          A page fault resulted from the execution of the instruction.
Alignment check, #AC              X             X          An unaligned memory reference was performed while alignment checking was enabled.

MOVMSKPD    Extract Packed Double-Precision Floating-Point Sign Mask

Moves the sign bits of two packed double-precision floating-point values in an XMM register to the two low-order bits of a 32-bit general-purpose register, with zero-extension.

The MOVMSKPD instruction is an SSE2 instruction. The presence of this instruction set is indicated by a CPUID feature bit. (See “CPUID” in Volume 3.)

Mnemonic               Opcode        Description
MOVMSKPD reg32, xmm    66 0F 50 /r   Move sign bits in an XMM register to a 32-bit general-purpose register.

[Figure movmskpd.eps: the sign bits at xmm bits 127 and 63 are copied to reg32 bits 1 and 0; reg32 bits 31-2 are 0.]

Related Instructions
MOVMSKPS, PMOVMSKB

rFLAGS Affected
None

MXCSR Flags Affected
None

Exceptions

Exception (vector)          Real  Virtual-8086  Protected  Cause of Exception
Invalid opcode, #UD         X     X             X          The SSE2 instructions are not supported, as indicated by EDX bit 26 of CPUID function 0000_0001h.
                            X     X             X          The emulate bit (EM) of CR0 was set to 1.
                            X     X             X          The operating-system FXSAVE/FXRSTOR support bit (OSFXSR) of CR4 was cleared to 0.
Device not available, #NM   X     X             X          The task-switch bit (TS) of CR0 was set to 1.
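
As an illustration of the MOVMSKPD sign-mask behavior described above, the following C sketch compares a scalar model of the operation with the compiler intrinsic `_mm_movemask_pd`. The intrinsic comes from the compiler's `<emmintrin.h>` header, not from this manual, and a compiler typically (but is not required to) emit MOVMSKPD for it. The function names `sign_mask_pd_model` and `sign_mask_pd` are hypothetical, and the non-SSE2 fallback path is an assumption added so the sketch builds on other targets.

```c
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>   /* SSE2 intrinsics */
#define HAVE_SSE2 1
#endif

/* Scalar model of MOVMSKPD: result bit 0 is the sign of v[0],
   result bit 1 is the sign of v[1], all higher bits are zero. */
static unsigned sign_mask_pd_model(const double v[2])
{
    uint64_t lo, hi;
    memcpy(&lo, &v[0], sizeof lo);   /* reinterpret the IEEE-754 bit pattern */
    memcpy(&hi, &v[1], sizeof hi);
    return (unsigned)(lo >> 63) | ((unsigned)(hi >> 63) << 1);
}

/* Same result through the intrinsic where SSE2 is available. */
static unsigned sign_mask_pd(const double v[2])
{
#ifdef HAVE_SSE2
    __m128d x = _mm_loadu_pd(v);           /* v[0] lands in bits 63:0 */
    return (unsigned)_mm_movemask_pd(x);
#else
    return sign_mask_pd_model(v);          /* assumed fallback, not MOVMSKPD */
#endif
}
```

Because only the sign bit is examined, a value of -0.0 or a NaN with its sign bit set also contributes a 1 to the mask.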
MOVMSKPS    Extract Packed Single-Precision Floating-Point Sign Mask

Moves the sign bits of four packed single-precision floating-point values in an XMM register to the four low-order bits of a 32-bit general-purpose register, with zero-extension.

The MOVMSKPS instruction is an SSE instruction. The presence of this instruction set is indicated by a CPUID feature bit. (See “CPUID” in Volume 3.)

Mnemonic               Opcode        Description
MOVMSKPS reg32, xmm    0F 50 /r      Move sign bits in an XMM register to a 32-bit general-purpose register.

[Figure movmskps.eps: the sign bits at xmm bits 127, 95, 63, and 31 are copied to reg32 bits 3, 2, 1, and 0; reg32 bits 31-4 are 0.]

Related Instructions
MOVMSKPD, PMOVMSKB

rFLAGS Affected
None

MXCSR Flags Affected
None

Exceptions

Exception                   Real  Virtual-8086  Protected  Cause of Exception
Invalid opcode, #UD         X     X             X          The SSE instructions are not supported, as indicated by EDX bit 25 of CPUID function 0000_0001h.
                            X     X             X          The emulate bit (EM) of CR0 was set to 1.
                            X     X             X          The operating-system FXSAVE/FXRSTOR support bit (OSFXSR) of CR4 was cleared to 0.
Device not available, #NM   X     X             X          The task-switch bit (TS) of CR0 was set to 1.
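
One common use of the MOVMSKPS result is to turn a four-lane sign test into a single integer test. The C sketch below shows that idea with the compiler intrinsic `_mm_movemask_ps` from `<xmmintrin.h>`, which a compiler typically maps to MOVMSKPS. The names `sign_mask_ps` and `any_sign_bit_set` are hypothetical, and the scalar fallback is an assumption for targets without SSE.

```c
#include <stdint.h>
#include <string.h>

#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>   /* SSE intrinsics */
#define HAVE_SSE 1
#endif

/* Returns a 4-bit mask: bit i is the sign bit of v[i]. */
static unsigned sign_mask_ps(const float v[4])
{
#ifdef HAVE_SSE
    return (unsigned)_mm_movemask_ps(_mm_loadu_ps(v));
#else
    /* Assumed scalar fallback that models the same result. */
    unsigned mask = 0;
    for (int i = 0; i < 4; ++i) {
        uint32_t bits;
        memcpy(&bits, &v[i], sizeof bits);
        mask |= (unsigned)(bits >> 31) << i;
    }
    return mask;
#endif
}

/* Nonzero if at least one of the four values has its sign bit set. */
static int any_sign_bit_set(const float v[4])
{
    return sign_mask_ps(v) != 0;
}
```

The helper tests sign bits rather than "less than zero", so -0.0f counts as set.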
MOVNTDQ    Move Non-Temporal Double Quadword

Stores a 128-bit (double quadword) XMM register value into a 128-bit memory location. This instruction indicates to the processor that the data is non-temporal, and is unlikely to be used again soon. The processor treats the store as a write-combining (WC) memory write, which minimizes cache pollution. The exact method by which cache pollution is minimized depends on the hardware implementation of the instruction. For further information, see “Memory Optimization” in Volume 1.

MOVNTDQ is weakly ordered with respect to other instructions that operate on memory. Software should use an SFENCE instruction to force strong memory ordering of MOVNTDQ with respect to other stores.

The MOVNTDQ instruction is an SSE2 instruction. The presence of this instruction set is indicated by a CPUID feature bit. (See “CPUID” in Volume 3.)

Mnemonic               Opcode        Description
MOVNTDQ mem128, xmm    66 0F E7 /r   Stores a 128-bit XMM register value into a 128-bit memory location, minimizing cache pollution.

[Figure movntdq.eps: bits 127-0 of xmm are copied to bits 127-0 of mem128.]

Related Instructions
MOVNTI, MOVNTPD, MOVNTPS, MOVNTQ

rFLAGS Affected
None

MXCSR Flags Affected
None

Exceptions

Exception                   Real  Virtual-8086  Protected  Cause of Exception
Invalid opcode, #UD         X     X             X          The SSE2 instructions are not supported, as indicated by EDX bit 26 of CPUID function 0000_0001h.
                            X     X             X          The emulate bit (CR0.EM) was set to 1.
                            X     X             X          The operating-system FXSAVE/FXRSTOR support bit (CR4.OSFXSR) was cleared to 0.
Device not available, #NM   X     X             X          The task-switch bit (CR0.TS) was set to 1.
Stack, #SS                  X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP     X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                                X          A null data segment was used to reference memory.
                                                X          The destination operand was in a non-writable segment.
                            X     X             X          The memory operand was not aligned on a 16-byte boundary.
Page fault, #PF                   X             X          A page fault resulted from executing the instruction.
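
The MOVNTDQ usage pattern described above (aligned non-temporal stores followed by SFENCE) can be sketched in C with the compiler intrinsics `_mm_stream_si128` and `_mm_sfence`. These intrinsics come from the compiler headers rather than from this manual, and a compiler typically emits MOVNTDQ and SFENCE for them. The function name `stream_copy_128` is hypothetical, and the `memcpy` fallback is an assumption for targets without SSE2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__) || defined(_M_X64)
#include <xmmintrin.h>   /* _mm_sfence */
#include <emmintrin.h>   /* _mm_stream_si128, _mm_loadu_si128 */
#define HAVE_SSE2 1
#endif

/* Copies n_blocks 16-byte blocks from src to dst using non-temporal stores.
   dst must be 16-byte aligned: per the exception table, a misaligned
   memory operand raises #GP. src may be unaligned. */
static void stream_copy_128(void *dst, const void *src, size_t n_blocks)
{
    assert(((uintptr_t)dst & 15u) == 0);
#ifdef HAVE_SSE2
    __m128i *d = (__m128i *)dst;
    const __m128i *s = (const __m128i *)src;
    for (size_t i = 0; i < n_blocks; ++i) {
        __m128i x = _mm_loadu_si128(s + i);   /* ordinary unaligned load */
        _mm_stream_si128(d + i, x);           /* non-temporal 128-bit store */
    }
    _mm_sfence();   /* order the weakly ordered stores before later stores */
#else
    memcpy(dst, src, n_blocks * 16);          /* assumed fallback */
#endif
}
```

A single SFENCE after the loop is usually enough; fencing after every store would defeat the purpose of write combining.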
MOVNTPD    Move Non-Temporal Packed Double-Precision Floating-Point

Stores two double-precision floating-point XMM register values into a 128-bit memory location. This instruction indicates to the processor that the data is non-temporal, and is unlikely to be used again soon. The processor treats the store as a write-combining (WC) memory write, which minimizes cache pollution. The exact method by which cache pollution is minimized depends on the hardware implementation of the instruction. For further information, see “Memory Optimization” in Volume 1.

The MOVNTPD instruction is an SSE2 instruction. The presence of this instruction set is indicated by a CPUID feature bit. (See “CPUID” in Volume 3.)

Mnemonic               Opcode        Description
MOVNTPD mem128, xmm    66 0F 2B /r   Stores two packed double-precision floating-point XMM register values into a 128-bit memory location, minimizing cache pollution.

[Figure movntpd.eps: bits 127-64 and bits 63-0 of xmm are copied to the corresponding bits of mem128.]

MOVNTPD is weakly ordered with respect to other instructions that operate on memory. Software should use an SFENCE instruction to force strong memory ordering of MOVNTPD with respect to other stores.

Related Instructions
MOVNTDQ, MOVNTI, MOVNTPS, MOVNTQ

rFLAGS Affected
None

MXCSR Flags Affected
None

Exceptions

Exception                   Real  Virtual-8086  Protected  Cause of Exception
Invalid opcode, #UD         X     X             X          The SSE2 instructions are not supported, as indicated by EDX bit 26 of CPUID function 0000_0001h.
                            X     X             X          The emulate bit (CR0.EM) was set to 1.
                            X     X             X          The operating-system FXSAVE/FXRSTOR support bit (CR4.OSFXSR) was cleared to 0.
Device not available, #NM   X     X             X          The task-switch bit (CR0.TS) was set to 1.
Stack, #SS                  X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP     X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                                X          A null data segment was used to reference memory.
                                                X          The destination operand was in a non-writable segment.
                            X     X             X          The memory operand was not aligned on a 16-byte boundary.
Page fault, #PF                   X             X          A page fault resulted from executing the instruction.
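
To illustrate the MOVNTPD store described above, here is a minimal C sketch that fills a large output array with a constant, two doubles per store, using the compiler intrinsic `_mm_stream_pd` (typically compiled to MOVNTPD). The function name `stream_fill_pd` is hypothetical, the intrinsic names come from the compiler's `<emmintrin.h>` rather than from this manual, and the scalar loop is an assumed fallback for targets without SSE2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__) || defined(_M_X64)
#include <xmmintrin.h>   /* _mm_sfence */
#include <emmintrin.h>   /* _mm_stream_pd, _mm_set1_pd */
#define HAVE_SSE2 1
#endif

/* Writes `value` into 2 * n_pairs doubles starting at dst.
   dst must be 16-byte aligned, as the instruction requires. */
static void stream_fill_pd(double *dst, size_t n_pairs, double value)
{
    assert(((uintptr_t)dst & 15u) == 0);
#ifdef HAVE_SSE2
    const __m128d v = _mm_set1_pd(value);     /* both lanes = value */
    for (size_t i = 0; i < n_pairs; ++i)
        _mm_stream_pd(dst + 2 * i, v);        /* non-temporal 128-bit store */
    _mm_sfence();                             /* make the stores ordered */
#else
    for (size_t i = 0; i < 2 * n_pairs; ++i)  /* assumed fallback */
        dst[i] = value;
#endif
}
```

Non-temporal fills of this kind tend to pay off only when the destination is larger than the cache and will not be read back soon; for small buffers ordinary stores are usually the better choice.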
MOVNTPS    Move Non-Temporal Packed Single-Precision Floating-Point

Stores four single-precision floating-point XMM register values into a 128-bit memory location. This instruction indicates to the processor that the data is non-temporal, and is unlikely to be used again soon. The processor treats the store as a write-combining (WC) memory write, which minimizes cache pollution. The exact method by which cache pollution is minimized depends on the hardware implementation of the instruction. For further information, see “Memory Optimization” in Volume 1.

The MOVNTPS instruction is an SSE instruction. The presence of this instruction set is indicated by a CPUID feature bit. (See “CPUID” in Volume 3.)

Mnemonic               Opcode        Description
MOVNTPS mem128, xmm    0F 2B /r      Stores four packed single-precision floating-point XMM register values into a 128-bit memory location, minimizing cache pollution.

[Figure movntps.eps: bits 127-96, 95-64, 63-32, and 31-0 of xmm are copied to the corresponding bits of mem128.]

MOVNTPS is weakly ordered with respect to other instructions that operate on memory. Software should use an SFENCE instruction to force strong memory ordering of MOVNTPS with respect to other stores.

Related Instructions
MOVNTDQ, MOVNTI, MOVNTPD, MOVNTQ

rFLAGS Affected
None

MXCSR Flags Affected
None

Exceptions

Exception                   Real  Virtual-8086  Protected  Cause of Exception
Invalid opcode, #UD         X     X             X          The SSE instructions are not supported, as indicated by EDX bit 25 of CPUID function 0000_0001h.
                            X     X             X          The emulate bit (CR0.EM) was set to 1.
                            X     X             X          The operating-system FXSAVE/FXRSTOR support bit (CR4.OSFXSR) was cleared to 0.
Device not available, #NM   X     X             X          The task-switch bit (CR0.TS) was set to 1.
Stack, #SS                  X     X             X          A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP     X     X             X          A memory address exceeded a data segment limit or was non-canonical.
                                                X          A null data segment was used to reference memory.
                                                X          The destination operand was in a non-writable segment.
                            X     X             X          The memory operand was not aligned on a 16-byte boundary.
Page fault, #PF                   X             X          A page fault resulted from executing the instruction.
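
A typical setting for MOVNTPS is a streaming kernel that computes results once and writes them out without reading them again. The C sketch below scales an array of floats and writes each group of four with the compiler intrinsic `_mm_stream_ps` (typically compiled to MOVNTPS), then issues one SFENCE. The name `scale_stream_ps` is hypothetical, the intrinsics come from the compiler's `<xmmintrin.h>` rather than from this manual, and the scalar loop is an assumed fallback for targets without SSE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>   /* SSE intrinsics, including _mm_sfence */
#define HAVE_SSE 1
#endif

/* dst[i] = src[i] * k for 4 * n_quads elements.
   dst must be 16-byte aligned; src may be unaligned. */
static void scale_stream_ps(float *dst, const float *src,
                            size_t n_quads, float k)
{
    assert(((uintptr_t)dst & 15u) == 0);
#ifdef HAVE_SSE
    const __m128 kk = _mm_set1_ps(k);
    for (size_t i = 0; i < n_quads; ++i) {
        __m128 x = _mm_loadu_ps(src + 4 * i);           /* normal load */
        _mm_stream_ps(dst + 4 * i, _mm_mul_ps(x, kk));  /* non-temporal store */
    }
    _mm_sfence();   /* order the streamed results before later stores */
#else
    for (size_t i = 0; i < 4 * n_quads; ++i)            /* assumed fallback */
        dst[i] = src[i] * k;
#endif
}
```

Only the stores bypass the cache here; the loads of `src` are ordinary cached reads.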
MOVNTSD    Move Non-Temporal Scalar Double-Precision Floating-Point

Stores one double-precision floating-point XMM register value into a 64-bit memory location. This instruction indicates to the processor that the data is non-temporal, and is unlikely to be used again soon.