Volume 2A Instruction Set Reference A-M (794101), page 100
INSTRUCTION SET REFERENCE, A-M

LTR—Load Task Register

#GP(selector)      If the source selector points to a segment that is not a TSS or to one for a task that is already busy.
                   If the selector points to LDT or is beyond the GDT limit.
                   If the descriptor type of the upper 8 bytes of the 16-byte descriptor is non-zero.
#NP(selector)      If the TSS is marked not present.
#PF(fault-code)    If a page fault occurs.
#UD                If the LOCK prefix is used.

MASKMOVDQU—Store Selected Bytes of Double Quadword

Opcode       Instruction            Op/En  64-Bit  Compat/   Description
                                           Mode    Leg Mode
66 0F F7 /r  MASKMOVDQU xmm1, xmm2  A      Valid   Valid     Selectively write bytes from
                                                             xmm1 to memory location
                                                             using the byte mask in
                                                             xmm2. The default memory
                                                             location is specified by
                                                             DS:EDI.

Instruction Operand Encoding

Op/En  Operand 1      Operand 2      Operand 3  Operand 4
A      ModRM:reg (r)  ModRM:r/m (r)  NA         NA

Description

Stores selected bytes from the source operand (first operand) into a 128-bit memory location.
The mask operand (second operand) selects which bytes from the source operand are written to memory. The source and mask operands are XMM registers. The location of the first byte of the memory location is specified by DI/EDI and DS registers. The memory location does not need to be aligned on a natural boundary. (The size of the store address depends on the address-size attribute.)

The most significant bit in each byte of the mask operand determines whether the corresponding byte in the source operand is written to the corresponding byte location in memory: 0 indicates no write and 1 indicates write.

The MASKMOVDQU instruction generates a non-temporal hint to the processor to minimize cache pollution.
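The byte-selection rule described above can be sketched in C through the `_mm_maskmoveu_si128` intrinsic. This is a minimal sketch, not a definitive implementation: the helper name `masked_store16` and the `HAVE_SSE2_MASKMOV` macro are hypothetical, the scalar loop is only a fallback for builds without SSE2, and the trailing `_mm_sfence()` is a conservative choice made here because the store is weakly ordered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  /* _mm_loadu_si128, _mm_maskmoveu_si128, _mm_sfence */
#define HAVE_SSE2_MASKMOV 1
#else
#define HAVE_SSE2_MASKMOV 0
#endif

/* Hypothetical helper (not part of the manual): for each i in 0..15,
 * write src[i] to dst[i] only if the most significant bit (0x80) of
 * mask[i] is set. Unselected destination bytes are left unchanged.
 * dst does not need to be aligned. */
static void masked_store16(uint8_t *dst, const uint8_t src[16],
                           const uint8_t mask[16])
{
    assert(dst != NULL && src != NULL && mask != NULL);
#if HAVE_SSE2_MASKMOV
    __m128i s = _mm_loadu_si128((const __m128i *)src);
    __m128i m = _mm_loadu_si128((const __m128i *)mask);
    _mm_maskmoveu_si128(s, m, (char *)dst);  /* compiles to MASKMOVDQU */
    _mm_sfence();  /* assumption: fence after the non-temporal store */
#else
    /* Scalar fallback with the same byte-selection rule. */
    for (int i = 0; i < 16; i++) {
        if (mask[i] & 0x80) {
            dst[i] = src[i];
        }
    }
#endif
}
```

Only bit 7 of each mask byte matters, so a mask byte of 0x7F selects nothing while 0x80 and 0xFF both select the byte.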
The non-temporal hint is implemented by using a write combining (WC) memory type protocol (see “Caching of Temporal vs. Non-Temporal Data” in Chapter 10 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1). Because the WC protocol uses a weakly-ordered memory consistency model, a fencing operation implemented with the SFENCE or MFENCE instruction should be used in conjunction with MASKMOVDQU instructions if multiple processors might use different memory types to read/write the destination memory locations.

Behavior with a mask of all 0s is as follows:
• No data will be written to memory.
• Exceptions associated with addressing memory and page faults may still be signaled (implementation dependent).
• Signaling of breakpoints (code or data) is not guaranteed; different processor implementations may signal or not signal these breakpoints.
• If the destination memory region is mapped as UC or WP, enforcement of associated semantics for these memory types is not guaranteed (that is, is reserved) and is implementation-specific.

The MASKMOVDQU instruction can be used to improve performance of algorithms that need to merge data on a byte-by-byte basis. MASKMOVDQU should not cause a read for ownership; doing so generates unnecessary bandwidth since data is to be written directly using the byte-mask without allocating old data prior to the store.

In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).

Operation

IF (MASK[7] = 1)
    THEN DEST[DI/EDI] ← SRC[7:0] ELSE (* Memory location unchanged *); FI;
IF (MASK[15] = 1)
    THEN DEST[DI/EDI +1] ← SRC[15:8] ELSE (* Memory location unchanged *); FI;
    (* Repeat operation for 3rd through 15th bytes in source operand *)
IF (MASK[127] = 1)
    THEN DEST[DI/EDI +15] ← SRC[127:120] ELSE (* Memory location unchanged *); FI;

Intel C/C++ Compiler Intrinsic Equivalent

void _mm_maskmoveu_si128(__m128i d, __m128i n, char * p)

Protected Mode Exceptions

#GP(0)             For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments (even if mask is all 0s).
                   If the destination operand is in a nonwritable segment.
                   If the DS, ES, FS, or GS register contains a NULL segment selector.
#SS(0)             For an illegal address in the SS segment (even if mask is all 0s).
#PF(fault-code)    For a page fault (implementation specific).
#NM                If CR0.TS[bit 3] = 1.
#UD                If CR0.EM[bit 2] = 1.
                   If CR4.OSFXSR[bit 9] = 0.
                   If CPUID.01H:EDX.SSE2[bit 26] = 0.
                   If the LOCK prefix is used.

Real-Address Mode Exceptions

#GP                If any part of the operand lies outside the effective address space from 0 to FFFFH (even if mask is all 0s).
#NM                If CR0.TS[bit 3] = 1.
#UD                If CR0.EM[bit 2] = 1.
                   If CR4.OSFXSR[bit 9] = 0.
                   If CPUID.01H:EDX.SSE2[bit 26] = 0.
                   If the LOCK prefix is used.

Virtual-8086 Mode Exceptions

Same exceptions as in real address mode.
#PF(fault-code)    For a page fault (implementation specific).
#UD                If the LOCK prefix is used.

Compatibility Mode Exceptions

Same exceptions as in protected mode.

64-Bit Mode Exceptions

#GP(0)             If the memory address is in a non-canonical form.
#SS(0)             If a memory address referencing the SS segment is in a non-canonical form.
#PF(fault-code)    For a page fault (implementation specific).
#NM                If CR0.TS[bit 3] = 1.
#UD                If CR0.EM[bit 2] = 1.
                   If CR4.OSFXSR[bit 9] = 0.
                   If CPUID.01H:EDX.SSE2[bit 26] = 0.
                   If the LOCK prefix is used.

MASKMOVQ—Store Selected Bytes of Quadword

Opcode    Instruction        Op/En  64-Bit  Compat/   Description
                                    Mode    Leg Mode
0F F7 /r  MASKMOVQ mm1, mm2  A      Valid   Valid     Selectively write bytes from
                                                      mm1 to memory location
                                                      using the byte mask in mm2.
                                                      The default memory
                                                      location is specified by
                                                      DS:EDI.

Instruction Operand Encoding

Op/En  Operand 1      Operand 2      Operand 3  Operand 4
A      ModRM:reg (r)  ModRM:r/m (r)  NA         NA

Description

Stores selected bytes from the source operand (first operand) into a 64-bit memory location.
The mask operand (second operand) selects which bytes from the source operand are written to memory. The source and mask operands are MMX technology registers. The location of the first byte of the memory location is specified by DI/EDI and DS registers. (The size of the store address depends on the address-size attribute.)

The most significant bit in each byte of the mask operand determines whether the corresponding byte in the source operand is written to the corresponding byte location in memory: 0 indicates no write and 1 indicates write.

The MASKMOVQ instruction generates a non-temporal hint to the processor to minimize cache pollution.
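To make the per-byte rule concrete, the following is a portable scalar model of the 8-byte case. It is an illustrative sketch only: the function name `maskmovq_model` is hypothetical, the source and mask are passed as plain 64-bit integers rather than MMX registers, and the model covers only which bytes get written. It does not model the non-temporal hint, processor state changes, or exceptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference model (not an Intel API): byte i of src is
 * written to dst[i] only when bit 8*i+7 of mask, the most significant
 * bit of mask byte i, is 1. Returns the number of bytes written. */
static unsigned maskmovq_model(uint8_t *dst, uint64_t src, uint64_t mask)
{
    unsigned written = 0;

    assert(dst != NULL);
    for (unsigned i = 0; i < 8; i++) {
        if ((mask >> (8 * i + 7)) & 1u) {
            dst[i] = (uint8_t)(src >> (8 * i));  /* byte i of the source */
            written++;
        }
        /* else: memory location unchanged */
    }
    return written;
}
```

With a mask of all 0s the loop writes nothing and returns 0, which corresponds to the "no data will be written" case.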
The non-temporal hint is implemented by using a write combining (WC) memory type protocol (see “Caching of Temporal vs. Non-Temporal Data” in Chapter 10 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1). Because the WC protocol uses a weakly-ordered memory consistency model, a fencing operation implemented with the SFENCE or MFENCE instruction should be used in conjunction with MASKMOVQ instructions if multiple processors might use different memory types to read/write the destination memory locations.

This instruction causes a transition from x87 FPU to MMX technology state (that is, the x87 FPU top-of-stack pointer is set to 0 and the x87 FPU tag word is set to all 0s [valid]).

The behavior of the MASKMOVQ instruction with a mask of all 0s is as follows:
• No data will be written to memory.
• Transition from x87 FPU to MMX technology state will occur.
• Exceptions associated with addressing memory and page faults may still be signaled (implementation dependent).
• Signaling of breakpoints (code or data) is not guaranteed (implementation dependent).
• If the destination memory region is mapped as UC or WP, enforcement of associated semantics for these memory types is not guaranteed (that is, is reserved) and is implementation-specific.

The MASKMOVQ instruction can be used to improve performance for algorithms that need to merge data on a byte-by-byte basis. It should not cause a read for ownership; doing so generates unnecessary bandwidth since data is to be written directly using the byte-mask without allocating old data prior to the store.

In 64-bit mode, the memory address is specified by DS:RDI.

Operation

IF (MASK[7] = 1)
    THEN DEST[DI/EDI] ← SRC[7:0] ELSE (* Memory location unchanged *); FI;
IF (MASK[15] = 1)
    THEN DEST[DI/EDI +1] ← SRC[15:8] ELSE (* Memory location unchanged *); FI;
    (* Repeat operation for 3rd through 7th bytes in source operand *)
IF (MASK[63] = 1)
    THEN DEST[DI/EDI +7] ← SRC[63:56] ELSE (* Memory location unchanged *); FI;

Intel C/C++ Compiler Intrinsic Equivalent

void _mm_maskmove_si64(__m64 d, __m64 n, char * p)

Protected Mode Exceptions

#GP(0)             For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments (even if mask is all 0s).
                   If the destination operand is in a nonwritable segment.
                   If the DS, ES, FS, or GS register contains a NULL segment selector.
#SS(0)             For an illegal address in the SS segment (even if mask is all 0s).
#PF(fault-code)    For a page fault (implementation specific).
#NM                If CR0.TS[bit 3] = 1.
#MF                If there is a pending FPU exception.
#UD                If CR0.EM[bit 2] = 1.
                   If CPUID.01H:EDX.SSE[bit 25] = 0.
                   If the Mod field of the ModR/M byte is not 11B.
                   If the LOCK prefix is used.
#AC(0)             If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.

Real-Address Mode Exceptions

#GP                If any part of the operand lies outside the effective address space from 0 to FFFFH (even if mask is all 0s).
#NM                If CR0.TS[bit 3] = 1.
#MF                If there is a pending FPU exception.
#UD                If CR0.EM[bit 2] = 1.
                   If CR4.OSFXSR[bit 9] = 0.
                   If CPUID.01H:EDX.SSE[bit 25] = 0.
                   If the LOCK prefix is used.

Virtual-8086 Mode Exceptions

Same exceptions as in real address mode.
#PF(fault-code)    For a page fault (implementation specific).
#AC(0)             If alignment checking is enabled and an unaligned memory reference is made.

Compatibility Mode Exceptions

Same exceptions as in protected mode.

64-Bit Mode Exceptions

#GP(0)             If the memory address is in a non-canonical form.
#SS(0)             If a memory address referencing the SS segment is in a non-canonical form.
#PF(fault-code)    For a page fault (implementation specific).
#NM                If CR0.TS[bit 3] = 1.
#MF                If there is a pending FPU exception.
#UD                If CR0.EM[bit 2] = 1.
                   If CR4.OSFXSR[bit 9] = 0.
                   If CPUID.01H:EDX.SSE[bit 25] = 0.
                   If the Mod field of the ModR/M byte is not 11B.
                   If the LOCK prefix is used.
#AC(0)             If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.
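Several of the #UD conditions listed above depend on CPUID.01H:EDX feature bits (bit 25 for SSE, bit 26 for SSE2). As a hedged sketch, user code can test those two bits before issuing either instruction. The helper names below are hypothetical, and `__get_cpuid` from `<cpuid.h>` is assumed to be available, which holds for GCC and Clang on x86 but not for every toolchain. The CR0 and CR4 conditions are controlled by the operating system and cannot be read from user mode, so they are not covered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>  /* __get_cpuid (GCC/Clang on x86 only) */
#define CAN_QUERY_CPUID 1
#else
#define CAN_QUERY_CPUID 0
#endif

/* Decode the feature flags named in the exception lists from a
 * CPUID.01H:EDX value. */
static int edx_has_sse(uint32_t edx)  { return (int)((edx >> 25) & 1u); }
static int edx_has_sse2(uint32_t edx) { return (int)((edx >> 26) & 1u); }

/* Hypothetical helper: reads CPUID leaf 01H and stores EDX in *edx_out.
 * Returns 1 on success, 0 if the leaf cannot be queried in this build
 * or on this processor (*edx_out is then left untouched). */
static int read_cpuid01_edx(uint32_t *edx_out)
{
    assert(edx_out != NULL);
#if CAN_QUERY_CPUID
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        return 0;
    }
    *edx_out = edx;
    return 1;
#else
    return 0;
#endif
}
```

A caller would typically use MASKMOVDQU only when `read_cpuid01_edx` succeeds and `edx_has_sse2` returns 1, and fall back to an ordinary byte loop otherwise.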