Volume 3 General-Purpose and System Instructions (794097), page 35
AMD64 Technology, 24594, Rev. 3.13, July 2007

PREFETCHlevel
Prefetch Data to Cache Level level

Loads a cache line from the specified memory address into the data-cache level specified by the locality reference bits 5–3 of the ModRM byte. Table 3-3 on page 195 lists the locality reference options for the instruction.

This instruction loads a cache line even if the mem8 address is not aligned with the start of the line. If the cache line is already contained in a cache level that is lower than the specified locality reference, or if a memory fault is detected, a bus cycle is not initiated and the instruction is treated as a NOP.

The operation of this instruction is implementation-dependent. The processor implementation can ignore or change this instruction. The size of the cache line also depends on the implementation, with a minimum size of 32 bytes. AMD processors alias PREFETCHT1 and PREFETCHT2 to PREFETCHT0. For details on the use of this instruction, see the software-optimization documentation relating to particular hardware implementations.

Mnemonic            Opcode     Description
PREFETCHNTA mem8    0F 18 /0   Move data closer to the processor using the NTA reference.
PREFETCHT0 mem8     0F 18 /1   Move data closer to the processor using the T0 reference.
PREFETCHT1 mem8     0F 18 /2   Move data closer to the processor using the T1 reference.
PREFETCHT2 mem8     0F 18 /3   Move data closer to the processor using the T2 reference.

Table 3-3. Locality References for the Prefetch Instructions

Locality Reference   Description
NTA                  Non-Temporal Access: Move the specified data into the processor with
                     minimum cache pollution. This is intended for data that will be used only
                     once, rather than repeatedly. The specific technique for minimizing cache
                     pollution is implementation-dependent and may include such techniques as
                     allocating space in a software-invisible buffer, allocating a cache line in
                     only a single way, etc. For details, see the software-optimization
                     documentation for a particular hardware implementation.
T0                   All Cache Levels: Move the specified data into all cache levels.
T1                   Level 2 and Higher: Move the specified data into all cache levels except
                     0th level (L1) cache.
T2                   Level 3 and Higher: Move the specified data into all cache levels except
                     0th level (L1) and 1st level (L2) caches.

Related Instructions
PREFETCH, PREFETCHW

rFLAGS Affected
None

Exceptions
None
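The PREFETCHlevel encoding selects its locality reference through bits 5–3 of the ModRM byte. The following is a minimal Python sketch of that decode, not processor behavior: the names `LOCALITY_BY_REG` and `prefetch_hint` are hypothetical, and field values /4 through /7 are left unmapped because the opcode table does not list them.

```python
# Illustrative decode of the locality reference for opcode 0F 18 /r.
# The mapping follows the opcode table (/0 = NTA ... /3 = T2).
# All names here are hypothetical and exist only for this sketch.
LOCALITY_BY_REG = {0: "NTA", 1: "T0", 2: "T1", 3: "T2"}


def prefetch_hint(modrm: int):
    """Return the locality reference selected by ModRM bits 5-3, or None if unlisted."""
    reg = (modrm >> 3) & 0b111  # bits 5-3 of the ModRM byte
    return LOCALITY_BY_REG.get(reg)
```

For example, a ModRM byte of 0x0D (mod = 00, reg = 001, r/m = 101) selects the T0 reference.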
PUSH
Push onto Stack

Decrements the stack pointer and then copies the specified immediate value or the value in the specified register or memory location to the top of the stack (the memory location pointed to by SS:rSP).

The operand-size attribute determines the number of bytes pushed to the stack. The stack-size attribute determines whether SP, ESP, or RSP is the stack pointer. The address-size attribute is used only to locate the memory operand when pushing a memory operand to the stack.

If the instruction pushes the stack pointer (rSP), the resulting value on the stack is that of rSP before execution of the instruction.

There is a PUSH CS instruction but no corresponding POP CS. The RET (Far) instruction pops a value from the top of stack into the CS register as part of its operation.

In 64-bit mode, the operand size of all PUSH instructions defaults to 64 bits, and there is no prefix available to encode a 32-bit operand size. Using the PUSH CS, PUSH DS, PUSH ES, or PUSH SS instructions in 64-bit mode generates an invalid-opcode exception.

Pushing an odd number of 16-bit operands when the stack address-size attribute is 32 results in a misaligned stack pointer.

Mnemonic          Opcode    Description
PUSH reg/mem16    FF /6     Push the contents of a 16-bit register or memory operand onto the stack.
PUSH reg/mem32    FF /6     Push the contents of a 32-bit register or memory operand onto the stack. (No prefix for encoding this in 64-bit mode.)
PUSH reg/mem64    FF /6     Push the contents of a 64-bit register or memory operand onto the stack.
PUSH reg16        50 +rw    Push the contents of a 16-bit register onto the stack.
PUSH reg32        50 +rd    Push the contents of a 32-bit register onto the stack. (No prefix for encoding this in 64-bit mode.)
PUSH reg64        50 +rq    Push the contents of a 64-bit register onto the stack.
PUSH imm8         6A ib     Push an 8-bit immediate value (sign-extended to 16, 32, or 64 bits) onto the stack.
PUSH imm16        68 iw     Push a 16-bit immediate value onto the stack.
PUSH imm32        68 id     Push a 32-bit immediate value onto the stack. (No prefix for encoding this in 64-bit mode.)
PUSH imm64        68 id     Push a sign-extended 32-bit immediate value onto the stack.
PUSH CS           0E        Push the CS selector onto the stack. (Invalid in 64-bit mode.)
PUSH SS           16        Push the SS selector onto the stack. (Invalid in 64-bit mode.)
PUSH DS           1E        Push the DS selector onto the stack. (Invalid in 64-bit mode.)
PUSH ES           06        Push the ES selector onto the stack. (Invalid in 64-bit mode.)
PUSH FS           0F A0     Push the FS selector onto the stack.
PUSH GS           0F A8     Push the GS selector onto the stack.

Related Instructions
POP

rFLAGS Affected
None

Exceptions

Exception                 Real   Virtual 8086   Protected   Cause of Exception
Invalid opcode, #UD                             X           PUSH CS, PUSH DS, PUSH ES, or PUSH SS was executed in 64-bit mode.
Stack, #SS                X      X              X           A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP   X      X              X           A memory address exceeded a data segment limit or was non-canonical.
                                                X           A null data segment was used to reference memory.
Page fault, #PF                  X              X           A page fault resulted from the execution of the instruction.
Alignment check, #AC             X              X           An unaligned memory reference was performed while alignment checking was enabled.

PUSHA
PUSHAD
Push All GPRs onto Stack

Pushes the contents of the eAX, eCX, eDX, eBX, eSP (original value), eBP, eSI, and eDI general-purpose registers onto the stack in that order. This instruction decrements the stack pointer by 16 or 32 depending on operand size.

Using the PUSHA or PUSHAD instruction in 64-bit mode generates an invalid-opcode exception.

Mnemonic   Opcode   Description
PUSHA      60       Push the contents of the AX, CX, DX, BX, original SP, BP, SI, and DI registers onto the stack. (Invalid in 64-bit mode.)
PUSHAD     60       Push the contents of the EAX, ECX, EDX, EBX, original ESP, EBP, ESI, and EDI registers onto the stack. (Invalid in 64-bit mode.)

Related Instructions
POPA, POPAD

rFLAGS Affected
None

Exceptions

Exception                 Real   Virtual 8086   Protected   Cause of Exception
Invalid opcode, #UD                             X           This instruction was executed in 64-bit mode.
Stack, #SS                X      X              X           A memory address exceeded the stack segment limit.
Page fault, #PF                  X              X           A page fault resulted from the execution of the instruction.
Alignment check, #AC             X              X           An unaligned memory reference was performed while alignment checking was enabled.
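The ordering rules for PUSH and PUSHA/PUSHAD (decrement first, then store; rSP is pushed with its pre-instruction value; PUSHA moves the stack pointer by 16 or 32) can be sketched with a toy model. This is an illustration under simplifying assumptions, not an emulator: the class `ToyStack`, the helpers `push_sp` and `pusha`, and the register-dictionary keys are hypothetical names, and segmentation, the stack-size attribute, and all exceptions are ignored.

```python
class ToyStack:
    """Toy model of a downward-growing stack; memory is a dict of byte addresses."""

    def __init__(self, sp: int):
        self.sp = sp
        self.mem = {}

    def push(self, value: int, size: int) -> None:
        """Decrement the stack pointer by `size` bytes, then store `value` little-endian."""
        self.sp -= size
        data = (value & ((1 << (8 * size)) - 1)).to_bytes(size, "little")
        for offset, byte in enumerate(data):
            self.mem[self.sp + offset] = byte

    def read(self, addr: int, size: int) -> int:
        """Read back `size` bytes starting at `addr` as a little-endian integer."""
        return int.from_bytes(bytes(self.mem[addr + i] for i in range(size)), "little")


def push_sp(stack: ToyStack, size: int) -> None:
    """PUSH rSP: the stored value is rSP as it was before the instruction executed."""
    old_sp = stack.sp
    stack.push(old_sp, size)


def pusha(stack: ToyStack, regs: dict, size: int) -> None:
    """PUSHA/PUSHAD: push eAX, eCX, eDX, eBX, original eSP, eBP, eSI, eDI in that order."""
    original_sp = stack.sp
    for name in ("ax", "cx", "dx", "bx"):
        stack.push(regs[name], size)
    stack.push(original_sp, size)
    for name in ("bp", "si", "di"):
        stack.push(regs[name], size)
```

With `size=2` the model lowers the stack pointer by 16 and with `size=4` by 32, matching the two operand sizes described for PUSHA and PUSHAD.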
PUSHF
PUSHFD
PUSHFQ
Push rFLAGS onto Stack

Decrements the rSP register and copies the rFLAGS register (except for the VM and RF flags) onto the stack. The instruction clears the VM and RF flags in the rFLAGS image before putting it on the stack.

The instruction pushes 2, 4, or 8 bytes, depending on the operand size.

In 64-bit mode, this instruction defaults to a 64-bit operand size and there is no prefix available to encode a 32-bit operand size.

In virtual-8086 mode, if system software has set the IOPL field to a value less than 3, a general-protection exception occurs if application software attempts to execute PUSHFx or POPFx while VME is not enabled or the operand size is not 16-bit.

Mnemonic   Opcode   Description
PUSHF      9C       Push the FLAGS word onto the stack.
PUSHFD     9C       Push the EFLAGS doubleword onto the stack. (No prefix for encoding this in 64-bit mode.)
PUSHFQ     9C       Push the RFLAGS quadword onto the stack.

Action

    // See "Pseudocode Definitions" on page 41.

    PUSHF_START:
    IF (REAL_MODE)
        PUSHF_REAL
    ELSIF (PROTECTED_MODE)
        PUSHF_PROTECTED
    ELSE // (VIRTUAL_MODE)
        PUSHF_VIRTUAL

    PUSHF_REAL:
        PUSH.v old_RFLAGS        // Pushed with RF and VM cleared.
        EXIT

    PUSHF_PROTECTED:
        PUSH.v old_RFLAGS        // Pushed with RF cleared.
        EXIT

    PUSHF_VIRTUAL:
        IF (RFLAGS.IOPL=3)
        {
            PUSH.v old_RFLAGS    // Pushed with RF,VM cleared.
            EXIT
        }
        ELSIF ((CR4.VME=1) && (OPERAND_SIZE=16))
        {
            PUSH.v old_RFLAGS    // Pushed with VIF in the IF position.
                                 // Pushed with IOPL=3.
            EXIT
        }
        ELSE // ((RFLAGS.IOPL<3) && ((CR4.VME=0) || (OPERAND_SIZE!=16)))
            EXCEPTION [#GP(0)]

Related Instructions
POPF, POPFD, POPFQ

rFLAGS Affected
None

Exceptions

Exception                 Real   Virtual 8086   Protected   Cause of Exception
Stack, #SS                X      X              X           A memory address exceeded the stack segment limit or was non-canonical.
General protection, #GP          X                          The I/O privilege level was less than 3 and either VME was not enabled or the operand size was not 16-bit.
Page fault, #PF                  X              X           A page fault resulted from the execution of the instruction.
Alignment check, #AC             X              X           An unaligned memory reference was performed while alignment checking was enabled.
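The PUSHFx action pseudocode can be mirrored in a small Python sketch that computes the flags image placed on the stack. This is an illustrative model only: the function `pushf_image`, the mode labels "real", "protected", and "virtual", and the `GeneralProtection` class are hypothetical names, while the bit positions are the architectural rFLAGS positions (IF = bit 9, IOPL = bits 13–12, RF = bit 16, VM = bit 17, VIF = bit 19). As a simplification it clears both RF and VM in every mode, following the prose description of the instruction.

```python
# Architectural rFLAGS bit positions; the constant names are local to this sketch.
IF_BIT = 1 << 9
IOPL_SHIFT = 12
IOPL_MASK = 0b11 << IOPL_SHIFT
RF_BIT = 1 << 16
VM_BIT = 1 << 17
VIF_BIT = 1 << 19


class GeneralProtection(Exception):
    """Stands in for #GP(0) in this sketch."""


def pushf_image(rflags: int, mode: str, operand_size: int, cr4_vme: bool = False) -> int:
    """Return the flags image PUSHFx would store, or raise GeneralProtection.

    mode is one of "real", "protected", "virtual" (hypothetical labels);
    operand_size is 16, 32, or 64 bits.
    """
    image = rflags & ~(RF_BIT | VM_BIT)  # RF and VM are cleared in the pushed image
    if mode == "virtual":
        iopl = (rflags & IOPL_MASK) >> IOPL_SHIFT
        if iopl < 3:
            if not (cr4_vme and operand_size == 16):
                raise GeneralProtection("#GP(0)")
            # VME path: VIF appears in the IF position and IOPL reads as 3.
            image &= ~IF_BIT
            if rflags & VIF_BIT:
                image |= IF_BIT
            image |= IOPL_MASK
    return image & ((1 << operand_size) - 1)  # only operand-size bits are stored
```

The single `raise` covers the final ELSE branch of the pseudocode: IOPL below 3 combined with either VME disabled or an operand size other than 16 bits.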
RCL
Rotate Through Carry Left

Rotates the bits of a register or memory location (first operand) to the left (more significant bit positions) and through the carry flag by the number of bit positions in an unsigned immediate value or the CL register (second operand). The bits rotated through the carry flag are rotated back in at the right end (lsb) of the first operand location.

The processor masks the upper three bits of the count operand, thus restricting the count to a number between 0 and 31. When the destination is 64 bits wide, the processor masks the upper two bits of the count, providing a count in the range of 0 to 63.

For 1-bit rotates, the instruction sets the OF flag to the exclusive OR of the CF bit (after the rotate) and the most significant bit of the result.
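The rotate, count masking, and 1-bit OF rule can be sketched in Python as follows. This is a minimal model rather than a statement of how hardware implements the instruction: the function name `rcl` and its parameters are hypothetical, `cf` must be 0 or 1, and OF is returned as `None` for any masked count other than 1 because the text defines it only for 1-bit rotates.

```python
def rcl(value: int, count: int, cf: int, width: int):
    """Model RCL: rotate `value` left through the carry flag.

    Returns (result, cf, of). `of` is None unless the masked count is 1.
    """
    count &= 0x3F if width == 64 else 0x1F  # 6-bit count for 64-bit operands, else 5-bit
    msb = 1 << (width - 1)
    mask = (1 << width) - 1
    for _ in range(count):
        carry_out = 1 if value & msb else 0
        value = ((value << 1) & mask) | cf  # the old CF enters at the lsb
        cf = carry_out                      # the old msb becomes the new CF
    of = (cf ^ (1 if value & msb else 0)) if count == 1 else None
    return value, cf, of
```

Rotating an 8-bit operand by 9 positions returns the original value and carry, since the operand and CF together form a 9-bit ring.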