Volume 1 Application Programming (794095), page 23
By contrast, memory-mapped I/O uses the main-memory address space and is accessed using the MOV instructions rather than the I/O instructions.

[Page 63, General-Purpose Programming. AMD64 Technology, 24592, Rev. 3.13, July 2007]

When operating in legacy protected mode or in long mode, the RFLAGS register’s I/O privilege level (IOPL) field and the I/O-permission bitmap in the current task-state segment (TSS) are used to control access to the I/O addresses (called I/O ports). See “Input/Output” on page 90 for further information.

General I/O

• IN - Input from Port
• OUT - Output to Port

The IN instruction reads a byte, word, or doubleword from the I/O port address specified by the source operand, and loads it into the accumulator register (AL or eAX). The source operand can be an immediate byte or the DX register.

The OUT instruction writes a byte, word, or doubleword from the accumulator register (AL or eAX) to the I/O port address specified by the destination operand, which can be either an immediate byte or the DX register.

If the I/O port address is specified with an immediate operand, the range of port addresses accessible by the IN and OUT instructions is limited to ports 0 through 255.
If the I/O port address is specified by a value in the DX register, all 65,536 ports are accessible.

String I/O

• INS - Input String
• INSB - Input String Byte
• INSW - Input String Word
• INSD - Input String Doubleword
• OUTS - Output String
• OUTSB - Output String Byte
• OUTSW - Output String Word
• OUTSD - Output String Doubleword

The INSx instructions (INSB, INSW, INSD) read a byte, word, or doubleword from the I/O port specified by the DX register, and load it into the memory location specified by ES:[rDI].

The OUTSx instructions (OUTSB, OUTSW, OUTSD) write a byte, word, or doubleword from an implicit memory location specified by seg:[rSI], to the I/O port address stored in the DX register.

The INSx and OUTSx instructions are commonly used with a repeat prefix to transfer blocks of data. The I/O port address in DX is not incremented or decremented between iterations; only the memory pointer (rDI or rSI) advances, in the direction selected by the DF flag.
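The repeated string-input behavior described above can be modeled in plain C. This is only a software sketch, not real port I/O: the `fake_port` type, `port_read8`, and `model_rep_insb` are hypothetical names introduced for illustration, and the port is simulated by a byte stream. It assumes the usual string-instruction convention that the memory pointer moves up when DF is 0 and down when DF is 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a device data port: each read returns the
   next byte of a fixed stream, as a FIFO-style peripheral might. */
typedef struct {
    const uint8_t *stream;
    size_t pos;
} fake_port;

uint8_t port_read8(fake_port *p) {
    return p->stream[p->pos++];
}

/* Software model of REP INSB. The port (DX in the real instruction) stays
   the same on every iteration; only the memory pointer (rDI) moves, upward
   if df == 0 and downward if df == 1. Returns the final rDI value. */
uint8_t *model_rep_insb(fake_port *port, uint8_t *rdi, size_t rcx, int df) {
    while (rcx != 0) {
        *rdi = port_read8(port);  /* one INSB: port -> memory at rDI */
        rdi += df ? -1 : 1;       /* byte-sized step in the DF direction */
        rcx--;                    /* repeat prefix counts rCX down */
    }
    return rdi;
}

/* Usage: pull four bytes from the simulated port into a buffer. */
void demo_rep_insb(void) {
    static const uint8_t stream[] = {0x10, 0x20, 0x30, 0x40};
    fake_port port = {stream, 0};
    uint8_t buf[4] = {0};
    uint8_t *end = model_rep_insb(&port, buf, 4, 0);
    assert(end == buf + 4);                    /* rDI advanced by 4 */
    assert(buf[0] == 0x10 && buf[3] == 0x40);  /* stream stored in order */
}
```

Real IN, OUT, INSx, and OUTSx accesses are subject to the IOPL and I/O-permission checks mentioned earlier, which is why the sketch simulates the device instead of touching a port.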
Repeated string I/O of this kind is intended for peripheral I/O devices that are expecting a stream of data.

3.3.14 Semaphores

The semaphore instructions support the implementation of reliable signaling between processors in a multi-processing environment, usually for the purpose of sharing resources.

• CMPXCHG - Compare and Exchange
• CMPXCHG8B - Compare and Exchange Eight Bytes
• CMPXCHG16B - Compare and Exchange Sixteen Bytes
• XADD - Exchange and Add
• XCHG - Exchange

The CMPXCHG instruction compares a value in the AL or rAX register with the first (destination) operand, and sets the arithmetic flags (ZF, OF, SF, AF, CF, PF) according to the result. If the compared values are equal, the source operand is loaded into the destination operand. If they are not equal, the first operand is loaded into the accumulator.
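The compare-and-exchange behavior described above maps closely onto the C11 `atomic_compare_exchange_strong` function, so it can be sketched in portable C. The names `sem_try_acquire`, `sem_release`, `SEM_FREE`, and `SEM_BUSY` are hypothetical, chosen for this illustration. On x86-64, compilers typically implement this call with a LOCK-prefixed CMPXCHG, although that is an implementation detail and not something the C standard guarantees.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum { SEM_FREE = 0, SEM_BUSY = 1 };

/* Try once to take the semaphore. The `expected` variable plays the role of
   the accumulator: it is compared with *sem; if they are equal, SEM_BUSY is
   stored into *sem, otherwise the current value of *sem is copied back into
   `expected`. */
bool sem_try_acquire(atomic_int *sem, int *observed) {
    int expected = SEM_FREE;
    bool taken = atomic_compare_exchange_strong(sem, &expected, SEM_BUSY);
    *observed = expected;  /* SEM_FREE on success, current value on failure */
    return taken;
}

void sem_release(atomic_int *sem) {
    atomic_store(sem, SEM_FREE);
}

/* Usage: the first attempt succeeds, the second sees the busy state. */
void demo_semaphore(void) {
    atomic_int sem = SEM_FREE;
    int seen = -1;
    assert(sem_try_acquire(&sem, &seen) && seen == SEM_FREE);
    assert(!sem_try_acquire(&sem, &seen) && seen == SEM_BUSY);
    sem_release(&sem);
    assert(sem_try_acquire(&sem, &seen));
}
```

A caller that needs to block would retry `sem_try_acquire` in a loop; the sketch deliberately shows only the single test-and-load step.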
CMPXCHG can be used to try to intercept a semaphore, that is, to test whether its state is free and, if so, to load a new value into the semaphore, making its state busy. The test and load are performed atomically, so that concurrent processes or threads which use the semaphore to access a shared object will not conflict.

The CMPXCHG8B instruction compares the 64-bit value in the EDX:EAX registers with a 64-bit memory location. If the values are equal, the zero flag (ZF) is set, and the ECX:EBX value is copied to the memory location. Otherwise, the ZF flag is cleared, and the memory value is copied to EDX:EAX.

The CMPXCHG16B instruction compares the 128-bit value in the RDX:RAX registers with a 128-bit memory location.
If the values are equal, the zero flag (ZF) is set, and the RCX:RBX value is copied to the memory location. Otherwise, the ZF flag is cleared, and the memory value is copied to RDX:RAX.

The XADD instruction exchanges the values of its two operands, then stores their sum in the first (destination) operand.

A LOCK prefix can be used to make the CMPXCHG, CMPXCHG8B and XADD instructions atomic if one of the operands is a memory location.

The XCHG instruction exchanges the values of its two operands.
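The exchange-and-add and exchange operations above can be sketched in C as follows. `model_xadd` is a plain, non-atomic model of the register-level effect, while `take_ticket` and `swap_in` use the C11 `atomic_fetch_add` and `atomic_exchange` functions, which x86 compilers commonly implement with a LOCK-prefixed XADD and with XCHG respectively. All three function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Plain (non-atomic) model of XADD dest, src: the operands are exchanged,
   then their sum is stored in dest. The net effect is shown directly. */
void model_xadd(uint32_t *dest, uint32_t *src) {
    uint32_t old_dest = *dest;
    *dest = old_dest + *src;  /* destination receives the sum */
    *src = old_dest;          /* source receives the old destination */
}

/* Atomic counterpart: atomic_fetch_add returns the value held before the
   addition, just as XADD leaves the old destination in the source. */
uint32_t take_ticket(_Atomic uint32_t *next_ticket) {
    return atomic_fetch_add(next_ticket, 1u);
}

/* Atomic swap: stores `value` and returns what was there before,
   mirroring XCHG with a memory operand. */
uint32_t swap_in(_Atomic uint32_t *slot, uint32_t value) {
    return atomic_exchange(slot, value);
}

/* Usage: hand out two consecutive ticket numbers from a shared counter. */
void demo_xadd(void) {
    uint32_t d = 5, s = 3;
    model_xadd(&d, &s);
    assert(d == 8 && s == 5);

    _Atomic uint32_t next = 0;
    assert(take_ticket(&next) == 0);
    assert(take_ticket(&next) == 1);
}
```

Returning the old value is what makes fetch-and-add useful for ticket counters: each caller receives a distinct number even when several run concurrently.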
When one of the XCHG operands is in memory, the processor’s bus-locking mechanism is engaged automatically during the exchange, whether or not the LOCK prefix is used.

3.3.15 Processor Information

• CPUID - Processor Identification

The CPUID instruction returns information about the processor implementation and its support for instruction subsets and architectural features. Software operating at any privilege level can execute the CPUID instruction to read this information.
After the information is read, software can select procedures that optimize performance for a particular hardware implementation.

Some processor implementations may not support the CPUID instruction. Support for the CPUID instruction is determined by testing the RFLAGS.ID bit. If software can write this bit, then the CPUID instruction is supported by the processor implementation. Otherwise, execution of CPUID results in an invalid-opcode exception.

See “Feature Detection” on page 74 for details about using the CPUID instruction.
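As a minimal sketch of reading processor information from C, the following uses the `__get_cpuid` helper from the `<cpuid.h>` header shipped with GCC and Clang. That header is a compiler facility, not part of the manual, and the function name `cpu_vendor` is hypothetical. The helper checks that the requested function is supported before executing CPUID (on 32-bit targets this includes the flags ID-bit test described above). On other compilers or architectures the sketch simply reports that CPUID is unavailable.

```c
#include <assert.h>
#include <string.h>

#if (defined(__x86_64__) || defined(__i386__)) && \
    (defined(__GNUC__) || defined(__clang__))
#include <cpuid.h>
#define HAVE_CPUID_H 1
#else
#define HAVE_CPUID_H 0
#endif

/* Returns 1 and fills vendor[] with the 12-character vendor string if CPUID
   is available; otherwise returns 0 and stores an empty string. */
int cpu_vendor(char vendor[13]) {
    vendor[0] = '\0';
#if HAVE_CPUID_H
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx))
        return 0;                 /* CPUID function 0 not available */
    memcpy(vendor + 0, &ebx, 4);  /* function 0 returns the vendor string */
    memcpy(vendor + 4, &edx, 4);  /* in EBX, EDX, ECX order */
    memcpy(vendor + 8, &ecx, 4);
    vendor[12] = '\0';
    return 1;
#else
    return 0;
#endif
}

/* Usage: software would branch on the result to select a code path. */
void demo_cpuid(void) {
    char vendor[13];
    int ok = cpu_vendor(vendor);
    assert(ok == 0 || ok == 1);
    assert(ok || vendor[0] == '\0');
}
```

Feature bits for instruction subsets come from other CPUID functions, which this sketch does not query.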
For a full description of the CPUID instruction and its function codes, see “CPUID” in Volume 3 and the AMD CPUID Specification, order# 25481.

3.3.16 Cache and Memory Management

Applications can use the cache and memory-management instructions to control memory reads and writes and to influence the caching of read/write data. “Memory Optimization” on page 92 describes how these instructions interact with the memory subsystem.

• LFENCE - Load Fence
• SFENCE - Store Fence
• MFENCE - Memory Fence
• PREFETCHlevel - Prefetch Data to Cache Level level
• PREFETCH - Prefetch L1 Data-Cache Line
• PREFETCHW - Prefetch L1 Data-Cache Line for Write
• CLFLUSH - Cache Line Invalidate

The LFENCE, SFENCE, and MFENCE instructions can be used to force ordering on memory accesses. The order of memory accesses can be important when the reads and writes are to a memory-mapped I/O device, and in multiprocessor environments where memory synchronization is required. LFENCE affects ordering on memory reads, but not writes.
SFENCE affects ordering on memory writes, but not reads. MFENCE orders both memory reads and writes. These instructions do not take operands. They are simply inserted between the memory references that are to be ordered. For details about the fence instructions, see “Forcing Memory Order” on page 94.

The PREFETCHlevel, PREFETCH, and PREFETCHW instructions load data from memory into one or more cache levels. PREFETCHlevel loads a memory block into a specified level in the data-cache hierarchy (including a non-temporal caching level). The size of the memory block is implementation-dependent. PREFETCH loads a cache line into the L1 data cache. PREFETCHW loads a cache line into the L1 data cache and sets the cache line’s memory-coherency state to modified, in anticipation of subsequent data writes to that line.
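The placement of fences between memory references, and the use of a prefetch ahead of a read, can be sketched with the SSE/SSE2 compiler intrinsics `_mm_sfence`, `_mm_lfence`, and `_mm_prefetch` from `<emmintrin.h>`. These intrinsics are a compiler interface, not something the manual defines. The `mailbox` type, the function names, and the prefetch distance of 16 elements are assumptions made for illustration. On targets without SSE2 the macros fall back to no-ops so that the sketch still compiles, which means it orders nothing there.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* also pulls in the SSE header with _mm_prefetch */
#define STORE_FENCE() _mm_sfence()
#define LOAD_FENCE()  _mm_lfence()
#define PREFETCH(p)   _mm_prefetch((const char *)(p), _MM_HINT_T0)
#else
/* Fallback for other targets: no-ops, so no ordering is enforced. */
#define STORE_FENCE() ((void)0)
#define LOAD_FENCE()  ((void)0)
#define PREFETCH(p)   ((void)(p))
#endif

/* Hypothetical mailbox shared with a consumer (another thread or a device). */
typedef struct {
    uint32_t payload;
    volatile uint32_t ready;
} mailbox;

void mailbox_publish(mailbox *m, uint32_t value) {
    m->payload = value;
    STORE_FENCE();  /* the payload write is ordered before the ready write */
    m->ready = 1;
}

int mailbox_poll(mailbox *m, uint32_t *out) {
    if (!m->ready)
        return 0;
    LOAD_FENCE();   /* the ready read is ordered before the payload read */
    *out = m->payload;
    return 1;
}

/* Sum an array while asking for data a fixed distance ahead of the read. */
uint64_t sum_with_prefetch(const uint32_t *data, size_t n) {
    enum { AHEAD = 16 };  /* assumed prefetch distance, in elements */
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + AHEAD < n)
            PREFETCH(&data[i + AHEAD]);
        total += data[i];
    }
    return total;
}

/* Usage: single-threaded walk through publish, poll, and a prefetched sum. */
void demo_fences(void) {
    mailbox m = {0, 0};
    uint32_t got = 0;
    assert(mailbox_poll(&m, &got) == 0);
    mailbox_publish(&m, 42);
    assert(mailbox_poll(&m, &got) == 1 && got == 42);
}
```

The hint `_MM_HINT_T0` selects one of the PREFETCHlevel forms; a prefetch is only a hint and never changes program results. Portable multithreaded C code would normally use C11 atomics instead of raw fences, so this sketch only shows where the fences sit relative to the accesses they order.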
PREFETCH and PREFETCHW are both 3DNow!™ instructions. For details about the prefetch instructions, see “Cache-Control Instructions” on page 99. For a description of MOESI memory-coherency states, see “Memory System” in Volume 2.

The CLFLUSH instruction writes unsaved data back to memory for the specified cache line from all processor caches, invalidates the specified cache line, and causes the processor to send a bus cycle which signals external caching devices to write back and invalidate their copies of the cache line. CLFLUSH provides a finer-grained mechanism than the WBINVD instruction, which writes back and invalidates all cache lines. Moreover, CLFLUSH can be used at all privilege levels, unlike WBINVD, which can be used only by system software running at privilege level 0.

3.3.17 No Operation

• NOP - No Operation

The NOP instruction performs no operation (except incrementing the instruction pointer rIP by one). It is an alternative mnemonic for the XCHG rAX, rAX instruction. Depending on the hardware implementation, the NOP instruction may use one or more cycles of processor time.

3.3.18 System Calls

System Call and Return

• SYSENTER - System Call
• SYSEXIT - System Return
• SYSCALL - Fast System Call
• SYSRET - Fast System Return

The SYSENTER and SYSCALL instructions perform a call to a routine running at current privilege level (CPL) 0 (for example, a kernel procedure) from a user-level program (CPL 3). The addresses of the target procedure and (for SYSENTER) the target stack are specified implicitly through the model-specific registers (MSRs). Control returns from the operating system to the caller when the operating system executes a SYSEXIT or SYSRET instruction.
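From an application's point of view, the call-and-return sequence above is normally hidden behind the C library. As an illustration under stated assumptions, the sketch below uses the Linux `syscall()` wrapper with `SYS_getpid`; on x86-64 Linux this wrapper is generally implemented with the SYSCALL instruction, and the kernel returns to the caller when it is done. The operating system, the wrapper, and the function name `raw_getpid` are assumptions for this example and are not described by the manual. On other systems the function returns -1.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  /* exposes syscall() in <unistd.h> on glibc */
#endif
#include <assert.h>

#if defined(__linux__)
#include <sys/syscall.h>
#include <unistd.h>
#endif

/* Ask the kernel for the process ID through the generic system-call
   wrapper. Returns -1 where this Linux-specific path is unavailable. */
long raw_getpid(void) {
#if defined(__linux__)
    return syscall(SYS_getpid);  /* enters the kernel (CPL 0) and returns */
#else
    return -1;
#endif
}

/* Usage: the result is either a valid process ID or the fallback value. */
void demo_syscall(void) {
    long pid = raw_getpid();
    assert(pid == -1 || pid > 0);
}
```

The target address of the kernel entry point is never named in user code, which matches the description above: it is configured by the operating system through MSRs.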
SYSEXIT and SYSRET are privileged instructions and thus can be issued only by a privilege-level-0 procedure.

The SYSENTER and SYSEXIT instructions form a complementary pair, as do SYSCALL and SYSRET. SYSENTER and SYSEXIT are invalid in 64-bit mode. In this case, use the faster SYSCALL and SYSRET instructions.

For details on these and other system-related instructions, see “System-Management Instructions” in Volume 2 and “System Instruction Reference” in Volume 3.

3.4 General Rules for Instructions in 64-Bit Mode

This section provides details of the general-purpose instructions in 64-bit mode, and how they differ from the same instructions in legacy and compatibility modes. The differences apply only to general-purpose instructions.