Volume 1 Basic Architecture (794100), page 79
To optimize performance, the processor allows cacheable memory reads to be reordered ahead of buffered writes in most situations. Internally, processor reads (cache hits) can be reordered around buffered writes. When using memory-mapped I/O, therefore, it is possible that an I/O read might be performed before the memory write of a previous instruction. The recommended method of enforcing program ordering of memory-mapped I/O accesses with the Pentium 4, Intel Xeon, and P6 family processors is to use the MTRRs to make the memory-mapped I/O address space uncacheable; for the Pentium and Intel486 processors, either the KEN# pin or the PCD flags can be used for this purpose (see Section 13.3.1, “Memory-Mapped I/O”).

When the target of a read or write is in an uncacheable region of memory, memory reordering does not occur externally at the processor’s pins (that is, reads and writes appear in-order).
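The reordering described above can be illustrated with a toy model. The Python sketch below does not describe the processor's internals: the class `ToyProcessor`, the single write buffer, and the register addresses `CONTROL_REG` and `STATUS_REG` are all assumptions made for illustration. It only records the order in which accesses would reach the external bus when a write to one device register is followed by a read of another.

```python
from collections import deque

CONTROL_REG = 0xFEC00000  # hypothetical device register addresses
STATUS_REG = 0xFEC00010


class ToyProcessor:
    """Toy model that logs accesses in the order the external bus sees them."""

    def __init__(self, uncacheable_ranges=()):
        # (start, end) address pairs, end exclusive, treated as uncacheable
        self.uncacheable_ranges = list(uncacheable_ranges)
        self.write_buffer = deque()  # buffered writes not yet on the bus
        self.bus_log = []

    def _is_uncacheable(self, addr):
        return any(lo <= addr < hi for lo, hi in self.uncacheable_ranges)

    def drain(self):
        """Send all buffered writes to the bus, oldest first."""
        while self.write_buffer:
            self.bus_log.append(("write", self.write_buffer.popleft()))

    def write(self, addr):
        if self._is_uncacheable(addr):
            self.drain()
            self.bus_log.append(("write", addr))
        else:
            self.write_buffer.append(addr)  # the write waits in the buffer

    def read(self, addr):
        if self._is_uncacheable(addr):
            self.drain()  # earlier writes go out before the read
        # A cacheable read is allowed to pass writes still in the buffer.
        self.bus_log.append(("read", addr))


def run(uncacheable_ranges):
    """Write the control register, then read the status register."""
    cpu = ToyProcessor(uncacheable_ranges)
    cpu.write(CONTROL_REG)
    cpu.read(STATUS_REG)
    cpu.drain()
    return cpu.bus_log
```

In this model, `run(())` logs the read before the write, while `run([(0xFEC00000, 0xFEC01000)])` logs the two accesses in program order.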
Designating a memory-mapped I/O region of the address space as uncacheable ensures that reads and writes of I/O devices are carried out in program order. See Chapter 10, “Memory Cache Control,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A, for more information on using MTRRs.

Another method of enforcing program order is to insert one of the serializing instructions, such as the CPUID instruction, between operations. See Chapter 7, “Multiple-Processor Management,” in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A, for more information on serialization of instructions.

Note that the chip set being used to support the processor (bus controller, memory controller, and/or I/O controller) may post writes to uncacheable memory, which can lead to out-of-order execution of memory accesses.
In situations where out-of-order processing of memory accesses by the chip set can potentially cause faulty memory-mapped I/O processing, code must be written to force synchronization and ordering of I/O operations. Serializing instructions can often be used for this purpose.

When the I/O address space is used instead of memory-mapped I/O, the situation is different in two respects:

• The processor never buffers I/O writes.
Therefore, strict ordering of I/O operations is enforced by the processor. (As with memory-mapped I/O, it is possible for a chip set to post writes in certain I/O ranges.)
• The processor synchronizes I/O instruction execution with external bus activity (see Table 13-1).

Table 13-1. I/O Instruction Serialization

Instruction Being    Processor Delays Execution of …          Until Completion of …
Executed             Current Instruction?  Next Instruction?  Pending Stores?  Current Store?
IN                   Yes                                      Yes
INS                  Yes                                      Yes
REP INS              Yes                                      Yes
OUT                                        Yes                Yes              Yes
OUTS                                       Yes                Yes              Yes
REP OUTS                                   Yes                Yes              Yes

CHAPTER 14
PROCESSOR IDENTIFICATION AND FEATURE DETERMINATION

When writing software intended to run on IA-32 processors, it is necessary to identify the type of processor present in a system and the processor features that are available to an application.

14.1 USING THE CPUID INSTRUCTION

Use the CPUID instruction for processor identification in the Pentium M processor family, Pentium 4 processor family, Intel Xeon processor family, P6 family, Pentium processor, and later Intel486 processors. This instruction returns the family, model, and (for some processors) a brand string for the processor that executes the instruction. It also indicates the features that are present in the processor and gives information about the processor’s caches and TLB.

The ID flag (bit 21) in the EFLAGS register indicates support for the CPUID instruction.
If a software procedure can set and clear this flag, the processor executing the procedure supports the CPUID instruction. The CPUID instruction will cause the invalid opcode exception (#UD) if executed on a processor that does not support it.

To obtain processor identification information, a source operand value is placed in the EAX register to select the type of information to be returned.
When the CPUID instruction is executed, selected information is returned in the EAX, EBX, ECX, and EDX registers. For a complete description of the CPUID instruction, tables indicating values returned, and example code, see “CPUID—CPU Identification” in Chapter 3 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2A.

14.1.1 Notes on Where to Start

For detailed application notes on the instruction, see AP-485, Intel Processor Identification and the CPUID Instruction (Order Number 241618). This publication provides additional information and example source code for use in identifying IA-32 processors.
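As an illustration of how the values returned in EAX, EBX, ECX, and EDX are interpreted, the following Python sketch decodes two sets of register values without executing CPUID itself. The function names `vendor_string` and `decode_signature` are hypothetical. The field layout of the processor signature (returned in EAX when CPUID is executed with EAX equal to 1) is taken from the CPUID instruction reference, not from this section, so treat it as an assumption to check against that reference.

```python
import struct


def vendor_string(ebx, edx, ecx):
    """Assemble the 12-character vendor string from the registers
    returned when CPUID is executed with EAX equal to 0.
    The characters are stored in the order EBX, EDX, ECX."""
    return struct.pack("<III", ebx, edx, ecx).decode("ascii")


def decode_signature(eax):
    """Split the processor signature into family, model, and stepping."""
    stepping = eax & 0xF
    model = (eax >> 4) & 0xF
    family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    # The extended fields contribute only for certain base family values.
    display_family = family + ext_family if family == 0xF else family
    if family in (0x6, 0xF):
        display_model = (ext_model << 4) + model
    else:
        display_model = model
    return {"family": display_family, "model": display_model, "stepping": stepping}


# Register values that spell "GenuineIntel" in little-endian byte order.
print(vendor_string(0x756E6547, 0x49656E69, 0x6C65746E))
print(decode_signature(0x00000F29))
```

The register values are passed in as plain integers so that the decoding logic can be exercised on any machine; obtaining them from real hardware requires executing CPUID from native code.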
AP-485 also contains guidelines for using the CPUID instruction to help maintain the widest range of software compatibility. The following guidelines are among the most important, and should always be followed when using the CPUID instruction to determine available features:

• Always begin by testing for the “GenuineIntel” message in the EBX, EDX, and ECX registers when the CPUID instruction is executed with EAX equal to 0. If the processor is not genuine Intel, the feature identification flags may have different meanings than are described in Intel documentation.
• Test feature identification flags individually and do not make assumptions about undefined bits.

14.1.2 Identification of Earlier IA-32 Processors

The CPUID instruction is not available in earlier IA-32 processors up through the earlier Intel486 processors.
For these processors, several other architectural features can be exploited to identify the processor.

The settings of bits 12 and 13 (IOPL), 14 (NT), and 15 (reserved) in the EFLAGS register are different for Intel’s 32-bit processors than for the Intel 8086 and Intel 286 processors. By examining the settings of these bits (with the PUSHF/PUSHFD and POPF/POPFD instructions), an application program can determine whether the processor is an 8086, Intel 286, or one of the Intel 32-bit processors:

• 8086 processor — Bits 12 through 15 of the EFLAGS register are always set.
• Intel 286 processor — Bits 12 through 15 are always clear in real-address mode.
• 32-bit processors — In real-address mode, bit 15 is always clear and bits 12 through 14 have the last value loaded into them. In protected mode, bit 15 is always clear, bit 14 has the last value loaded into it, and the IOPL bits depend on the current privilege level (CPL).
The IOPL field can be changed only if the CPL is 0.

Other EFLAGS register bits can be used to differentiate between the 32-bit processors:

• Bit 18 (AC) — Implemented only on the Pentium 4, Intel Xeon, P6 family, Pentium, and Intel486 processors. The inability to set or clear this bit distinguishes an Intel386 processor from the later IA-32 processors.
• Bit 21 (ID) — Determines if the processor is able to execute the CPUID instruction. The ability to set and clear this bit indicates that it is a Pentium 4, Intel Xeon, P6 family, Pentium, or later-version Intel486 processor.

To determine whether an x87 FPU or NPX is present in a system, applications can write to the x87 FPU status and control registers using the FNINIT instruction and then verify that the correct values are read back using the FNSTENV instruction.

After determining that an x87 FPU or NPX is present, its type can then be determined.
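The EFLAGS tests described in this section (bits 12 through 15, then AC, then ID) amount to a short decision procedure for the processor type. The Python sketch below models that procedure under stated assumptions: `load_and_read` is a hypothetical callback standing in for the PUSHF/POPF sequence executed in real-address mode, and the entries in `MODELS` are toy stand-ins for hardware, not measurements of real processors.

```python
HIGH_BITS = 0xF000   # EFLAGS bits 12 through 15 (IOPL, NT, reserved)
AC_BIT = 1 << 18     # alignment check flag
ID_BIT = 1 << 21     # flag that indicates CPUID support


def classify(load_and_read):
    """Classify a processor from how its flags register responds to writes.

    load_and_read(value) models loading `value` into the flags register
    and then reading the register back.
    """
    def can_toggle(bit):
        set_ok = (load_and_read(bit) & bit) != 0
        clear_ok = (load_and_read(0) & bit) == 0
        return set_ok and clear_ok

    if (load_and_read(0) & HIGH_BITS) == HIGH_BITS:
        return "8086"                    # bits 12-15 are always set
    if (load_and_read(HIGH_BITS) & HIGH_BITS) == 0:
        return "Intel 286"               # bits 12-15 are always clear
    if not can_toggle(AC_BIT):
        return "Intel386"                # AC flag is not implemented
    if not can_toggle(ID_BIT):
        return "Intel486 without CPUID"  # AC present, ID cannot be changed
    return "CPUID supported"


# Toy flag-register behaviors, used only to exercise classify().
MODELS = {
    "8086": lambda v: (v | HIGH_BITS) & 0xFFFF,
    "Intel 286": lambda v: v & 0x0FFF,
    "Intel386": lambda v: v & ~(0x8000 | AC_BIT | ID_BIT),
    "Intel486 without CPUID": lambda v: v & ~(0x8000 | ID_BIT),
    "CPUID supported": lambda v: v & ~0x8000,
}

RESULTS = {name: classify(model) for name, model in MODELS.items()}
```

The checks run from oldest to newest processor, so each later test is only reached when the earlier ones have ruled out the older parts.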
In most cases, the processor type will determine the type of FPU or NPX; however, an Intel386 processor is compatible with either an Intel 287 or Intel 387 math coprocessor.

The method the coprocessor uses to represent ∞ (after the execution of the FINIT, FNINIT, or RESET instruction) indicates which coprocessor is present. The Intel 287 math coprocessor uses the same bit representation for +∞ and −∞, whereas the Intel 387 math coprocessor uses different representations for +∞ and −∞.

APPENDIX A
EFLAGS CROSS-REFERENCE

A.1 EFLAGS AND INSTRUCTIONS

Table A-2 summarizes how the instructions affect the flags in the EFLAGS register. The following codes describe how the flags are affected.

Table A-1. Codes Describing Flags

T        Instruction tests flag.
M        Instruction modifies flag (either sets or resets depending on operands).
0        Instruction resets flag.
1        Instruction sets flag.
—        Instruction's effect on flag is undefined.
R        Instruction restores prior value of flag.
Blank    Instruction does not affect flag.

Table A-2. EFLAGS Cross-Reference

Instruction                     OF  SF  ZF  AF  PF  CF  TF  IF  DF  NT  RF
AAA                             —   —   —   TM  —   M
AAD                             —   M   M   —   M   —
AAM                             —   M   M   —   M   —
AAS                             —   —   —   TM  —   M
ADC                             M   M   M   M   M   TM
ADD                             M   M   M   M   M   M
AND                             0   M   M   —   M   0
ARPL                                    M
BOUND
BSF/BSR                         —   —   M   —   —   —
BSWAP
BT/BTS/BTR/BTC                  —   —   —   —   —   M
CALL
CBW
CLC                                                 0
CLD                                                             0
CLI                                                         0
CLTS
CMC                                                 M
CMOVcc                          T   T   T       T   T
CMP                             M   M   M   M   M   M
CMPS                            M   M   M   M   M   M           T
CMPXCHG                         M   M   M   M   M   M
CMPXCHG8B                               M
COMISD                          0   0   M   0   M   M
COMISS                          0   0   M   0   M   M
DAA                             —   M   M   TM  M   TM
DAS                             —   M   M   TM  M   TM
CPUID
CWD
DEC                             M   M   M   M   M
DIV                             —   —   —   —   —   —
ENTER
ESC
FCMOVcc                                 T       T   T
FCOMI, FCOMIP, FUCOMI, FUCOMIP          M       M   M
HLT
IDIV                            —   —   —   —   —   —
IMUL                            M   —   —   —   —   M
IN
INC                             M   M   M   M   M
INS                                                             T
INT                                                     0           0
INTO                            T                       0           0
INVD
INVLPG
UCOMISD                         0   0   M   0   M   M
UCOMISS                         0   0   M   0   M   M
IRET                            R   R   R   R   R   R   R   R   R   T
Jcc                             T   T   T       T   T
JCXZ
JMP
LAHF
LAR                                     M
LDS/LES/LSS/LFS/LGS
LEA
LEAVE
LGDT/LIDT/LLDT/LMSW
LOCK
LODS                                                            T
LOOP
LOOPE/LOOPNE                            T
LSL                                     M
LTR
MONITOR
MWAIT
MOV
MOV control, debug, test        —   —   —   —   —   —
MOVS                                                            T
MOVSX/MOVZX
MUL                             M   —   —   —   —   M
NEG                             M   M   M   M   M   M
NOP
NOT
OR                              0   M   M   —   M   0
OUT
OUTS                                                            T
POP/POPA
POPF                            R   R   R   R   R   R   R   R   R   R
PUSH/PUSHA/PUSHF
RCL/RCR 1                       M                   TM
RCL/RCR count                   —                   TM
ROL/ROR 1                       M                   M
ROL/ROR count                   —                   M
RSM                             M   M   M   M   M   M   M   M   M   M   M
RDMSR
RDPMC
RDTSC
REP/REPE/REPNE
RET
SAHF                                R   R   R   R   R
SAL/SAR/SHL/SHR 1               M   M   M   —   M   M
SAL/SAR/SHL/SHR count           —   M   M   —   M   M
SBB                             M   M   M   M   M   TM
SCAS                            M   M   M   M   M   M           T
SETcc                           T   T   T       T   T
SGDT/SIDT/SLDT/SMSW
SHLD/SHRD                       —   M   M   —   M   M
STC                                                 1
STD                                                             1
STI                                                         1
STOS                                                            T
STR
SUB                             M   M   M   M   M   M

Table A-2.