Volume 2 System Programming (794096), page 55
Two mechanisms are provided for software to control access to and cacheability of specific memory regions:

• The memory-type range registers (MTRRs) control cacheability based on physical addresses. See “MTRRs” on page 182 for more information on the use of MTRRs.
• The page-attribute table (PAT) mechanism controls cacheability based on virtual addresses. PAT extends the capabilities provided by the PCD and PWT page-level cache controls. See “Page-Attribute Table Mechanism” on page 191 for more information on the use of the PAT mechanism.

System software can combine the use of both the MTRRs and PAT mechanisms to maximize control over memory cacheability.

If the MTRRs are disabled in implementations that support the MTRR mechanism, the default memory type is set to uncacheable (UC).
Memory accesses are not cached even if the caches are enabled by clearing CR0.CD to 0. Cacheable memory types must be established using the MTRRs in order for memory accesses to be cached.

Cache Control Precedence. The cache-control mechanisms are used to define the memory type and cacheability of main memory and regions of main memory. Taken together, the most restrictive memory type takes precedence in defining the caching policy of memory.
The order of precedence is:

1. Uncacheable (UC)
2. Write-combining (WC)
3. Write-protected (WP)
4. Writethrough (WT)
5. Writeback (WB)

For example, assume a large memory region is designated a writethrough type using the MTRRs. Individual pages within that region can have caching disabled by setting the appropriate page-table PCD bits. However, no pages within that region can have a writeback caching policy, regardless of the page-table PWT values.

AMD64 Technology, 24593, Rev. 3.13, July 2007

7.6.3 Cache and Memory Management Instructions

Data Prefetch. The prefetch instructions are used by software as a hint to the processor that the referenced data is likely to be used in the near future.
The processor can preload the cache line containing the data in anticipation of its use. PREFETCH provides a hint that the data is to be read. PREFETCHW provides a hint that the data is to be written. The processor can mark the line as modified if it is preloaded using PREFETCHW.

Memory Ordering.
Instructions are provided for software to enforce memory ordering (serialization) in weakly-ordered memory types. These instructions are:

• SFENCE (store fence): forces all memory writes (stores) preceding the SFENCE (in program order) to be written into memory before memory writes following the SFENCE.
• LFENCE (load fence): forces all memory reads (loads) preceding the LFENCE (in program order) to be read from memory before memory reads following the LFENCE.
• MFENCE (memory fence): forces all memory accesses (reads and writes) preceding the MFENCE (in program order) to be written into or read from memory before memory accesses following the MFENCE.

Cache Line Flush. The CLFLUSH instruction (writeback, if modified, and invalidate) takes the byte memory-address operand (a linear address), and checks to see if the address is cached.
If the address is cached, the entire cache line containing the address is invalidated. If any portion of the cache line is dirty (in the modified or owned state), the entire line is written to main memory before it is invalidated. CLFLUSH affects all caches in the memory hierarchy, both internal and external to the processor.
The checking and invalidation process continues until the address has been invalidated in all caches.

In most cases, the underlying memory type assigned to the address has no effect on the behavior of this instruction. However, when the underlying memory type for the address is UC or WC (as defined by the MTRRs), the processor does not proceed with checking all caches to see if the address is cached. In both cases, the address is uncacheable, and invalidation is unnecessary.
Write-combining buffers are written back to memory if the corresponding physical address falls within the buffer active-address range.

Cache Writeback and Invalidate. Unlike the CLFLUSH instruction, the WBINVD instruction operates on the entire cache, rather than a single cache line. The WBINVD instruction first writes back all cache lines that are dirty (in the modified or owned state) to main memory. After writeback is complete, the instruction invalidates all cache lines. The checking and invalidation process continues until all internal caches are invalidated. A special bus cycle is transmitted to higher-level external caches directing them to perform a writeback-and-invalidate operation.

Cache Invalidate. The INVD instruction is used to invalidate all cache lines. Unlike the WBINVD instruction, dirty cache lines are not written to main memory. The process continues until all internal caches have been invalidated. A special bus cycle is transmitted to higher-level external caches directing them to perform an invalidation.

The INVD instruction should only be used in situations where memory coherency is not required.

7.6.4 Serializing Instructions

Serializing instructions force the processor to retire the serializing instruction and all previous instructions before the next instruction is fetched. A serializing instruction is retired when the following operations are complete:

• The instruction has executed.
• All registers modified by the instruction are updated.
• All memory updates performed by the instruction are complete.
• All data held in the write buffers have been written to memory.

Serializing instructions can be used as a barrier between memory accesses to force strong ordering of memory operations.
Care should be exercised in using serializing instructions because they modify processor state and affect program flow. The instructions also force execution serialization, which can significantly degrade performance. When strongly-ordered memory accesses are required, but execution serialization is not, it is recommended that software use the memory-ordering instructions described on page 179.

The following are serializing instructions:

• Non-Privileged Instructions
  - CPUID
  - IRET
  - RSM
• Privileged Instructions
  - MOV CRn
  - MOV DRn
  - LGDT, LIDT, LLDT, LTR
  - SWAPGS
  - WRMSR
  - WBINVD, INVD
  - INVLPG

7.7 Memory-Type Range Registers

The AMD64 architecture supports three mechanisms for software access-control and cacheability control over memory regions. These mechanisms can be used in place of similar capabilities provided by external chipsets used with early x86 processors.

This section describes a control mechanism that uses a set of programmable model-specific registers (MSRs) called the memory-type-range registers (MTRRs).
The MTRR mechanism provides system software with the ability to manage hardware-device memory mapping. System software can characterize physical-memory regions by type (e.g., ROM, flash, memory-mapped I/O) and assign hardware devices to the appropriate physical-memory type.

Another control mechanism is implemented as an extension to the page-translation capability and is called the page attribute table (PAT). It is described in “Page-Attribute Table Mechanism” on page 191.
Like the MTRRs, PAT provides system software with the ability to manage hardware-device memory mapping. With PAT, however, system software can characterize physical pages and assign virtually-mapped devices to those physical pages using the page-translation mechanism. PAT may be used in conjunction with the MTRR mechanism to maximize flexibility in memory control.

Finally, control mechanisms are provided for managing memory-mapped I/O. These mechanisms employ extensions to the MTRRs and a separate feature called the top-of-memory registers. The MTRR extensions include additional MTRR type-field encodings for fixed-range MTRRs and variable-range I/O range registers (IORRs). These mechanisms are described in “Memory-Mapped I/O” on page 195.

7.7.1 MTRR Type Fields

The MTRR mechanism provides a means for associating a physical-address range with a memory type (see “Memory Types” on page 168).
The MTRRs contain a type field used to specify the memory type in effect for a given physical-address range.

There are two variants of the memory type-field encodings: standard and extended. Both the standard and extended encodings use type-field bits 2–0 to specify the memory type. For the standard encodings, bits 7–3 are reserved and must be zero. For the extended encodings, bits 7–5 are reserved, but bits 4–3 are defined as the RdMem and WrMem bits. “Extended Fixed-Range MTRR Type-Field Encodings” on page 195 describes the function of these extended bits and how software enables them. Only the fixed-range MTRRs support the extended type-field encodings. Variable-range MTRRs use the standard encodings.

Table 7-5 on page 182 shows the memory types supported by the MTRR mechanism and their encoding in the MTRR type fields referenced throughout this section.
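As an illustration of the type-field layout described above, the following Python sketch decodes an 8-bit type field using the encodings listed in Table 7-5. It is a minimal model, not a description of processor behavior. The function name decode_type_field and the returned dictionary are hypothetical. The assignment of RdMem to bit 4 and WrMem to bit 3 is an assumption based on the order in which the text names the two bits, and any encoding not listed in Table 7-5 is simply rejected.

```python
# Memory-type encodings from Table 7-5 (type-field bits 2-0).
MEMORY_TYPES = {0x00: "UC", 0x01: "WC", 0x04: "WT", 0x05: "WP", 0x06: "WB"}


def decode_type_field(value, extended=False):
    """Decode an 8-bit MTRR type field (hypothetical helper).

    Standard encoding: bits 7-3 are reserved and must be zero.
    Extended encoding: bits 7-5 are reserved; bits 4-3 carry RdMem/WrMem.
    """
    if not 0 <= value <= 0xFF:
        raise ValueError("type field must fit in 8 bits")

    reserved_mask = 0xE0 if extended else 0xF8
    if value & reserved_mask:
        raise ValueError("reserved bits are set")

    type_bits = value & 0x07
    if type_bits not in MEMORY_TYPES:
        raise ValueError("encoding not listed in Table 7-5")

    result = {"type": MEMORY_TYPES[type_bits]}
    if extended:
        # Assumption: RdMem is bit 4 and WrMem is bit 3.
        result["rdmem"] = bool(value & 0x10)
        result["wrmem"] = bool(value & 0x08)
    return result
```

For example, decode_type_field(0x06) reports a WB type, while decode_type_field(0x08) raises ValueError because bit 3 is reserved in the standard encoding.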
Unless the extended type-field encodings are explicitly enabled, the processor uses the type values shown in Table 7-5.

Table 7-5. MTRR Type Field Encodings

Type Value   Type Name              Type Description
00h          UC (Uncacheable)       All accesses are uncacheable. Write combining is not allowed. Speculative accesses are not allowed.
01h          WC (Write-Combining)   All accesses are uncacheable. Write combining is allowed. Speculative reads are allowed.
04h          WT (Writethrough)      Reads allocate cache lines on a cache miss. Cache lines are not allocated on a write miss. Write hits update the cache and main memory.
05h          WP (Write-Protect)     Reads allocate cache lines on a cache miss. All writes update main memory. Cache lines are not allocated on a write miss. Write hits invalidate the cache line and update main memory.
06h          WB (Writeback)         Reads allocate cache lines on a cache miss, and can allocate to either the shared, exclusive, or modified state. Writes allocate to the modified state on a cache miss.

If the MTRRs are disabled in implementations that support the MTRR mechanism, the default memory type is set to uncacheable (UC).
Memory accesses are not cached even if the caches are enabled by clearing CR0.CD to 0. Cacheable memory types must be established using the MTRRs to enable memory accesses to be cached.

7.7.2 MTRRs

Both fixed-size and variable-size address ranges are supported by the MTRR mechanism. The fixed-size ranges are restricted to the lower 1 Mbyte of physical-address space, while the variable-size ranges can be located anywhere in the physical-address space.

Figure 7-4 on page 183 shows an example mapping of physical memory using the fixed-size and variable-size MTRRs. The areas shaded gray are not mapped by the MTRRs. Unmapped areas are set to the software-selected default memory type.

[Figure 7-4. MTRR Mapping of Physical Memory. From 0_FFFF_FFFF_FFFFh down to 10_0000h: default (unmapped) ranges and up to 8 variable ranges. From 0F_FFFFh down to 00_0000h: 64 4-Kbyte ranges (256 Kbytes), 16 16-Kbyte ranges (256 Kbytes), 8 64-Kbyte ranges (512 Kbytes).]

MTRRs are 64-bit model-specific registers (MSRs).
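The fixed-range portion of Figure 7-4 can be sketched in Python as a lookup from a physical address below 1 Mbyte to the fixed range that covers it. This is only a model of the layout. The base addresses assume that the three groups are stacked contiguously from 00_0000h with the 64-Kbyte ranges lowest, as the figure's top-to-bottom order suggests, and the names FIXED_RANGES and fixed_range_for are hypothetical.

```python
# Fixed-range groups below 1 Mbyte, lowest address first:
# (group base address, size of each range in bytes, number of ranges).
# Assumption: the groups are contiguous, starting at physical address 0.
FIXED_RANGES = [
    (0x00000, 64 * 1024, 8),   # 8 x 64-Kbyte ranges  = 512 Kbytes
    (0x80000, 16 * 1024, 16),  # 16 x 16-Kbyte ranges = 256 Kbytes
    (0xC0000, 4 * 1024, 64),   # 64 x 4-Kbyte ranges  = 256 Kbytes
]


def fixed_range_for(phys_addr):
    """Return (range_size, index_in_group, range_base) for a fixed range.

    Returns None when the address is not covered by a fixed range,
    for example any address at or above 1 Mbyte.
    """
    for base, size, count in FIXED_RANGES:
        if base <= phys_addr < base + size * count:
            index = (phys_addr - base) // size
            return size, index, base + index * size
    return None
```

Under these assumptions, an address such as 0B_8000h falls in one of the 16-Kbyte ranges, and any address at or above 10_0000h returns None because it can only be covered by a variable range or the default memory type.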