Volume 3B System Programming Guide, Part 2 (794104), page 66
To support guest real-mode execution, the VMM may establish a simple flat page table for guest linear to host physical address mapping. Memory virtualization algorithms may also need to capture other guest operating conditions (such as the guest performing A20M# address masking) to map the resulting 20-bit effective guest physical addresses.

26.3.2 Guest & Host Physical Address Spaces

Memory virtualization provides guest software with a contiguous guest physical address space starting at zero and extending to the maximum address supported by the guest virtual processor’s physical address width. The VMM utilizes guest physical to host physical address mapping to locate all or portions of the guest physical address space in host memory.
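As a rough illustration of a guest physical to host physical mapping, the sketch below keeps a per-page lookup table. Everything in it is an assumption made for the example and not part of the architecture: the class name `GuestPhysicalMap`, the 4-KByte mapping granularity, the 32-bit guest physical address width, and the addresses used.

```python
PAGE_SHIFT = 12                  # assumption: 4-KByte mapping granularity
PAGE_SIZE = 1 << PAGE_SHIFT


class GuestPhysicalMap:
    """Hypothetical guest-physical to host-physical map kept by a VMM."""

    def __init__(self, guest_phys_bits):
        # Guest physical space runs from 0 up to 2**guest_phys_bits - 1.
        self.limit = 1 << guest_phys_bits
        self.frames = {}         # guest frame number -> host frame number

    def map_page(self, gpa, hpa):
        """Back one guest page with one host page (both page-aligned)."""
        if gpa % PAGE_SIZE or hpa % PAGE_SIZE:
            raise ValueError("addresses must be page-aligned")
        if not 0 <= gpa < self.limit:
            raise ValueError("guest physical address out of range")
        self.frames[gpa >> PAGE_SHIFT] = hpa >> PAGE_SHIFT

    def translate(self, gpa):
        """Return the host physical address, or None if nothing backs gpa."""
        if not 0 <= gpa < self.limit:
            return None
        hfn = self.frames.get(gpa >> PAGE_SHIFT)
        if hfn is None:
            return None          # unbacked: the VMM would have to intervene
        return (hfn << PAGE_SHIFT) | (gpa & (PAGE_SIZE - 1))


gmap = GuestPhysicalMap(guest_phys_bits=32)
gmap.map_page(0x0000_0000, 0x4000_0000)   # guest page 0 placed at host 1 GByte
gmap.map_page(0x0000_1000, 0x7fff_f000)   # host pages need not be contiguous
```

The guest sees a contiguous range starting at zero, while the backing host pages may lie anywhere in host memory. A real VMM would hold this mapping in its own page tables, not in a dictionary.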
The VMM is responsible for the policies and algorithms for this mapping, which may take into account the host system physical memory map and the virtualized physical memory map exposed to a guest by the VMM. The memory virtualization algorithm needs to accommodate various guest memory uses (such as accessing DRAM, accessing memory-mapped registers of virtual devices or core logic functions, and so forth). For example:

• To support guest DRAM access, the VMM needs to map DRAM-backed guest physical addresses to host-DRAM regions. The VMM also requires the guest to host memory mapping to be at page granularity.
• Virtual devices (I/O devices or platform core logic) emulated by the VMM may claim specific regions in the guest physical address space to locate memory-mapped registers. Guest access to these virtual registers may be configured to cause page-fault induced VM exits by marking these regions as always not present. The VMM may handle these VM exits by invoking appropriate virtual device emulation code.

26.3.3 Virtualizing Virtual Memory by Brute Force

VMX provides the hardware features required to fully virtualize guest virtual memory accesses. VMX allows the VMM to trap guest accesses to the PAT (Page Attribute Table) MSR and the MTRRs (Memory Type Range Registers).
This control allows the VMM to virtualize the specific memory type of guest memory. The VMM may control caching by controlling the guest CR0.CD and CR0.NW bits, as well as by trapping guest execution of the INVD instruction. The VMM can trap guest CR3 loads and stores, and it may trap guest execution of INVLPG.

Because a VMM must retain control of physical memory, it must also retain control over the processor’s address-translation mechanisms. Specifically, this means that only the VMM can access CR3 (which contains the base of the page directory) and can execute INVLPG (the only other instruction that directly manipulates the TLB).

At the same time that the VMM controls address translation, a guest operating system will also expect to perform normal memory management functions. It will access CR3, execute INVLPG, and modify (what it believes to be) page directories and page tables. Virtualization of address translation must tolerate and support guest attempts to control address translation.

A simple-minded way to do this would be to ensure that all guest attempts to access address-translation hardware trap to the VMM, where such operations can be properly emulated. The VMM must also ensure that accesses to page directories and page tables are trapped.
This may be done by protecting these in-memory structures with conventional page-based protection. The VMM can locate the page directory because its base address is in CR3 and the VMM receives control on any change to CR3; it can locate the page tables because their base addresses are in the page directory.

Such a straightforward approach is not necessarily desirable. Protection of the in-memory translation structures may be cumbersome. The VMM may maintain these structures with different values (e.g., different page base addresses) than guest software expects. This means that there must be traps on guest attempts to read these structures and that the VMM must maintain, in auxiliary data structures, the values to return for these reads. There must also be traps on modifications to these structures even if the translations they effect are never used.
All this implies considerable overhead that should be avoided.

26.3.4 Alternate Approach to Memory Virtualization

Guest software is allowed to freely modify the guest page-table hierarchy without causing traps to the VMM. Because of this, the active page-table hierarchy might not always be consistent with the guest hierarchy. Any potential problems arising from inconsistencies can be solved using techniques analogous to those used by the processor and its TLB.

This section describes an alternative approach that allows guest software to freely access page directories and page tables.
Traps occur on CR3 accesses and executions of INVLPG. They also occur when necessary to ensure that guest modifications to the translation structures actually take effect. The software mechanisms to support this approach are collectively called the virtual TLB, because they emulate the functionality of the processor’s physical translation look-aside buffer (TLB).

The basic idea behind the virtual TLB is similar to that behind the processor TLB. While the page-table hierarchy defines the relationship between linear and physical addresses, it does not directly control the address translation of each memory access. Instead, translation is controlled by the TLB, which is occasionally filled by the processor with translations derived from the page-table hierarchy. With a virtual TLB, the page-table hierarchy established by guest software (specifically, the guest operating system) does not control translation, either directly or indirectly. Instead, translation is controlled by the processor (through its TLB) and by the VMM (through a page-table hierarchy that it maintains).

Specifically, the VMM maintains an alternative page-table hierarchy that effectively caches translations derived from the hierarchy maintained by guest software. The remainder of this document refers to the former as the active page-table hierarchy (because it is referenced by CR3 and may be used by the processor to load its TLB) and the latter as the guest page-table hierarchy (because it is maintained by guest software). The entries in the active hierarchy may resemble the corresponding entries in the guest hierarchy in some ways and may differ in others.

Guest software is allowed to freely modify the guest page-table hierarchy without causing VM exits to the VMM. Because of this, the active page-table hierarchy might not always be consistent with the guest hierarchy. Any potential problems arising from any inconsistencies can be solved using techniques analogous to those used by the processor and its TLB.
Note the following:

• Suppose the guest page-table hierarchy allows more access than the active hierarchy (for example, there is a translation for a linear address in the guest hierarchy but not in the active hierarchy); this is analogous to a situation in which the TLB allows less access than the page-table hierarchy. If an access occurs that would be allowed by the guest hierarchy but not the active one, a page fault occurs; this is analogous to a TLB miss. The VMM gains control (as it handles all page faults) and can update the active page-table hierarchy appropriately; this corresponds to a TLB fill.
• Suppose the guest page-table hierarchy allows less access than the active hierarchy; this is analogous to a situation in which the TLB allows more access than the page-table hierarchy. This situation can occur only if the guest operating system has modified a page-table entry to reduce access (for example, by marking it not-present). Because the older, more permissive translation may have been cached in the TLB, the processor is architecturally permitted to use the older translation and allow more access. Thus, the VMM may (through the active page-table hierarchy) also allow greater access. For the new, less permissive translation to take effect, guest software should flush any older translations from the TLB either by executing INVLPG or by loading CR3. Because both these operations will cause a trap to the VMM, the VMM will gain control and can remove from the active page-table hierarchy the translations indicated by guest software (the translation of a specific linear address for INVLPG or all translations for a load of CR3).

As noted previously, the processor reads the page-table hierarchy to cache translations in the TLB. It also writes to the hierarchy to maintain the accessed (A) and dirty (D) bits in the PDEs and PTEs.
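The two cases in the list above (a fill on a page fault, and a flush on INVLPG or a load of CR3) can be modeled in a few lines. This is only a sketch under simplifying assumptions: the class `VirtualTLB` and its method names are hypothetical, each page-table hierarchy is flattened into a dictionary from linear page number to frame number, and the conversion of guest frames to host frames is left out.

```python
class VirtualTLB:
    """Hypothetical model: page tables flattened to {linear page: frame}."""

    def __init__(self, guest_table):
        self.guest = guest_table     # written freely by guest software
        self.active = {}             # maintained by the VMM, used by hardware

    def access(self, page):
        """Translate through the active hierarchy, as the processor would."""
        if page not in self.active:
            return self.page_fault(page)
        return self.active[page]

    def page_fault(self, page):
        """VM exit on a page fault: fill from the guest hierarchy if allowed."""
        if page not in self.guest:
            raise LookupError("genuine guest page fault; reflect to guest")
        self.active[page] = self.guest[page]     # analogous to a TLB fill
        return self.active[page]

    def invlpg(self, page):
        """Trapped INVLPG: drop the translation for one linear page."""
        self.active.pop(page, None)

    def load_cr3(self, guest_table):
        """Trapped CR3 load: switch hierarchies and drop all translations."""
        self.guest = guest_table
        self.active = {}


guest = {0x10: 0xA0}
vtlb = VirtualTLB(guest)
first = vtlb.access(0x10)    # miss in the active hierarchy, then a fill
del guest[0x10]              # guest removes the mapping without trapping
stale = vtlb.access(0x10)    # older translation is still used, as permitted
vtlb.invlpg(0x10)            # only now does the removal take effect
```

After the `invlpg` call, another access to page `0x10` reaches `page_fault`, finds no guest translation, and raises `LookupError`, which stands in for reflecting the fault to the guest.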
The virtual TLB emulates the processor’s maintenance of the accessed and dirty bits as follows:

• When a page is accessed by guest software, the A bit in the corresponding PTE (or PDE for a 4-MByte page) in the active page-table hierarchy will be set by the processor (the same is true for PDEs when active page tables are accessed by the processor). For guest software to operate properly, the VMM should update the A bit in the guest entry at this time. It can do this reliably if it keeps the active PTE (or PDE) marked not-present until it has set the A bit in the guest entry.
• When a page is written by guest software, the D bit in the corresponding PTE (or PDE for a 4-MByte page) in the active page-table hierarchy will be set by the processor. For guest software to operate properly, the VMM should update the D bit in the guest entry at this time. It can do this reliably if it keeps the active PTE (or PDE) marked read-only until it has set the D bit in the guest entry. This solution is valid for guest software running at privilege level 3; support for more privileged guest software is described in Section 26.3.5.

26.3.5 Details of Virtual TLB Operation

This section describes in more detail how a VMM could support a virtual TLB. It explains how an active page-table hierarchy is initialized and how it is maintained in response to page faults, uses of INVLPG, and accesses to CR3. The mechanisms described here are the minimum necessary.
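The accessed- and dirty-bit handling outlined at the end of Section 26.3.4 can be illustrated with a small model. The names `GuestPTE`, `ActivePTE`, and `handle_page_fault` are hypothetical, the entries carry only the bits needed for the example, and privilege levels are not modeled, so this is a sketch of the idea and not a complete handler.

```python
from dataclasses import dataclass


@dataclass
class GuestPTE:
    writable: bool = True
    accessed: bool = False
    dirty: bool = False


@dataclass
class ActivePTE:
    present: bool = False     # kept clear until the guest A bit is set
    writable: bool = False    # kept clear until the guest D bit is set


def handle_page_fault(guest, active, is_write):
    """Hypothetical VMM handler for a fault on a page the guest has mapped.

    Returns True if the access may be retried, False if the fault belongs
    to the guest (a write to a page the guest itself made read-only).
    """
    if is_write and not guest.writable:
        return False
    guest.accessed = True             # set A before making the page present
    active.present = True
    if is_write:
        guest.dirty = True            # set D before allowing writes
        active.writable = True
    return True


g, a = GuestPTE(), ActivePTE()
read_ok = handle_page_fault(g, a, is_write=False)    # first read of the page
after_read = (g.accessed, g.dirty, a.present, a.writable)
write_ok = handle_page_fault(g, a, is_write=True)    # first write faults too
after_write = (g.accessed, g.dirty, a.present, a.writable)
```

Because the active entry stays read-only after the first read, the first write still faults, which gives the VMM the chance to set the D bit in the guest entry before the write proceeds.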
These minimal mechanisms may not result in the best performance.

[Figure: "Virtual TLB" diagram. Labels: Active Page-Table Hierarchy; Guest Page-Table Hierarchy; TLB; active and guest CR3; refill on TLB miss; set dirty, accessed; refill on page fault; set accessed and dirty bits; INVLPG, MOV to CR3, task switch. Legend: PD = page directory, PT = page table, F = page frame.]

Figure 26-1.