Volume 3B System Programming Guide, Part 2 (794104), page 67
Virtual TLB Scheme
As noted above, the VMM maintains an active page-table hierarchy for each virtual machine that it supports. It also maintains, for each machine, values that the machine expects for control registers CR0, CR2, CR3, and CR4 (they control address translation). These values are called the guest control registers.
In general, the VMM selects the physical-address space that is allocated to guest software. The term guest address refers to an address installed by guest software in the guest CR3, in a guest PDE (as a page table base address or a page base address), or in a guest PTE (as a page base address).
While guest software considers these to be specific physical addresses, the VMM may map them differently.

26.3.5.1 Initialization of Virtual TLB
To enable the Virtual TLB scheme, the VMCS must be set up to trigger VM exits on:
• All writes to CR3 (the CR3-target count should be 0) or the paging-mode bits in CR0 and CR4 (using the CR0 and CR4 guest/host masks)
• Page-fault (#PF) exceptions
• Execution of INVLPG
Vol. 3 26-7 VIRTUALIZATION OF SYSTEM RESOURCES
When guest software first enables paging, the VMM creates an aligned 4-KByte active page directory that is invalid (all entries marked not-present).
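The initialization described above can be modeled in a few lines. The following is only a sketch: the dictionary keys and function names are hypothetical labels invented for illustration, not a real VMCS access API, and the choice of which CR0 and CR4 bits count as "paging-mode bits" is an assumption. The bit positions themselves (CR0.PE, CR0.WP, CR0.PG, CR4.PSE, CR4.PAE, CR4.PGE, #PF as vector 14, and the "INVLPG exiting" and "CR3-load exiting" control bits) follow the architectural definitions. A real VMM must also honor the reserved-bit settings reported by the VMX capability MSRs, which this model ignores.

```python
# Architectural bit positions.
CR0_PE = 1 << 0     # protection enable
CR0_WP = 1 << 16    # write protect
CR0_PG = 1 << 31    # paging
CR4_PSE = 1 << 4    # page size extensions
CR4_PAE = 1 << 5    # physical address extension
CR4_PGE = 1 << 7    # page global enable
PF_VECTOR = 14                   # #PF exception vector
CTL_INVLPG_EXITING = 1 << 9      # primary processor-based control
CTL_CR3_LOAD_EXITING = 1 << 15   # primary processor-based control

PD_ENTRIES = 1024   # 32-bit entries in a 4-KByte page directory (no PAE)


def virtual_tlb_vmcs_settings():
    """Return a hypothetical dict of VMCS fields for the virtual TLB scheme."""
    return {
        # With no CR3 targets, every guest write to CR3 causes a VM exit.
        "cr3_target_count": 0,
        "proc_based_controls": CTL_CR3_LOAD_EXITING | CTL_INVLPG_EXITING,
        # Assumed set of paging-mode bits owned by the VMM.
        "cr0_guest_host_mask": CR0_PE | CR0_WP | CR0_PG,
        "cr4_guest_host_mask": CR4_PSE | CR4_PAE | CR4_PGE,
        # Bit 14 set, with error-code mask and match both 0,
        # makes every page fault cause a VM exit.
        "exception_bitmap": 1 << PF_VECTOR,
        "pf_error_code_mask": 0,
        "pf_error_code_match": 0,
    }


def new_invalid_page_directory():
    """Model an active page directory with every entry not-present (P = 0)."""
    return [0] * PD_ENTRIES
```

Because every entry starts at zero, the first guest access to any linear address faults into the VMM, which can then populate the active hierarchy on demand.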
This invalid directory is analogous to an empty TLB.

26.3.5.2 Response to Page Faults
Page faults can occur for a variety of reasons. In some cases, the page fault alerts the VMM to an inconsistency between the active and guest page-table hierarchies. In such cases, the VMM can update the former and re-execute the faulting instruction. In other cases, the hierarchies are already consistent and the fault should be handled by the guest operating system. The VMM can detect this and use an established mechanism for raising a page fault to guest software.
The VMM can handle a page fault by following the steps below. (These steps assume the guest is operating in a paging mode without PAE. Analogous steps to handle address translation using PAE or four-level paging mechanisms can be derived by VMM developers according to the paging behavior defined in Chapter 3 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A.)
1. First consult the active PDE, which can be located using the upper 10 bits of the faulting address and the current value of CR3. The active PDE is the source of the fault if it is marked not-present or if its R/W and U/S bits are inconsistent with the attempted guest access (the guest privilege level and the value of CR0:WP should also be taken into account).
2. If the active PDE is the source of the fault, consult the corresponding guest PDE using the same 10 bits from the faulting address and the physical address that corresponds to the guest address in the guest CR3. If the guest PDE would cause a page fault (for example, it is marked not-present), then raise a page fault to the guest operating system.
The following steps assume that the guest PDE would not have caused a page fault.
3. If the active PDE is the source of the fault and the guest PDE contains, as page-table base address (if PS = 0) or page base address (if PS = 1), a guest address that the VMM has chosen not to support, then raise a machine check (or some other abort) to the guest operating system.
The following steps assume that the guest address in the guest PDE is supported for the virtual machine.
4. If the active PDE is marked not-present, then set the active PDE to correspond to the guest PDE as follows:
   a. If the active PDE contains a page-table base address (if PS = 0), then allocate an aligned 4-KByte active page table marked completely invalid and set the page-table base address in the active PDE to be the physical address of the newly allocated page table.
   b. If the active PDE contains a page base address (if PS = 1), then set the page base address in the active PDE to be the physical page base address that corresponds to the guest address in the guest PDE.
   c. Set the P, U/S, and PS bits in the active PDE to be identical to those in the guest PDE.
   d. Set the PWT, PCD, and G bits according to the policy of the VMM.
   e. Set A = 1 in the guest PDE.
   f. If D = 1 in the guest PDE or PS = 0 (meaning that this PDE refers to a page table), then set the R/W bit in the active PDE as in the guest PDE.
   g. If D = 0 in the guest PDE, PS = 1 (this is a 4-MByte page), and the attempted access is a write, then set R/W in the active PDE as in the guest PDE and set D = 1 in the guest PDE.
   h. If D = 0 in the guest PDE, PS = 1, and the attempted access is not a write, then set R/W = 0 in the active PDE.
   i. After modifying the active PDE, re-execute the faulting instruction.
The remaining steps assume that the active PDE is already marked present.
5. If the active PDE is the source of the fault, the active PDE refers to a 4-MByte page (PS = 1), the attempted access is a write, D = 0 in the guest PDE, and the active PDE has caused a fault solely because it has R/W = 0, then set R/W in the active PDE as in the guest PDE, set D = 1 in the guest PDE, and re-execute the faulting instruction.
6. If the active PDE is the source of the fault and none of the above cases apply, then raise a page fault to the guest operating system.
The remaining steps assume that the source of the original page fault is not the active PDE.
NOTE
It is possible that the active PDE might be causing a fault even though the guest PDE would not. However, this can happen only if the guest operating system increased access in the guest PDE and did not take action to ensure that older translations were flushed from the TLB. Such translations might have caused a page fault if the guest software were running on bare hardware.
7. If the active PDE refers to a 4-MByte page (PS = 1) but is not the source of the fault, then the fault resulted from an inconsistency between the active page-table hierarchy and the processor’s TLB.
Since the transition to the VMM caused an address-space change and flushed the processor’s TLB, the VMM can simply re-execute the faulting instruction.
The remaining steps assume that PS = 0 in the active and guest PDEs.
8. Consult the active PTE, which can be located using the next 10 bits of the faulting address (bits 21–12) and the physical page-table base address in the active PDE. The active PTE is the source of the fault if it is marked not-present or if its R/W and U/S bits are inconsistent with the attempted guest access (the guest privilege level and the value of CR0:WP should also be taken into account).
9. If the active PTE is not the source of the fault, then the fault has resulted from an inconsistency between the active page-table hierarchy and the processor’s TLB. Since the transition to the VMM caused an address-space change and flushed the processor’s TLB, the VMM simply re-executes the faulting instruction.
The remaining steps assume that the active PTE is the source of the fault.
10. Consult the corresponding guest PTE using the same 10 bits from the faulting address and the physical address that corresponds to the guest page-table base address in the guest PDE. If the guest PTE would cause a page fault (for example, it is marked not-present), then raise a page fault to the guest operating system.
The following steps assume that the guest PTE would not have caused a page fault.
11. If the guest PTE contains, as page base address, a physical address that is not valid for the virtual machine being supported, then raise a machine check (or some other abort) to the guest operating system.
The following steps assume that the address in the guest PTE is valid for the virtual machine.
12. If the active PTE is marked not-present, then set the active PTE to correspond to the guest PTE:
   a. Set the page base address in the active PTE to be the physical address that corresponds to the guest page base address in the guest PTE.
   b. Set the P, U/S, and PS bits in the active PTE to be identical to those in the guest PTE.
   c. Set the PWT, PCD, and G bits according to the policy of the VMM.
   d. Set A = 1 in the guest PTE.
   e. If D = 1 in the guest PTE, then set the R/W bit in the active PTE as in the guest PTE.
   f. If D = 0 in the guest PTE and the attempted access is a write, then set R/W in the active PTE as in the guest PTE and set D = 1 in the guest PTE.
   g. If D = 0 in the guest PTE and the attempted access is not a write, then set R/W = 0 in the active PTE.
   h. After modifying the active PTE, re-execute the faulting instruction.
The remaining steps assume that the active PTE is already marked present.
13. If the attempted access is a write, D = 0 (not dirty) in the guest PTE, and the active PTE has caused a fault solely because it has R/W = 0 (read-only), then set R/W in the active PTE as in the guest PTE, set D = 1 in the guest PTE, and re-execute the faulting instruction.
14. If none of the above cases apply, then raise a page fault to the guest operating system.

26.3.5.3 Response to Uses of INVLPG
Operating systems can use INVLPG to flush entries from the TLB. This instruction takes a linear address as an operand, and software expects any cached translations for the address to be flushed.
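Before turning to the details of INVLPG handling, the PTE-level part of the page-fault procedure in Section 26.3.5.2 (steps 9 through 14) can be sketched in code. This is a simplified model under stated assumptions, not a definitive implementation: page tables are plain Python lists of 32-bit entries, CR0.WP is assumed to be 1, the PWT, PCD, and G policy bits are left at 0, only P and U/S are copied in step 12b, and `guest_to_host` is a hypothetical callback that returns the physical page base for a guest address or `None` when the VMM does not support that address. The function names and return strings are likewise invented for illustration.

```python
P, RW, US, A, D = 1 << 0, 1 << 1, 1 << 2, 1 << 5, 1 << 6  # entry flag bits
ADDR_MASK = 0xFFFFF000  # page base address, bits 31-12


def would_fault(entry, write, user):
    """True if this entry alone denies the access (assumes CR0.WP = 1)."""
    if not entry & P:
        return True
    if user and not entry & US:
        return True
    return bool(write and not entry & RW)


def handle_pte_fault(active_pt, guest_pt, idx, write, user, guest_to_host):
    """Return 'reexecute', 'guest_page_fault', or 'machine_check'."""
    active, guest = active_pt[idx], guest_pt[idx]
    # Step 9: the active PTE permits the access, so a stale TLB entry
    # caused the fault; the VM exit already flushed it.
    if not would_fault(active, write, user):
        return "reexecute"
    # Step 10: the guest's own PTE forbids the access.
    if would_fault(guest, write, user):
        return "guest_page_fault"
    # Step 11: the guest address must be one the VMM supports.
    host_base = guest_to_host(guest & ADDR_MASK)
    if host_base is None:
        return "machine_check"
    if not active & P:
        # Step 12: build the active PTE from the guest PTE.
        active = host_base | (guest & (P | US))   # 12a, 12b
        guest |= A                                # 12d
        if write:
            guest |= D                            # 12f: write dirties the page
        if guest & D:
            active |= guest & RW                  # 12e, 12f
        # 12g: otherwise R/W stays 0 so the first write faults again.
    elif (write and not guest & D
          and not would_fault(active | RW, write, user)):
        # Step 13: fault was solely due to R/W = 0 used to track D.
        active |= guest & RW
        guest |= D
    else:
        return "guest_page_fault"                 # step 14
    active_pt[idx], guest_pt[idx] = active, guest
    return "reexecute"
```

In this model a first read of a clean page installs a read-only active PTE and sets A in the guest PTE. A later write then takes the step 13 path, which restores R/W and sets D in the guest PTE, so the guest observes the accessed and dirty bits it would see on bare hardware.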
The VMM should set the processor-based VMCS execution control "INVLPG exiting" to 1, so that any attempt by a privileged guest to execute INVLPG will trap to the VMM (attempts to execute INVLPG by an unprivileged guest are managed by the exception bitmap control in the VMCS). The VMM can then modify the active page-table hierarchy to emulate the desired effect of the INVLPG.
The following steps are performed. Note that these steps are performed only if the guest invocation of INVLPG would not fault and only if the guest software is running at privilege level 0:
1. Locate the relevant active PDE using the upper 10 bits of the operand address and the current value of CR3.