Volume 2 System Programming (794096), page 87
Writing the performance counters can be useful if software wants to count a specific number of events, and then trigger an interrupt when that count is reached. An interrupt can be triggered when a performance counter overflows (see “Counter Overflow” on page 344 for additional information). Software should use the WRMSR instruction to load the count as a two’s-complement negative number into the performance counter. This causes the counter to overflow after counting the appropriate number of times.

The performance counters are not guaranteed to produce identical measurements each time they are used to measure a particular instruction sequence, and they should not be used to take measurements of very small instruction sequences.
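As a minimal sketch of the two’s-complement load described above, the helper below computes the value that software would write with WRMSR so that a counter overflows after a chosen number of events. The 48-bit counter width and the name `overflow_preload` are assumptions made for illustration; the sketch only computes the value and does not access any MSR.

```python
COUNTER_BITS = 48  # assumption: implemented counter width in bits


def overflow_preload(event_count: int, counter_bits: int = COUNTER_BITS) -> int:
    """Return the two's-complement encoding of -event_count in counter_bits bits.

    Hypothetical helper: a counter loaded with this value wraps to zero
    after exactly event_count increments.
    """
    modulus = 1 << counter_bits
    if not 0 < event_count < modulus:
        raise ValueError("event_count must be positive and fit in the counter width")
    return (modulus - event_count) % modulus


if __name__ == "__main__":
    # Preload for an overflow after 1000 events on an assumed 48-bit counter.
    print(hex(overflow_preload(1000)))  # 0xfffffffffc18
```

Adding `event_count` to the returned value wraps the counter to zero, which is the overflow condition the text refers to.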
The RDPMC instruction is not serializing, and it can be executed out-of-order with respect to other instructions around it. Even when bound by serializing instructions, the system environment at the time the instruction is executed can cause events to be counted before the counter value is loaded into EDX:EAX.

13.3.2 Performance Event-Select Registers

Performance event-select registers (PerfEvtSeln) are 64-bit registers used to specify the events counted by the performance counters, and to control other aspects of their operation. Each performance counter supported by the implementation has a corresponding event-select register that controls its operation. Figure 13-7 on page 342 shows the format of the PerfEvtSeln register.

Debug and Performance Resources 341
AMD64 Technology 24593, Rev. 3.13, July 2007

Bits     Mnemonic   Description              R/W
63-42               Reserved
41       HO         Host Only                R/W
40       GO         Guest Only               R/W
39-36               Reserved
35-32               Event Mask[11-8]         R/W
31-24               Counter Mask             R/W
23       INV        Invert Mask              R/W
22       EN         Counter Enable           R/W
21                  Reserved
20       INT        Interrupt Enable         R/W
19       PC         Pin Control              R/W
18       E          Edge Detect              R/W
17       OS         Operating-System Mode    R/W
16       USR        User Mode                R/W
15-8                Unit Mask                R/W
7-0                 Event Mask[7-0]          R/W

Figure 13-7.
Performance Event-Select Register (PerfEvtSeln)

The fields within the PerfEvtSeln register are:

• Event Mask[11:8]: Bits 35-32, read/write. This field extends the Event Mask (bits 7-0) from 8 bits to 12 bits. See Event Mask, below.
• Guest Only (GO): Bit 40, read/write. Software sets this bit to 1 to enable counting in the corresponding PerfCtrn when the processor is in guest mode. Clearing this bit to 0 disables counting in the corresponding PerfCtrn when the processor is in guest mode. If GO = HO = 1, or if GO = HO = 0, counting in the corresponding PerfCtrn is enabled when the processor is in either guest mode or host mode.
• Host Only (HO): Bit 41, read/write. Software sets this bit to 1 to enable counting in the corresponding PerfCtrn when the processor is in host mode.
Clearing this bit to 0 disables counting in the corresponding PerfCtrn when the processor is in host mode. If GO = HO = 1, or if GO = HO = 0, counting in the corresponding PerfCtrn is enabled when the processor is in either guest mode or host mode.
• Event Mask: Bits 7-0, read/write. This field specifies the event or event duration to be counted by the corresponding PerfCtrn register. The events that can be counted are implementation dependent. For more information, refer to the BIOS writer’s guide for the implementation.
• Unit Mask: Bits 15-8, read/write.
This field can be used to specify a particular processor unit to be monitored, if the event counted can be produced by multiple units within the processor. Implementations can also use this field to further specify or qualify a monitored event.
• Operating-System Mode (OS) and User Mode (USR): Bits 17-16 (respectively), read/write. Software uses these bits to control the privilege level at which event counting is performed according to Table 13-3.

Table 13-3.
Operating-System Mode and User Mode Bits

OS Mode (Bit 17)   USR Mode (Bit 16)   Event Counting
0                  0                   No counting.
0                  1                   Only at CPL > 0.
1                  0                   Only at CPL = 0.
1                  1                   At all privilege levels.

• Edge Detect (E): Bit 18, read/write. Software sets this bit to 1 to count the number of edge transitions from the negated to asserted state.
This feature is useful when coupled with event-duration monitoring, as it can be used to calculate the average time spent in an event. Clearing this bit to 0 disables edge detection.
• Pin Control (PC): Bit 19, read/write. Software sets this bit to 1 to cause the external PMi pins on the processor to toggle when the counter overflows. When this bit is cleared to 0, the processor toggles the PMi pins each time it increments the performance counter.
• Interrupt Enable (INT): Bit 20, read/write.
Software sets this bit to 1 to enable an interrupt to occur when the performance counter overflows (see “Counter Overflow” on page 344 for additional information). Clearing this bit to 0 disables the triggering of the interrupt.
• Counter Enable (EN): Bit 22, read/write. Software sets this bit to 1 to enable the PerfEvtSeln register, and counting in the corresponding PerfCtrn register.
Clearing this bit to 0 disables the register pair.
• Invert Mask (INV): Bit 23, read/write. Software sets this bit to 1 to invert the comparison result performed on the counter-mask field, so that a less-than or equal-to comparison can be performed. Clearing this bit to 0 leaves the comparison result alone, so that a greater-than or equal-to comparison is reported.
• Counter Mask: Bits 31-24, read/write.
This field is used to set a threshold for counting multiple events that can occur in a single clock. If the number of events occurring in the single clock is greater than or equal to this field, the corresponding PerfCtrn register is incremented. PerfCtrn is not incremented if the number of events is less than the counter mask.
The INV bit, when set, causes the PerfCtrn register to be incremented when the comparison is less than or equal to the counter mask. In this case, PerfCtrn is not incremented if the number of events is greater than the counter mask.

The performance event-select registers can be read and written only by system software running at CPL = 0 using the RDMSR and WRMSR instructions, respectively. Any attempt to read or write these registers at CPL > 0 causes a general-protection exception to occur.

13.3.3 Using Performance Counters

Starting and Stopping.
Performance counting in a PerfCtrn register is initiated by setting the corresponding PerfEvtSeln.EN bit to 1. Counting is stopped by clearing PerfEvtSeln.EN to 0. Software must initialize the remaining PerfEvtSeln fields with the appropriate setup information before or at the same time EN is set. Counting begins when the WRMSR instruction that sets PerfEvtSeln.EN to 1 completes execution. Counting stops when the WRMSR instruction that clears PerfEvtSeln.EN to 0 completes execution.

Counter Overflow.
Some processor implementations support an interrupt-on-overflow capability that allows an interrupt to occur when one of the PerfCtrn registers overflows. The source and type of interrupt are implementation dependent. Some implementations cause a debug interrupt to occur, while others make use of the local APIC to specify the interrupt vector and trigger the interrupt when an overflow occurs.
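To make the EN and INT controls more concrete, the following sketch packs PerfEvtSeln fields into a 64-bit value and toggles the enable bit. Bit positions follow Figure 13-7. The constant and function names, and the event number 0x176 used in the example, are hypothetical and chosen only for illustration. The code only computes register values; writing them to hardware would still require WRMSR at CPL = 0.

```python
# Flag bit positions taken from Figure 13-7; the names are illustrative.
USR = 1 << 16         # count at CPL > 0
OS = 1 << 17          # count at CPL = 0
EDGE = 1 << 18        # edge detect
PC = 1 << 19          # pin control
INT = 1 << 20         # interrupt on overflow
EN = 1 << 22          # counter enable
INV = 1 << 23         # invert counter-mask comparison
GUEST_ONLY = 1 << 40
HOST_ONLY = 1 << 41

ALL_FLAGS = USR | OS | EDGE | PC | INT | EN | INV | GUEST_ONLY | HOST_ONLY


def encode_perf_evt_sel(event, unit_mask=0, counter_mask=0, flags=0):
    """Pack a 12-bit event number, unit mask, counter mask, and flag bits."""
    if not 0 <= event <= 0xFFF:
        raise ValueError("event must fit in 12 bits")
    if not 0 <= unit_mask <= 0xFF:
        raise ValueError("unit_mask must fit in 8 bits")
    if not 0 <= counter_mask <= 0xFF:
        raise ValueError("counter_mask must fit in 8 bits")
    if flags & ~ALL_FLAGS:
        raise ValueError("flags contains reserved or field bits")
    value = event & 0xFF              # Event Mask[7:0]  -> bits 7-0
    value |= unit_mask << 8           # Unit Mask        -> bits 15-8
    value |= counter_mask << 24       # Counter Mask     -> bits 31-24
    value |= (event >> 8) << 32       # Event Mask[11:8] -> bits 35-32
    return value | flags


def start(sel):
    """Return sel with EN set, as written to begin counting."""
    return sel | EN


def stop(sel):
    """Return sel with EN cleared, as written to stop counting."""
    return sel & ~EN


if __name__ == "__main__":
    # Hypothetical event 0x176, counted at all privilege levels,
    # with an interrupt requested on overflow.
    sel = encode_perf_evt_sel(0x176, unit_mask=0x03, flags=USR | OS | INT)
    print(hex(sel), hex(start(sel)))
```

Building the full value with EN clear and then writing `start(sel)` is one way to satisfy the requirement that the remaining fields are initialized before or at the same time EN is set.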
Software controls the triggering of an interrupt by setting or clearing the PerfEvtSeln.INT bit.

If system software makes use of the interrupt-on-overflow capability, an interrupt handler must be provided that can record information relevant to the counter overflow. Before returning from the interrupt handler, the performance counter can be re-initialized to its previous state so that another interrupt occurs when the appropriate number of events are counted.

13.3.4 Time-Stamp Counter

The time-stamp counter (TSC) is used to count processor-clock cycles.
The TSC is cleared to 0 after a processor reset. After a reset, the TSC is incremented by one for every processor clock cycle. Each time the TSC is read, it returns a monotonically-larger value than the previous value read from the TSC. When the TSC contains all ones, it wraps to zero. The TSC in a 1-GHz processor counts for almost 600 years before it wraps (2^64 cycles at 10^9 cycles per second is roughly 584 years). Figure 13-8 shows the format of the 64-bit time-stamp counter (TSC).

Bits     Description
63-0     TSC

Figure 13-8. Time-Stamp Counter (TSC)

The TSC is a model-specific register that can also be read using one of the special read time-stamp counter instructions, RDTSC (Read Time-Stamp Counter (TSC)) or RDTSCP (Read Time-Stamp