Volume 3B System Programming Guide, Part 2 (794104), page 16
DEBUGGING AND PERFORMANCE MONITORING

[Figure 18-27. IA-32e Mode DS Save Area. The IA32_DS_AREA MSR points to the DS buffer management area, whose fields occupy offsets 0H through 50H in 8-byte steps: BTS Buffer Base, BTS Index, BTS Absolute Maximum, BTS Interrupt Threshold, PEBS Buffer Base, PEBS Index, PEBS Absolute Maximum, PEBS Interrupt Threshold, PEBS Counter Reset, Reserved. The management area locates the BTS buffer (Branch Record 0 through Branch Record n) and the PEBS buffer (PEBS Record 0 through PEBS Record n).]

When IA-32e mode is active, the structure of a branch trace record is similar to that shown in Figure 18-25, but each field is 8 bytes in length.
This makes each BTS record 24 bytes (see Figure 18-28). The structure of a PEBS record is similar to that shown in Figure 18-26, but each field is 8 bytes in length and the architectural state includes registers R8 through R15. This makes the size of a PEBS record in 64-bit mode 144 bytes (see Figure 18-29).

[Figure 18-28. 64-bit Branch Trace Record Format. Fields are 64 bits wide (bits 63:0): Last Branch From at offset 0H, Last Branch To at 8H, and Branch Predicted (bit 4) in the field at 10H.]

[Figure 18-29. 64-bit PEBS Record Format. Fields are 64 bits wide (bits 63:0): RFLAGS at 0H, RIP at 8H, RAX at 10H, RBX at 18H, RCX at 20H, RDX at 28H, RSI at 30H, RDI at 38H, RBP at 40H, RSP at 48H, R8 at 50H, ..., R15 at 88H.]

18.15.6 Programming the Performance Counters for Non-Retirement Events

The basic steps to program a performance counter and to count events include the following:

1. Select the event or events to be counted.
2. For each event, select an ESCR that supports the event using the values in the ESCR restrictions row in Table A-5, Appendix A.
3. Match the CCCR Select value and ESCR name in Table A-5 to a value listed in Table 18-17; select a CCCR and performance counter.
4. Set up an ESCR for the specific event or events to be counted and the privilege levels at which they are to be counted.
5. Set up the CCCR for the performance counter by selecting the ESCR and the desired event filters.
6. Set up the CCCR for optional cascading of event counts, so that when the selected counter overflows its alternate counter starts.
7. Set up the CCCR to generate an optional performance monitor interrupt (PMI) when the counter overflows. If PMI generation is enabled, the local APIC must be set up to deliver the interrupt to the processor and a handler for the interrupt must be in place.
8. Enable the counter to begin counting.

18.15.6.1 Selecting Events to Count

Table A-6 in Appendix A lists a set of at-retirement events for the Pentium 4 and Intel Xeon processors. For each event listed in Table A-6, setup information is provided. Table 18-18 gives an example of one of the events.

Table 18-18. Event Example

Event Name: branch_retired
    Description: Counts the retirement of a branch. Specify one or more mask bits to select any combination of branch taken, not-taken, predicted and mispredicted.

Event Parameters:

    ESCR restrictions
        Parameter Value: MSR_CRU_ESCR2, MSR_CRU_ESCR3
        Description: See Table 15-3 for the addresses of the ESCR MSRs.

    Counter numbers per ESCR
        Parameter Value: ESCR2: 12, 13, 16
                         ESCR3: 14, 15, 17
        Description: The counter numbers associated with each ESCR are provided. The performance counters and corresponding CCCRs can be obtained from Table 15-3.

    ESCR Event Select
        Parameter Value: 06H
        Description: ESCR[31:25]

    ESCR Event Mask
        Description: ESCR[24:9]
        Parameter Value: Bit 0: MMNP    Branch Not-taken Predicted
                         Bit 1: MMNM    Branch Not-taken Mispredicted
                         Bit 2: MMTP    Branch Taken Predicted
                         Bit 3: MMTM    Branch Taken Mispredicted

    CCCR Select
        Parameter Value: 05H
        Description: CCCR[15:13]

    Event Specific Notes
        Description: P6: EMON_BR_INST_RETIRED

    Can Support PEBS
        Parameter Value: No

    Requires Additional MSRs for Tagging
        Parameter Value: No

For Table A-5 and Table A-6, Appendix A, the name of the event is listed in the Event Name column and parameters that define the event and other information are listed in the Event Parameters column.
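As an illustration of how the Table 18-18 values map onto register fields, the following minimal sketch packs the branch_retired parameters into the bit ranges the table gives (ESCR[31:25] for the event select, ESCR[24:9] for the event mask, CCCR[15:13] for the CCCR select). The function and constant names are hypothetical, and the sketch computes only these event-related fields; the other ESCR and CCCR flags would still have to be merged in before a WRMSR write.

```python
# Bit positions taken from Table 18-18; all names here are illustrative only.
ESCR_EVENT_SELECT_SHIFT = 25   # event select occupies ESCR[31:25] (7 bits)
ESCR_EVENT_MASK_SHIFT = 9      # event mask occupies ESCR[24:9] (16 bits)
CCCR_ESCR_SELECT_SHIFT = 13    # ESCR select occupies CCCR[15:13] (3 bits)


def escr_event_fields(event_select, mask_bits):
    """Pack an event select value and relative mask bit numbers into ESCR bits."""
    if not 0 <= event_select < (1 << 7):
        raise ValueError("event select must fit in ESCR[31:25]")
    mask = 0
    for bit in mask_bits:
        if not 0 <= bit < 16:
            raise ValueError("mask bit offsets are relative to bit 9 and span 0..15")
        mask |= 1 << bit
    return (event_select << ESCR_EVENT_SELECT_SHIFT) | (mask << ESCR_EVENT_MASK_SHIFT)


def cccr_select_field(cccr_select):
    """Place the CCCR Select value into CCCR[15:13]."""
    if not 0 <= cccr_select < (1 << 3):
        raise ValueError("CCCR select must fit in CCCR[15:13]")
    return cccr_select << CCCR_ESCR_SELECT_SHIFT


# branch_retired: event select 06H, mask bits 0..3 (MMNP, MMNM, MMTP, MMTM),
# CCCR select 05H.
escr_bits = escr_event_fields(0x06, [0, 1, 2, 3])
cccr_bits = cccr_select_field(0x05)
print(hex(escr_bits))  # 0xc001e00
print(hex(cccr_bits))  # 0xa000
```

Selecting a subset of the mask bits, for example only bit 3 (MMTM), would restrict the count to taken, mispredicted branches.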
The Parameter Value and Description columns give specific parameters for the event and additional description information. Entries in the Event Parameters column are described below.

• ESCR restrictions — Lists the ESCRs that can be used to program the event. Typically only one ESCR is needed to count an event.
• Counter numbers per ESCR — Lists which performance counters are associated with each ESCR. Table 18-17 gives the name of the counter and CCCR for each counter number. Typically only one counter is needed to count the event.
• ESCR event select — Gives the value to be placed in the event select field of the ESCR to select the event.
• ESCR event mask — Gives the value to be placed in the Event Mask field of the ESCR to select sub-events to be counted. The parameter value column defines the documented bits with relative bit position offset starting from 0, where the absolute bit position of relative offset 0 is bit 9 of the ESCR. All undocumented bits are reserved and should be set to 0.
• CCCR select — Gives the value to be placed in the ESCR select field of the CCCR associated with the counter to select the ESCR to be used to define the event. This value is not the address of the ESCR; it is the number of the ESCR from the Number column in Table 18-17.
• Event specific notes — Gives additional information about the event, such as the name of the same or a similar event defined for the P6 family processors.
• Can support PEBS — Indicates if PEBS is supported for the event (only supplied for at-retirement events listed in Table A-6).
• Requires additional MSR for tagging — Indicates which, if any, additional MSRs must be programmed to count the events (only supplied for the at-retirement events listed in Table A-6).

NOTE
The performance-monitoring events listed in Appendix A, “Performance-Monitoring Events,” are intended to be used as guides for performance tuning. The counter values reported are not guaranteed to be absolutely accurate and should be used as a relative guide for tuning. Known discrepancies are documented where applicable.

The following procedure shows how to set up a performance counter for basic counting; that is, the counter is set up to count a specified event indefinitely, wrapping around whenever it reaches its maximum count.
This procedure is continued through the following four sections.

Using information in Table A-5, Appendix A, an event to be counted can be selected as follows:

1. Select the event to be counted.
2. Select the ESCR to be used to select events to be counted from the ESCR restrictions field.
3. Select the number of the counter to be used to count the event from the Counter Numbers Per ESCR field.
4. Determine the name of the counter and the CCCR associated with the counter, and determine the MSR addresses of the counter, CCCR, and ESCR from Table 18-17.
5. Use the WRMSR instruction to write the ESCR Event Select and ESCR Event Mask values into the appropriate fields in the ESCR. At the same time set or clear the USR and OS flags in the ESCR as desired.
6. Use the WRMSR instruction to write the CCCR Select value into the appropriate field in the CCCR.

NOTE
Typically all the fields and flags of the CCCR will be written with one WRMSR instruction; however, in this procedure, several WRMSR writes are used to more clearly demonstrate the uses of the various CCCR fields and flags.

This setup procedure is continued in the next section, Section 18.15.6.2, “Filtering Events.”

18.15.6.2 Filtering Events

Each counter receives up to 4 input lines from the processor hardware from which it is counting events. The counter treats these inputs as binary inputs (input 0 has a value of 1, input 1 has a value of 2, input 2 has a value of 4, and input 3 has a value of 8).
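The binary weighting of the four input lines can be sketched as follows. This is only an illustrative model; `input_value` and `INPUT_WEIGHTS` are hypothetical names, and the hardware of course does this combination itself.

```python
# Weight of input lines 0..3 when they are read as one binary value.
INPUT_WEIGHTS = (1, 2, 4, 8)


def input_value(lines):
    """Combine four input-line states (truthy = asserted) into a value 0..15."""
    if len(lines) != len(INPUT_WEIGHTS):
        raise ValueError("expected the state of exactly 4 input lines")
    return sum(weight for weight, asserted in zip(INPUT_WEIGHTS, lines) if asserted)


print(input_value((True, False, False, False)))  # only input 0 asserted: 1
print(input_value((False, True, True, False)))   # inputs 1 and 2 asserted: 6
print(input_value((True, True, True, True)))     # all inputs asserted: 15
```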
When a counter is enabled, it adds this binary input value to the counter value on each clock cycle. For each clock cycle, the value added to the counter can then range from 0 (no event) to 15.

For many events, only the 0 input line is active, so the counter is merely counting the clock cycles during which the 0 input is asserted. However, for some events two or more input lines are used. Here, the counter's threshold setting can be used to filter events. The compare, complement, threshold, and edge fields control the filtering of counter increments by input value.

If the compare flag is set, then a “greater than” or a “less than or equal to” comparison of the input value vs. a threshold value can be made. The complement flag selects “less than or equal to” (flag set) or “greater than” (flag clear). The threshold field selects a threshold value from 0 to 15. For example, if the complement flag is cleared and the threshold field is set to 6, then any input value of 7 or greater on the 4 inputs to the counter will cause the counter to be incremented by 1, and any value less than 7 will cause an increment of 0 (or no increment) of the counter. Conversely, if the complement flag is set, any value from 0 to 6 will increment the counter and any value from 7 to 15 will not increment the counter. Note that when a threshold condition has been satisfied, the input to the counter is always 1, not the input value that is presented to the threshold filter.

The edge flag provides further filtering of the counter inputs when a threshold comparison is being made.
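As a rough model of the filtering this section describes, the sketch below computes the per-cycle increment a counter would receive for a sequence of input values. The function `filtered_increments` and its parameter names are hypothetical. The edge handling follows the rising-edge (false-to-true) behaviour this section attributes to the edge flag, and it assumes the filter output before the first cycle is false, which the text does not specify.

```python
def filtered_increments(inputs, compare=False, complement=False,
                        threshold=0, edge=False):
    """Return the value added to the counter on each cycle of `inputs`."""
    if not 0 <= threshold <= 15:
        raise ValueError("threshold must be in 0..15")
    increments = []
    previous = 0  # assumption: filter output before the first cycle is false
    for value in inputs:
        if not 0 <= value <= 15:
            raise ValueError("input values must be in 0..15")
        if not compare:
            # No threshold comparison: the raw input value is added.
            increments.append(value)
            continue
        if complement:
            passed = int(value <= threshold)   # "less than or equal to"
        else:
            passed = int(value > threshold)    # "greater than"
        if edge:
            # Count only a false-to-true transition of the filter output.
            increments.append(int(passed and not previous))
        else:
            increments.append(passed)
        previous = passed
    return increments


samples = [0, 6, 7, 15, 3, 9]
print(filtered_increments(samples, compare=True, threshold=6))
# [0, 0, 1, 1, 0, 1]
print(filtered_increments(samples, compare=True, complement=True, threshold=6))
# [1, 1, 0, 0, 1, 0]
print(filtered_increments(samples, compare=True, threshold=6, edge=True))
# [0, 0, 1, 0, 0, 1]
```

With the threshold at 6 and the complement flag clear, inputs of 7 or more each contribute 1, matching the example in the text; enabling the edge flag drops the second of two consecutive qualifying cycles.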
The edge flag is only active when the compare flag is set. When the edge flag is set, the resulting output from the threshold filter (a value of 0 or 1) is used as an input to the edge filter. Each clock cycle, the edge filter examines the last and current input values and sends a count to the counter only when it detects a “rising edge” event; that is, a false-to-true transition. Figure 18-30 illustrates rising edge filtering.

The following procedure shows how to configure a CCCR to filter events using the threshold filter and the edge filter. This procedure is a continuation of the setup procedure introduced in Section 18.15.6.1, “Selecting Events to Count.”

7. (Optional) To set up the counter for threshold filtering, use the WRMSR instruction to write values in the CCCR compare and complement flags and the threshold field:
   — Set the compare flag.
   — Set or clear the complement flag for less than or equal to or greater than comparisons, respectively.
   — Enter a value from 0 to 15 in the threshold field.
8.