Volume 3B System Programming Guide, Part 2 (794104), page 88
PERFORMANCE-MONITORING EVENTS

Table A-14. Events That Can Be Counted with the P6 Family Performance-Monitoring Counters (Contd.)

| Unit | Event Num. | Mnemonic Event Name | Unit Mask | Description | Comments |
|---|---|---|---|---|---|
| | 10H | FP_COMP_OPS_EXE | 00H | Number of computational floating-point operations executed. The number of FADDs, FSUBs, FCOMs, FMULs, integer MULs and IMULs, FDIVs, FPREMs, FSQRTs, integer DIVs, and IDIVs. This number is the number of operations, not the number of cycles. This event does not distinguish an FADD used in the middle of a transcendental flow from a separate FADD instruction. | Counter 0 only. |
| | 11H | FP_ASSIST | 00H | Number of floating-point exception cases handled by microcode. | Counter 1 only. This event includes counts due to speculative execution. |
| | 12H | MUL | 00H | Number of multiplies. This count includes integer as well as FP multiplies and is speculative. | Counter 1 only. |
| | 13H | DIV | 00H | Number of divides. This count includes integer as well as FP divides and is speculative. | Counter 1 only. |
| | 14H | CYCLES_DIV_BUSY | 00H | Number of cycles during which the divider is busy and cannot accept new divides. This includes integer and FP divides, FPREM, FSQRT, etc. and is speculative. | Counter 0 only. |
| Memory Ordering | 03H | LD_BLOCKS | 00H | Number of load operations delayed due to store buffer blocks. Includes counts caused by preceding stores whose addresses are unknown, preceding stores whose addresses are known but whose data is unknown, and preceding stores that conflict with the load but which incompletely overlap the load. | |
| | 04H | SB_DRAINS | 00H | Number of store buffer drain cycles. Incremented every cycle the store buffer is draining. Draining is caused by serializing operations like CPUID, synchronizing operations like XCHG, interrupt acknowledgment, as well as other conditions (such as cache flushing). | |
| | 05H | MISALIGN_MEM_REF | 00H | Number of misaligned data memory references. Incremented by 1 every cycle during which either the processor’s load or store pipeline dispatches a misaligned μop. Counting is performed if it is the first or second half, or if it is blocked, squashed, or missed. In this context, misaligned means crossing a 64-bit boundary. | MISALIGN_MEM_REF is only an approximation to the true number of misaligned memory references. The value returned is roughly proportional to the number of misaligned memory accesses (the size of the problem). |
| | 07H | EMON_KNI_PREF_DISPATCHED | | Number of Streaming SIMD extensions prefetch/weakly-ordered instructions dispatched (speculative prefetches are included in counting): | Counters 0 and 1. Pentium III processor only. |
| | | | 00H | 0: prefetch NTA | |
| | | | 01H | 1: prefetch T1 | |
| | | | 02H | 2: prefetch T2 | |
| | | | 03H | 3: weakly ordered stores | |
| | 4BH | EMON_KNI_PREF_MISS | | Number of prefetch/weakly-ordered instructions that miss all caches: | Counters 0 and 1. Pentium III processor only. |
| | | | 00H | 0: prefetch NTA | |
| | | | 01H | 1: prefetch T1 | |
| | | | 02H | 2: prefetch T2 | |
| | | | 03H | 3: weakly ordered stores | |
| Instruction Decoding and Retirement | C0H | INST_RETIRED | 00H | Number of instructions retired. | A hardware interrupt received during/after the last iteration of the REP STOS flow causes the counter to undercount by 1 instruction. An SMI received while executing a HLT instruction will cause the performance counter to not count the RSM instruction and undercount by 1. |
| | C2H | UOPS_RETIRED | 00H | Number of μops retired. | |
| | D0H | INST_DECODED | 00H | Number of instructions decoded. | |
| | D8H | EMON_KNI_INST_RETIRED | | Number of Streaming SIMD extensions retired: | Counters 0 and 1. Pentium III processor only. |
| | | | 00H | 0: packed and scalar | |
| | | | 01H | 1: scalar | |
| | D9H | EMON_KNI_COMP_INST_RET | | Number of Streaming SIMD extensions computation instructions retired: | Counters 0 and 1. Pentium III processor only. |
| | | | 00H | 0: packed and scalar | |
| | | | 01H | 1: scalar | |
| Interrupts | C8H | HW_INT_RX | 00H | Number of hardware interrupts received. | |
| | C6H | CYCLES_INT_MASKED | 00H | Number of processor cycles for which interrupts are disabled. | |
| | C7H | CYCLES_INT_PENDING_AND_MASKED | 00H | Number of processor cycles for which interrupts are disabled and interrupts are pending. | |
| Branches | C4H | BR_INST_RETIRED | 00H | Number of branch instructions retired. | |
| | C5H | BR_MISS_PRED_RETIRED | 00H | Number of mispredicted branches retired. | |
| | C9H | BR_TAKEN_RETIRED | 00H | Number of taken branches retired. | |
| | CAH | BR_MISS_PRED_TAKEN_RET | 00H | Number of taken mispredicted branches retired. | |
| | E0H | BR_INST_DECODED | 00H | Number of branch instructions decoded. | |
| | E2H | BTB_MISSES | 00H | Number of branches for which the BTB did not produce a prediction. | |
| | E4H | BR_BOGUS | 00H | Number of bogus branches. | |
| | E6H | BACLEARS | 00H | Number of times BACLEAR is asserted. This is the number of times that a static branch prediction was made, in which the branch decoder decided to make a branch prediction because the BTB did not. | |
| Stalls | A2H | RESOURCE_STALLS | 00H | Incremented by 1 during every cycle for which there is a resource related stall. Includes register renaming buffer entries, memory buffer entries. Does not include stalls due to bus queue full, too many cache misses, etc. In addition to resource related stalls, this event counts some other events. Includes stalls arising during branch misprediction recovery, such as if retirement of the mispredicted branch is delayed, and stalls arising while the store buffer is draining from synchronizing operations. | |
| | D2H | PARTIAL_RAT_STALLS | 00H | Number of cycles or events for partial stalls. This includes flag partial stalls. | |
| Segment Register Loads | 06H | SEGMENT_REG_LOADS | 00H | Number of segment register loads. | |
| Clocks | 79H | CPU_CLK_UNHALTED | 00H | Number of cycles during which the processor is not halted. | |
| MMX Unit | B0H | MMX_INSTR_EXEC | 00H | Number of MMX Instructions Executed. | Available in Intel Celeron, Pentium II and Pentium II Xeon processors only. Does not account for MOVQ and MOVD stores from register to memory. |
| | B1H | MMX_SAT_INSTR_EXEC | 00H | Number of MMX Saturating Instructions Executed. | Available in Pentium II and Pentium III processors only. |
| | B2H | MMX_UOPS_EXEC | 0FH | Number of MMX μops Executed. | Available in Pentium II and Pentium III processors only. |
| | B3H | MMX_INSTR_TYPE_EXEC | 01H | MMX packed multiply instructions executed. | Available in Pentium II and Pentium III processors only. |
| | | | 02H | MMX packed shift instructions executed. | |
| | | | 04H | MMX pack operation instructions executed. | |
| | | | 08H | MMX unpack operation instructions executed. | |
| | | | 10H | MMX packed logical instructions executed. | |
| | | | 20H | MMX packed arithmetic instructions executed. | |
| | CCH | FP_MMX_TRANS | 00H | Transitions from MMX instructions to floating-point instructions. | Available in Pentium II and Pentium III processors only. |
| | | | 01H | Transitions from floating-point instructions to MMX instructions. | |
| | CDH | MMX_ASSIST | 00H | Number of MMX Assists (that is, the number of EMMS instructions executed). | Available in Pentium II and Pentium III processors only. |
| | CEH | MMX_INSTR_RET | 00H | Number of MMX Instructions Retired. | Available in Pentium II processors only. |
| Segment Register Renaming | D4H | SEG_RENAME_STALLS | | Number of Segment Register Renaming Stalls: | Available in Pentium II and Pentium III processors only. |
| | | | 01H | Segment register ES | |
| | | | 02H | Segment register DS | |
| | | | 04H | Segment register FS | |
| | | | 08H | Segment register GS | |
| | | | 0FH | Segment registers ES + DS + FS + GS | |
| | D5H | SEG_REG_RENAMES | | Number of Segment Register Renames: | Available in Pentium II and Pentium III processors only. |
| | | | 01H | Segment register ES | |
| | | | 02H | Segment register DS | |
| | | | 04H | Segment register FS | |
| | | | 08H | Segment register GS | |
| | | | 0FH | Segment registers ES + DS + FS + GS | |
| | D6H | RET_SEG_RENAMES | 00H | Number of segment register rename events retired. | Available in Pentium II and Pentium III processors only. |

NOTES:

1. Several L2 cache events, where noted, can be further qualified using the Unit Mask (UMSK) field in the PerfEvtSel0 and PerfEvtSel1 registers. The lower 4 bits of the Unit Mask field are used in conjunction with L2 events to indicate the cache state or cache states involved. The P6 family processors identify cache states using the “MESI” protocol and consequently each bit in the Unit Mask field represents one of the four states: UMSK[3] = M (8H) state, UMSK[2] = E (4H) state, UMSK[1] = S (2H) state, and UMSK[0] = I (1H) state. UMSK[3:0] = MESI (FH) should be used to collect data for all states; UMSK = 0H, for the applicable events, will result in nothing being counted.

2. All of the external bus logic (EBL) events, except where noted, can be further qualified using the Unit Mask (UMSK) field in the PerfEvtSel0 and PerfEvtSel1 registers. Bit 5 of the UMSK field is used in conjunction with the EBL events to indicate whether the processor should count transactions that are self-generated (UMSK[5] = 0) or transactions that result from any processor on the bus (UMSK[5] = 1).

3. L2 cache locks, so it is possible to have a zero count.
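
The Unit Mask encodings in notes 1 and 2 can be illustrated with a short Python sketch. The helper names below are hypothetical, and the placement of the event number in bits 7:0 and the Unit Mask in bits 15:8 of the event-select value is an assumption taken from the usual PerfEvtSel layout rather than from this section. Treat it as an illustration, not as a definitive programming sequence.

```python
# Unit Mask bit values for L2 events, as listed in note 1.
MESI_BITS = {"M": 0x8, "E": 0x4, "S": 0x2, "I": 0x1}

# Bit 5 of the Unit Mask selects the transaction source for EBL events (note 2).
EBL_ANY_PROCESSOR = 1 << 5


def l2_unit_mask(states: str) -> int:
    """Return the UMSK value that counts the given MESI states, e.g. "MES"."""
    mask = 0
    for state in states.upper():
        mask |= MESI_BITS[state]  # KeyError for anything other than M, E, S, I
    if mask == 0:
        # Note 1: UMSK = 0H results in nothing being counted.
        raise ValueError("UMSK = 0H counts nothing for L2 events")
    return mask


def ebl_unit_mask(any_processor: bool) -> int:
    """Return UMSK with bit 5 set (any processor) or clear (self-generated)."""
    return EBL_ANY_PROCESSOR if any_processor else 0


def event_select_low16(event_num: int, umsk: int) -> int:
    """Combine event number and Unit Mask.

    Assumed layout (not stated in this section): event number in bits 7:0,
    Unit Mask in bits 15:8.
    """
    if not 0 <= event_num <= 0xFF or not 0 <= umsk <= 0xFF:
        raise ValueError("event number and unit mask must each fit in 8 bits")
    return (umsk << 8) | event_num
```

For example, `l2_unit_mask("MESI")` yields FH (all states) and `ebl_unit_mask(True)` yields 20H (transactions from any processor on the bus). Rejecting an empty state string guards against silently programming a counter that never increments.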

A.7 PENTIUM PROCESSOR PERFORMANCE-MONITORING EVENTS

Table A-15 lists the events that can be counted with the performance-monitoring counters for the Pentium processor. The Event Number column gives the hexadecimal code that identifies the event and that is entered in the ES0 or ES1 (event select) fields of the CESR MSR. The Mnemonic Event Name column gives the name of the event, and the Description and Comments columns give detailed descriptions of the events. Most events can be counted with either counter 0 or counter 1; however, some events can be counted only with counter 0 or only with counter 1 (as noted).

NOTE
The events in the table that are shaded are implemented only in the Pentium processor with MMX technology.

Table A-15. Events That Can Be Counted with Pentium Processor Performance-Monitoring Counters

| Event Num. | Mnemonic Event Name | Description | Comments |
|---|---|---|---|
| 00H | DATA_READ | Number of memory data reads (internal data cache hit and miss combined) | Split cycle reads are counted individually. Data memory reads that are part of TLB miss processing are not included. These events may occur at a maximum of two per clock. I/O is not included. |
| 01H | DATA_WRITE | Number of memory data writes (internal data cache hit and miss combined); I/O not included | Split cycle writes are counted individually. These events may occur at a maximum of two per clock. I/O is not included. |
| 02H | DATA_TLB_MISS | Number of misses to the data cache translation look-aside buffer | |
| 03H | DATA_READ_MISS | Number of memory read accesses that miss the internal data cache, whether the access is cacheable or noncacheable | Additional reads to the same cache line after the first BRDY# of the burst line fill is returned, but before the final (fourth) BRDY# has been returned, will not cause the counter to be incremented additional times. Data accesses that are part of TLB miss processing are not included. Accesses directed to I/O space are not included. |
| 04H | DATA_WRITE_MISS | Number of memory write accesses that miss the internal data cache, whether the access is cacheable or noncacheable | Data accesses that are part of TLB miss processing are not included. Accesses directed to I/O space are not included. |
| 05H | WRITE_HIT_TO_M-_OR_E-STATE_LINES | Number of write hits to exclusive or modified lines in the data cache | These are the writes that may be held up if EWBE# is inactive. |