Volume 3B System Programming Guide, Part 2 (794104), page 83
It may or may not be being driven by this processor. Asserted two processor clock cycles for partial transactions and 4 processor clocks (usually in consecutive bus clocks) for full line transactions.

Table A-5. Performance Monitoring Events Supported by Intel NetBurst Microarchitecture for Non-Retirement Counting (Contd.)

Event Name / Event Parameters / Parameter Value / Description

CCCR Select: 06H (CCCR[15:13])
Event Specific Notes: Specify edge trigger in the CCCR MSR to avoid double counting. DRDY_OWN and DRDY_OTHER are mutually exclusive; similarly for DBSY_OWN and DBSY_OTHER.

Event Name: BSQ_allocation
Description: This event counts allocations in the Bus Sequence Unit (BSQ) according to the specified mask bit encoding.
The event mask bits consist of four sub-groups:
• request type,
• request length,
• memory type,
• and a sub-group consisting mostly of independent bits (bits 5, 6, 7, 8, 9, and 10).
Specify an encoding for each sub-group.

ESCR restrictions: MSR_BSU_ESCR0
Counter numbers per ESCR: ESCR0: 0, 1
ESCR Event Select: 05H (ESCR[31:25])
ESCR Event Mask: ESCR[24:9]
  Bit 0: REQ_TYPE0
  Bit 1: REQ_TYPE1
    Request type encodings (bits 0 and 1) are:
    0 – Read (excludes read invalidate)
    1 – Read invalidate
    2 – Write (other than writebacks)
    3 – Writeback (evicted from cache)
  Bit 2: REQ_LEN0
  Bit 3: REQ_LEN1
    Request length encodings (bits 2 and 3) are:
    0 – 0 chunks
    1 – 1 chunk
    3 – 8 chunks
  Bit 5: REQ_IO_TYPE
    Request type is input or output.
  Bit 6: REQ_LOCK_TYPE
    Request type is bus lock.
  Bit 7: REQ_CACHE_TYPE
    Request type is cacheable.
  Bit 8: REQ_SPLIT_TYPE
    Request type is a bus 8-byte chunk split across 8-byte boundary.
  Bit 9: REQ_DEM_TYPE
    Request type is a demand if set. Request type is HW.SW prefetch if 0.
  Bit 10: REQ_ORD_TYPE
    Request is an ordered type.
  Bit 11: MEM_TYPE0
  Bit 12: MEM_TYPE1
  Bit 13: MEM_TYPE2
    Memory type encodings (bits 11-13) are:
    0 – UC
    1 – USWC
    4 – WT
    5 – WP
    6 – WB
CCCR Select: 07H (CCCR[15:13])
Event Specific Notes:
  1: Specify edge trigger in CCCR to avoid double counting.
  2: A writeback to the 3rd level cache from the 2nd level cache counts as a separate entry; this is in addition to the entry allocated for a request to the bus.
  3: A read request to WB memory type results in a request to the 64-byte sector containing the target address, followed by a prefetch request to an adjacent sector.
  4: For Pentium 4 and Xeon processors with CPUID model encoding value equal to 0 and 1, an allocated BSQ entry includes both the demand sector and prefetched 2nd sector.
  5: An allocated BSQ entry for a data chunk is any request less than 64 bytes.
  6a: This event may undercount for requests of split type transactions if the data address straddles a modulo-64 byte boundary.
  6b: This event may undercount for read requests of 16-byte operands from WC or UC addresses.
  6c: This event may undercount WC partial requests originating from store operands that are dwords.

Event Name: bsq_active_entries
Description: This event represents the number of BSQ entries (clipped at 15) currently active (valid) which meet the subevent mask criteria during allocation in the BSQ.
Active request entries are allocated on the BSQ until de-allocated. De-allocation of an entry does not necessarily imply the request is filled. This event must be programmed in conjunction with BSQ_allocation. Specify one or more event mask bits to select the transactions that are counted.

ESCR restrictions: ESCR1
Counter numbers per ESCR: ESCR1: 2, 3
ESCR Event Select: 06H (ESCR[30:25])
ESCR Event Mask: ESCR[24:9]
CCCR Select: 07H (CCCR[15:13])
Event Specific Notes:
  1: Specify desired mask bits in ESCR0 and ESCR1.
  2: See the BSQ_allocation event for descriptions of the mask bits.
  3: Edge triggering should not be used when counting cycles.
  4: This event can be used to estimate the latency of a transaction from allocation to de-allocation in the BSQ. The latency observed by BSQ_allocation includes the latency of FSB, plus additional overhead.
  5: Additional overhead may include the time it takes to issue two requests (the sector by demand and the adjacent sector via prefetch). Since adjacent sector prefetches have lower priority than demand fetches, on a heavily used system there is a high probability that the adjacent sector prefetch will have to wait until the next bus arbitration.
  6: For Pentium 4 and Xeon processors with CPUID model encoding value less than 3, this event is updated every clock.
  7: For Pentium 4 and Xeon processors with CPUID model encoding value equal to 3 or 4, this event is updated every other clock.
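The mask sub-groups and the latency note above can be illustrated with a short sketch. It only computes values; it does not touch any MSR. The field positions (event select in ESCR[31:25], event mask in ESCR[24:9]) and the bit assignments come from the table. Everything else is an assumption made for illustration: the function and constant names are hypothetical, all other ESCR bits are left at zero, and the latency estimate is taken to be the accumulated bsq_active_entries count divided by the BSQ_allocation count, scaled by two on models 3 and 4 because notes 6 and 7 say the event is updated every other clock there.

```python
# Illustrative sketch only: names are hypothetical, not part of any Intel API.

# Field positions from the table: event select in ESCR[31:25], mask in ESCR[24:9].
ESCR_EVENT_SELECT_SHIFT = 25
ESCR_EVENT_MASK_SHIFT = 9

BSQ_ALLOCATION_SELECT = 0x05
BSQ_ACTIVE_ENTRIES_SELECT = 0x06

# Encoded sub-groups (values as listed in the mask-bit table).
REQ_TYPES = {"read": 0, "read_invalidate": 1, "write": 2, "writeback": 3}
REQ_LEN_BY_CHUNKS = {0: 0, 1: 1, 8: 3}
MEM_TYPES = {"UC": 0, "USWC": 1, "WT": 4, "WP": 5, "WB": 6}

# Mostly independent bits 5..10 of the event mask.
REQ_IO_TYPE = 1 << 5
REQ_LOCK_TYPE = 1 << 6
REQ_CACHE_TYPE = 1 << 7
REQ_SPLIT_TYPE = 1 << 8
REQ_DEM_TYPE = 1 << 9
REQ_ORD_TYPE = 1 << 10
INDEPENDENT_BITS = 0b111111 << 5


def bsq_event_mask(req_type, chunks, mem_type, flags=0):
    """Combine one encoding per sub-group into the 16-bit event mask."""
    if flags & ~INDEPENDENT_BITS:
        raise ValueError("flags may only use mask bits 5 through 10")
    return (
        REQ_TYPES[req_type]                 # bits 0-1
        | (REQ_LEN_BY_CHUNKS[chunks] << 2)  # bits 2-3
        | flags                             # bits 5-10
        | (MEM_TYPES[mem_type] << 11)       # bits 11-13
    )


def escr_value(event_select, event_mask):
    """Place event select and mask in their ESCR fields; other bits stay 0."""
    if not 0 <= event_mask <= 0xFFFF:
        raise ValueError("event mask must fit in 16 bits")
    return (event_select << ESCR_EVENT_SELECT_SHIFT) | (
        event_mask << ESCR_EVENT_MASK_SHIFT
    )


def estimate_bsq_latency(active_entries_count, allocation_count, cpuid_model):
    """Assumed estimate: average clocks from allocation to de-allocation."""
    if cpuid_model < 3:
        clocks_per_update = 1  # note 6: updated every clock
    elif cpuid_model in (3, 4):
        clocks_per_update = 2  # note 7: updated every other clock
    else:
        raise ValueError("update rate not stated for this CPUID model")
    if allocation_count <= 0:
        raise ValueError("allocation count must be positive")
    return active_entries_count * clocks_per_update / allocation_count


if __name__ == "__main__":
    # Demand, cacheable reads of 8 chunks from WB memory.
    mask = bsq_event_mask("read", 8, "WB", REQ_CACHE_TYPE | REQ_DEM_TYPE)
    print(hex(escr_value(BSQ_ALLOCATION_SELECT, mask)))
    print(hex(escr_value(BSQ_ACTIVE_ENTRIES_SELECT, mask)))
    print(estimate_bsq_latency(1000, 10, cpuid_model=3))
```

The same mask is used for both events, following note 1 (specify the desired mask bits in ESCR0 and ESCR1). A real setup would also need the privilege-level bits in the ESCR and the CCCR configuration, which are outside this sketch.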
Event Name: SSE_input_assist
Description: This event counts the number of times an assist is requested to handle problems with input operands for SSE/SSE2/SSE3 operations; most notably denormal source operands when the DAZ bit is not set. Set bit 15 of the event mask to use this event.
ESCR restrictions: MSR_FIRM_ESCR0, MSR_FIRM_ESCR1
Counter numbers per ESCR: ESCR0: 8, 9; ESCR1: 10, 11
ESCR Event Select: 34H (ESCR[31:25])
ESCR Event Mask: ESCR[24:9]
  Bit 15: ALL
    Count assists for SSE/SSE2/SSE3 μops.
CCCR Select: 01H (CCCR[15:13])
Event Specific Notes:
  1: Not all requests for assists are actually taken. This event is known to overcount in that it counts requests for assists from instructions on the non-retired path that do not incur a performance penalty. An assist is actually taken only for non-bogus μops. Any appreciable counts for this event are an indication that the DAZ or FTZ bit should be set and/or the source code should be changed to eliminate the condition.
  2: Two common situations for an SSE/SSE2/SSE3 operation needing an assist are: (1) when a denormal constant is used as an input and the Denormals-Are-Zero (DAZ) mode is not set, (2) when the input operand uses the underflowed result of a previous SSE/SSE2/SSE3 operation and neither the DAZ nor Flush-To-Zero (FTZ) modes are set.
  3: Enabling the DAZ mode prevents SSE/SSE2/SSE3 operations from needing assists in the first situation. Enabling the FTZ mode prevents SSE/SSE2/SSE3 operations from needing assists in the second situation.

Event Name: packed_SP_uop
Description: This event increments for each packed single-precision μop, specified through the event mask for detection.
ESCR restrictions: MSR_FIRM_ESCR0, MSR_FIRM_ESCR1
Counter numbers per ESCR: ESCR0: 8, 9; ESCR1: 10, 11
ESCR Event Select: 08H (ESCR[31:25])
ESCR Event Mask: ESCR[24:9]
  Bit 15: ALL
    Count all μops operating on packed single-precision operands.
CCCR Select: 01H (CCCR[15:13])
Event Specific Notes:
  1: If an instruction contains more than one packed SP μop, each packed SP μop that is specified by the event mask will be counted.
  2: This metric counts instances of packed memory μops in a repeat move string.

Event Name: packed_DP_uop
Description: This event increments for each packed double-precision μop, specified through the event mask for detection.
ESCR restrictions: MSR_FIRM_ESCR0, MSR_FIRM_ESCR1
Counter numbers per ESCR: ESCR0: 8, 9; ESCR1: 10, 11
ESCR Event Select: 0CH (ESCR[31:25])
ESCR Event Mask: ESCR[24:9]
  Bit 15: ALL
    Count all μops operating on packed double-precision operands.
CCCR Select: 01H (CCCR[15:13])
Event Specific Notes: If an instruction contains more than one packed DP μop, each packed DP μop that is specified by the event mask will be counted.

Event Name: scalar_SP_uop
Description: This event increments for each scalar single-precision μop, specified through the event mask for detection.
ESCR restrictions: MSR_FIRM_ESCR0, MSR_FIRM_ESCR1
Counter numbers per ESCR: ESCR0: 8, 9; ESCR1: 10, 11
ESCR Event Select: 0AH (ESCR[31:25])
ESCR Event Mask: ESCR[24:9]
  Bit 15: ALL
    Count all μops operating on scalar single-precision operands.
CCCR Select: 01H (CCCR[15:13])
Event Specific Notes: If an instruction contains more than one scalar SP μop, each scalar SP μop that is specified by the event mask will be counted.

Event Name: scalar_DP_uop
Description: This event increments for each scalar double-precision μop, specified through the event mask for detection.
ESCR restrictions: MSR_FIRM_ESCR0, MSR_FIRM_ESCR1
Counter numbers per ESCR: ESCR0: 8, 9; ESCR1: 10, 11
ESCR Event Select: 0EH (ESCR[31:25])
ESCR Event Mask: ESCR[24:9]
  Bit 15: ALL
    Count all μops operating on scalar double-precision operands.
CCCR Select: 01H (CCCR[15:13])
Event Specific Notes: If an instruction contains more than one scalar DP μop, each scalar DP μop that is specified by the event mask is counted.

Event Name: 64bit_MMX_uop
Description: This event increments for each MMX instruction that operates on 64-bit SIMD operands.
ESCR restrictions: MSR_FIRM_ESCR0, MSR_FIRM_ESCR1